Welcome to File Management Utils’s documentation!
Install:
Install via pip
pip install fmutils
Dependencies
numpy
pandas
matplotlib
shutil
os
pathlib
tqdm
seaborn
Language
python > 3.6
OS
Linux
Windows
fmutils Package
Submodules:
directorytree module
- class src.directorytree.DirectoryTree(root_dir, dir_only=False, write_tree=True)[source]
Bases:
object
- Parameters
root_dir (string/path) – absolute/relative path to root directory containing all files.
dir_only (Bool, optional) – whether to only show sub-dirs in the dir-tree. The default is False.
write_tree (Bool, optional) – write the full dir-tree in a txt file in current working dir. The default is True.
- Return type
None.
- src.directorytree.clone_dir_tree(source_dir, dest_dir)[source]
- Parameters
source_dir (string/path) – dir from which to clone the dir tree.
dest_dir (string/path) – base dir location where the new dir tree will be cloned
- Return type
None. Creates the directories at new location without copying files
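The behaviour described for clone_dir_tree (recreate the directories, copy no files) can be sketched with the standard library alone. This is an illustrative approximation, not the fmutils implementation; the name clone_tree_sketch and its return value are assumptions made for the example.

```python
import os
from pathlib import Path


def clone_tree_sketch(source_dir, dest_dir):
    """Recreate the sub-directory layout of source_dir under dest_dir, skipping files."""
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    created = []
    for current, _dirs, _files in os.walk(source_dir):
        # Map each walked directory onto the same relative location in dest_dir.
        target = dest_dir / Path(current).relative_to(source_dir)
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

Only directory names are read from the walk, so no file content is ever touched.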
fmutils module
- src.fmutils.del_all_files(main_dir, confirmation=True)[source]
- Parameters
main_dir (path/string) – Absolute/relative path to root directory containing all files to be deleted.
confirmation (Bool, optional) – Whether to ask for confirmation before deleting all the files. The default is True.
- Return type
None.
Warning
If you set the confirmation to False then all the files inside the root directory will be permanently deleted.
- src.fmutils.file_name_replacer(main_dir, new_name, name2replace)[source]
- Parameters
main_dir (string/path) – main dir containing all the sub-dirs.
new_name (list of strings) – A list containing the new names which will replace the old ones.
name2replace (list of strings) – A list containing the strings which will be replaced with new ones.
- Returns
None.
Information
Changes the names of all files inside a dir by replacing specific strings in the old file names with new ones, specified via two input lists. Both lists should have the same length.
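The paired-list replacement can be pictured as follows. This is a minimal sketch of the idea, assuming that files in sub-directories are included and that each entry of name2replace is swapped for the entry at the same index of new_name; replace_in_names_sketch is a hypothetical helper, not the library function.

```python
from pathlib import Path


def replace_in_names_sketch(main_dir, new_name, name2replace):
    """Rename files under main_dir, swapping each old substring for its paired new one."""
    if len(new_name) != len(name2replace):
        raise ValueError("new_name and name2replace must have the same length")
    renamed = []
    # Materialize the listing first so renames do not disturb the traversal.
    for path in sorted(p for p in Path(main_dir).rglob("*") if p.is_file()):
        updated = path.name
        for old, new in zip(name2replace, new_name):
            updated = updated.replace(old, new)
        if updated != path.name:
            path.rename(path.with_name(updated))
            renamed.append(updated)
    return renamed
```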
- src.fmutils.get_all_dirs(main_dir, sort=True)[source]
- Parameters
main_dir (string/path) – Absolute/relative path to root directory containing all sub-dirs.
sort (Bool, optional) – Whether to sort the output list of dirs or not. The default is True.
- Returns
file_list – List containing full paths of all the dirs.
- Return type
List
- src.fmutils.get_all_files(main_dir, sort=True)[source]
- Parameters
main_dir (string/path) – Absolute/relative path to root directory containing all files.
sort (Bool, optional) – Whether to sort the output list of files or not. The default is True.
- Returns
file_list – List containing full paths of all the files.
- Return type
List
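For readers who want to see what "full paths of all the files" amounts to, here is a standard-library sketch. It assumes the listing is recursive and uses a plain string sort; the library's own ordering may differ, and list_files_sketch is a name chosen for this example.

```python
import os


def list_files_sketch(main_dir, sort=True):
    """Collect the full path of every file below main_dir."""
    file_list = []
    for current, _dirs, files in os.walk(main_dir):
        file_list.extend(os.path.join(current, name) for name in files)
    return sorted(file_list) if sort else file_list
```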
- src.fmutils.get_basename(full_path)[source]
- Parameters
full_path (string/path) – Absolute path to the file.
- Returns
name – basename of the file.
- Return type
string
- src.fmutils.get_dir_props(main_dir)[source]
- Parameters
main_dir (string/path) – Absolute/relative path to root directory containing all files.
- Returns
A dictionary containing the following keys/info:
files_in_sub_dirs (an array containing the number of files in each sub-dir of the root.)
sub_dirs (name of all the sub-dirs/classes inside the root.)
total_files (total number of files in all the sub-dir/classes.)
- src.fmutils.get_num_of_files(main_dir)[source]
- Parameters
main_dir (string/path) – Absolute/relative path to root directory containing all files.
- Returns
A dictionary containing the following keys/info:
files_in_sub_dirs (an array containing the number of files in each sub-dir of the root.)
sub_dirs (name of all the sub-dirs/classes inside the root.)
total_files (total number of files in all the sub-dir/classes.)
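The shape of the returned dictionary can be illustrated with a small sketch. It assumes counts are taken recursively per immediate sub-directory and returns plain lists where the library mentions an array; count_files_sketch is a hypothetical stand-in.

```python
from pathlib import Path


def count_files_sketch(main_dir):
    """Count files per immediate sub-directory of main_dir."""
    sub_dirs = sorted(p for p in Path(main_dir).iterdir() if p.is_dir())
    counts = [sum(1 for f in d.rglob("*") if f.is_file()) for d in sub_dirs]
    return {
        "files_in_sub_dirs": counts,            # one count per sub-directory
        "sub_dirs": [d.name for d in sub_dirs],  # sub-directory (class) names
        "total_files": sum(counts),              # files across all sub-directories
    }
```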
- src.fmutils.get_pdir(full_path)[source]
- Parameters
full_path (string/path) – Absolute path to the file.
- Returns
name – parent dir/sub-dir containing file.
- Return type
string
- src.fmutils.get_random_files(main_dir, count=1)[source]
- Parameters
main_dir (path/string) – Absolute/relative path to root directory containing all files.
count (int, optional) – Total number of randomly selected files. The default is 1.
- Returns
file_path – A list containing full paths to randomly selected files.
- Return type
list
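Random selection of this kind is typically sampling without replacement from the full file listing. The sketch below shows that approach under stated assumptions: the seed parameter is added here for reproducibility and is not part of the documented signature, and capping count at the number of available files is a design choice of the example.

```python
import os
import random


def pick_random_files_sketch(main_dir, count=1, seed=None):
    """Return up to `count` distinct file paths chosen at random below main_dir."""
    all_files = [
        os.path.join(current, name)
        for current, _dirs, files in os.walk(main_dir)
        for name in files
    ]
    rng = random.Random(seed)
    # sample() draws without replacement; cap count at the number of files found.
    return rng.sample(all_files, min(count, len(all_files)))
```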
- src.fmutils.get_suffix(full_path)[source]
- Parameters
full_path (string/path) – Absolute path to the file.
- Returns
name – extension of that file.
- Return type
string
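The three path helpers get_basename, get_pdir and get_suffix correspond closely to attributes of pathlib paths. The snippet below shows those standard-library equivalents on a hypothetical path; whether the fmutils functions return exactly the same strings (for example, with or without the extension or the leading dot) is not stated in this documentation.

```python
from pathlib import PurePosixPath

path = PurePosixPath("/data/cats/img_001.png")  # hypothetical file path

print(path.name)         # file name including the extension
print(path.parent.name)  # name of the directory that holds the file
print(path.suffix)       # extension, including the leading dot
```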
- src.fmutils.move_matching_files(path2copy, path2match, path2paste)[source]
- Parameters
path2copy (string/path) – Absolute path to directory from where to copy files.
path2match (string/path) – Absolute path to directory from where to match files/file-names.
path2paste (string/path) – Absolute path to directory where to move the files having matched names.
- Returns
None.
Information
An example use case: in your ML training data you have labels in one dir and images in another, but you deleted some blurred/damaged images and now only want to keep the labels that have corresponding images.
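The label/image use case can be sketched as follows. The example assumes that two files "match" when their names agree once the extension is removed, which this documentation does not confirm; move_matching_sketch is a hypothetical helper that reuses the documented parameter names.

```python
import shutil
from pathlib import Path


def move_matching_sketch(path2copy, path2match, path2paste):
    """Move files from path2copy whose stem also occurs in path2match into path2paste."""
    # Assumption: names are compared without their extensions.
    wanted = {p.stem for p in Path(path2match).iterdir() if p.is_file()}
    dest = Path(path2paste)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for candidate in sorted(Path(path2copy).iterdir()):
        if candidate.is_file() and candidate.stem in wanted:
            shutil.move(str(candidate), str(dest / candidate.name))
            moved.append(candidate.name)
    return moved
```

With labels in path2copy and the surviving images in path2match, only labels that still have an image end up in path2paste.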
- src.fmutils.numericalSort(value)[source]
Note
This function is only used for sorting the output file lists in alphabetical/numerical order.
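A sort key of this kind usually splits each name into text and number chunks so that embedded numbers compare by value. The following is a common way to write such a key, shown as an illustration and not as the body of numericalSort; natural_key_sketch is a name chosen for the example.

```python
import re


def natural_key_sketch(value):
    """Split a string into text and integer chunks so 'img2' sorts before 'img10'."""
    parts = re.split(r"(\d+)", value)
    # With a capturing group, odd positions hold the digit runs.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]


names = ["img10.png", "img2.png", "img1.png"]
print(sorted(names))                          # ['img1.png', 'img10.png', 'img2.png']
print(sorted(names, key=natural_key_sketch))  # ['img1.png', 'img2.png', 'img10.png']
```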
- src.fmutils.remove_empty_dirs(main_dir)[source]
- Parameters
main_dir (string/path) – Absolute path to directory from where you want to remove all the empty directories.
- Return type
None.
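Removing empty directories is easiest when the tree is walked bottom-up, so that a parent which only contained empty directories is itself empty when it is reached. The sketch below illustrates that technique; it keeps main_dir itself, which is an assumption of the example, and remove_empty_dirs_sketch is not the library function.

```python
import os


def remove_empty_dirs_sketch(main_dir):
    """Delete every empty directory below main_dir and return the removed paths."""
    root = os.fspath(main_dir)
    removed = []
    # topdown=False yields children before their parents.
    for current, _dirs, _files in os.walk(root, topdown=False):
        if current != root and not os.listdir(current):
            os.rmdir(current)
            removed.append(current)
    return removed
```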
- src.fmutils.rename_wrt_dirname(main_dir)[source]
Note
Change the names of all files inside the main_dir wrt their sub_dir names.
- Parameters
main_dir (string/path) – main directory containing all sub dirs.
- Return type
None.
- src.fmutils.tvt_split(img_dir, dest_dir, lbl_dir=None, test_split=0.2, val_split=0.1, mode='copy', multi_class=True)[source]
- Parameters
img_dir (string/path) – absolute/relative path to root directory containing all files.
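dest_dir (string/path) – absolute/relative path to the destination directory where the splits will be created.
lbl_dir (string/path, optional) – absolute/relative path to the directory containing the labels, if any. The default is None.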
test_split (float between [0, 1], optional) – Fraction of the data used for the test split. The default is 0.2.
val_split (float between [0, 1], optional) – Fraction of the data used for the validation split. The default is 0.1.
mode (string, optional; one of ['copy', 'move']) – Whether to copy the data or move it. The default is 'copy'.
multi_class (boolean, optional) – Whether there are multiple classes in the data or only one class. The default is True.
- Return type
None.
Note
Creates train-validation-test splits of the data for ML models in the following format:
../split/
├── test
│   └── images
│       ├── class_1
│       ├── class_2
├── train
│   └── images
│       ├── class_1
│       ├── class_2
└── val
    └── images
        ├── class_1
        ├── class_2
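To make the split fractions concrete, the sketch below partitions a list of file names using the default test_split=0.2 and val_split=0.1. It covers only the arithmetic, not the copying or the directory layout; shuffling with a fixed seed and rounding the subset sizes down are assumptions of this example, and split_names_sketch is a hypothetical helper.

```python
import random


def split_names_sketch(names, test_split=0.2, val_split=0.1, seed=0):
    """Partition names into train/val/test lists according to the given fractions."""
    shuffled = list(names)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_split)  # rounded down
    n_val = int(len(shuffled) * val_split)    # rounded down
    return {
        "test": shuffled[:n_test],
        "val": shuffled[n_test:n_test + n_val],
        "train": shuffled[n_test + n_val:],  # everything that is left
    }
```

With 10 files and the default fractions this gives 2 test files, 1 validation file and 7 training files.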
plottingutils module
- src.plottingutils.plot_data_dist(main_dir, sort=1)[source]
- Parameters
main_dir (string/path) – main directory which contains all the classes
sort (One of [None, 1, 2]) – Whether to sort the data or not. None: no sorting, and the dirs will also be shown; 1: sort by class name; 2: sort by file count.
- Return type
None. just plots the data distribution graph
Module contents
Examples:
A standard use case for DirectoryTree Generator
The following example shows a standard use case.
from fmutils import directorytree as fdt
dt = fdt.DirectoryTree(root_dir='../Downloads/test_dir', dir_only=False,
write_tree=True)
dt.generate()
# or clone the dir tree at another location
fdt.clone_dir_tree('/Downloads/test_dir/', '/Downloads/test_dir2/')
A Use case for getting the list of dirs
The following example gets the list of all the sub-dirs in the root dir.
from fmutils import fmutils as fmu
d_list = fmu.get_all_dirs(main_dir = '../Downloads/test_dir/', sort=True)
print(d_list)
Plotting Data Dist
The following example draws a bar plot of the files present inside the root dir.
from fmutils import plottingutils as fpu
df = fpu.plot_data_dist('../Downloads/test_dir/', sort=None)
Train Validation Test Split
The following example creates train, validation, and test splits of the data in the root dir.
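from fmutils import fmutils as fmu
img_dir = '../Downloads/test_dir/'  # placeholder: root dir containing the images
dest_dir = '../Downloads/split/'  # placeholder: dir where the splits are written
fmu.tvt_split(img_dir, dest_dir, lbl_dir=None, test_split=0.2, val_split=0.1, mode='copy')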