Learn module
auto_ml module
This module focuses on machine learning and on simplifying machine learning workflows.
- teemi.learn.auto_ml.autoML_on_partitioned_data(feature_cols: list, training_column: str, df_input_for_ml, path='', partitions=5, training_time=5, nfold=10) → None[source]
Iterates over a pandas DataFrame and trains ML models according to the specified partition length.
- Parameters
feature_cols (list) – the columns you want to train on, e.g. feature_cols = ['0', '1', '2', '3']
training_column (str) – the column you want the models to be able to predict
df_input_for_ml (pd.DataFrame) – A pandas dataframe with your data
training_time (int) – how long to train the models. If set to 0, there is no time limit and training continues until it reaches saturation, i.e. until the best models are found
- Returns
- Return type
CSV files in the specified path
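As a rough usage sketch, the call below follows the signature documented above. The column name `target`, the output folder `ml_output/`, and the helpers `build_automl_kwargs` and `run_automl` are hypothetical and not part of teemi; the teemi import is deferred so the argument-building part runs on its own.

```python
def build_automl_kwargs(feature_cols, training_column, path="",
                        partitions=5, training_time=5, nfold=10):
    """Collect keyword arguments matching the documented signature.

    Hypothetical helper for illustration; it is not part of teemi.
    """
    return {
        "feature_cols": list(feature_cols),
        "training_column": training_column,
        "path": path,
        "partitions": partitions,
        "training_time": training_time,
        "nfold": nfold,
    }


def run_automl(df, kwargs):
    """Call teemi with a prepared pandas DataFrame (requires teemi)."""
    from teemi.learn.auto_ml import autoML_on_partitioned_data
    autoML_on_partitioned_data(df_input_for_ml=df, **kwargs)


kwargs = build_automl_kwargs(
    feature_cols=["0", "1", "2", "3"],  # example columns from the parameter docs
    training_column="target",           # assumed name of the column to predict
    path="ml_output/",                  # assumed output folder
)
```

With teemi installed, `run_automl(df, kwargs)` would then start training on a DataFrame `df` that contains the listed columns.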
plotting module
Module used for producing/reproducing plots
- teemi.learn.plotting.bar_plot(x: list, y: list, error_bar: Optional[list] = None, horisontal_line=True, save_pdf=True, color='white', path='', title=None, x_label=None, y_label=None, size_height: int = 25, size_length: int = 15) → None[source]
Plots a bar plot.
- Parameters
x (list) – x coordinates
y (list) – y coordinates
error_bar (list (Optional)) – list of error bars matching the x coordinates
horisontal_line (bool) –
save_pdf (bool) –
path (str) –
color (str) – can be matplotlib color names or hex color codes
title (str) –
x_label (str) –
y_label (str) –
- Returns
- Return type
bar_plot
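A minimal sketch of preparing `x`, `y` and `error_bar` from replicate measurements, using the standard-library `statistics` module. The measurements, the labels, and the `draw` helper are invented for illustration; using string labels for `x` and a bare file name for `path` are assumptions, since the parameter descriptions above do not specify them.

```python
import statistics

# Hypothetical replicate measurements per condition (made-up numbers).
replicates = {
    "strain_A": [1.0, 2.0, 3.0],
    "strain_B": [2.0, 4.0, 6.0],
}

x = list(replicates)                                             # one label per bar
y = [statistics.mean(v) for v in replicates.values()]            # bar heights
error_bar = [statistics.stdev(v) for v in replicates.values()]   # sample SD per bar


def draw(path="bar_plot"):
    """Pass the prepared lists to teemi (requires teemi to be installed)."""
    from teemi.learn.plotting import bar_plot
    bar_plot(x, y, error_bar=error_bar, path=path,
             title="Example", x_label="Strain", y_label="Value")
```

Keeping the three lists in the same order matters, because each error bar is matched to its bar by position.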
- teemi.learn.plotting.bar_plot_w_hue(dataframe, x: str, y: str, save_pdf=True, path='', hue: str = 'category', palette='dark', title='', x_label='', y_label='', horisontal_line: bool = True, size_height: int = 10, size_length: int = 10) → None[source]
Plots a bar plot with bars grouped by hue.
- Parameters
dataframe (pd.DataFrame) – DataFrame with the categories in a separate column from x and y
x (str) – x coordinates
y (str) – y coordinates
save_pdf (bool) –
path (str) – path to folder with name of the file
title (str) –
x_label (str) –
y_label (str) –
- Returns
- Return type
bar_plot_w_hue
- teemi.learn.plotting.carpet_barplot(pd_dataframe_cross_tab_prop, colorDict: dict, save_pdf=True, path='', xlabel='', ylabel='', size_height: int = 10, size_length: int = 20, bar_width=1.0, ylim=[0, 25]) → None[source]
Plots stacked bar plots from a pandas cross-tabulation DataFrame.
- Parameters
pd_dataframe_cross_tab_prop (pd.DataFrame) – a cross-tabulation, as produced by pd.crosstab
colorDict (dict) – keys are column names, values are hex color codes
save_pdf (bool) –
path (string) –
- Returns
- Return type
stacked barplot
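The parameter name suggests a cross-tabulation of proportions, which pandas can produce with `pd.crosstab(..., normalize="index")`. The sketch below works through the same row-wise normalization in plain Python and builds a matching `colorDict`. The sample data, the hex codes, and the helper `row_proportions` are made up for illustration, and whether teemi expects fractions or percentages is not stated here.

```python
from collections import Counter, defaultdict


def row_proportions(pairs):
    """Row-normalized cross-tabulation of (row_label, column_label) pairs."""
    counts = defaultdict(Counter)
    for row, col in pairs:
        counts[row][col] += 1
    columns = sorted({col for _, col in pairs})
    table = {}
    for row, counter in counts.items():
        total = sum(counter.values())
        # Counter returns 0 for categories that never occur in this row.
        table[row] = {col: counter[col] / total for col in columns}
    return table


# Made-up observations: (sample, category)
pairs = [("s1", "high"), ("s1", "low"), ("s1", "low"), ("s1", "low"),
         ("s2", "high"), ("s2", "high")]
table = row_proportions(pairs)

# One hex color per cross-tab column, keyed by column name.
colorDict = {"high": "#1f77b4", "low": "#ff7f0e"}
```

Each row of `table` sums to 1, and `colorDict` has exactly one entry per column, which is the pairing the `colorDict` description calls for.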
- teemi.learn.plotting.color_range_dict() → dict[source]
Returns a dictionary of color ranges.
- Returns
A dictionary of color ranges containing keys ‘yellow’, ‘orange’, ‘blue’ and ‘green’
- Return type
dict
- teemi.learn.plotting.correlation_plot(dataframe, x: str, y: str, save_pdf: bool = True, path: str = '', title: str = '', size_height: int = 10, size_length: int = 10, y_axis_range: list = [0, 1], x_axis_range: list = [0, 1]) → None[source]
Plots a correlation plot.
- Parameters
dataframe (pd.DataFrame) –
x (str) – x coordinates
y (str) – y coordinates
save_pdf (bool) –
path (str) –
size_height (int) –
size_length (int) –
x_axis_range (list) –
y_axis_range (list) –
- Returns
- Return type
correlation_plot
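Both axis ranges default to `[0, 1]`, so data outside that interval would presumably fall outside the visible area. Below is a minimal sketch for deriving the ranges from the data instead. `padded_range` and `draw` are hypothetical helpers, and the values and the column names `measured` and `predicted` are invented.

```python
def padded_range(values, pad=0.05):
    """Return [low, high] covering the values with a relative margin."""
    low, high = min(values), max(values)
    # Fall back to a fixed margin when all values are identical.
    margin = (high - low) * pad or pad
    return [low - margin, high + margin]


measured = [2.0, 3.0, 4.0]       # made-up values
predicted = [10.0, 15.0, 20.0]   # made-up values

x_axis_range = padded_range(measured, pad=0.1)
y_axis_range = padded_range(predicted, pad=0.1)


def draw(dataframe, path="correlation_plot"):
    """Plot with explicit ranges (requires teemi; dataframe is a pd.DataFrame)."""
    from teemi.learn.plotting import correlation_plot
    correlation_plot(dataframe, x="measured", y="predicted", path=path,
                     x_axis_range=x_axis_range, y_axis_range=y_axis_range)
```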
- teemi.learn.plotting.grouped_bar_plot(x: list, y: list, colors: list, category_labels: list, title='', y_label='', x_label='', path='', axhline=True, size_height: int = 10, size_length: int = 10)[source]
Creates a grouped bar plot from the input data and saves the plot as a PDF.
- Parameters
x (list) – A list of x-coordinates of the bars in the plot.
y (list) – A list of y-coordinates of the bars in the plot.
colors (list) – A list of color codes for the bars in the plot.
category_labels (list) – A list of labels for the categories represented by the bars in the plot.
title (str, optional) – The title of the plot. Default is an empty string.
y_label (str, optional) – The label for the y-axis of the plot. Default is an empty string.
x_label (str, optional) – The label for the x-axis of the plot. Default is an empty string.
path (str, optional) – The path to the directory where the plot will be saved. Default is an empty string.
axhline (bool, optional) – Whether to show a horizontal line.
- teemi.learn.plotting.horisontal_bar_plot(x: list, y: list, vertical_line: bool = True, save_pdf: bool = True, path: str = '', color='white', title=None, x_label=None, y_label=None, size_height: int = 10, size_length: int = 8, legend=False) → None[source]
Plots a horizontal bar plot.
- Parameters
x (list) – x coordinates
y (list) – y coordinates
vertical_line (bool) –
save_pdf (bool) –
path (str) –
color (str) – can be matplotlib color names or hex color codes
title (str) –
x_label (str) –
y_label (str) –
- Returns
- Return type
horisontal_bar_plot
- teemi.learn.plotting.plot_ml_learning_curve(x_partitioned_data: list, y_training: list, y_cv: list, training_sd, cv_sd: list, save_pdf=True, path='', size_height: int = 10, size_length: int = 10, title='', linewidth: int = 1.5, y_axis_range: list = [0, 25]) → None[source]
Plots a learning curve from partitioned dataframes.
- Parameters
x_partitioned_data (list) – x coordinates
y_training (list) – y coordinates for trained models
y_cv (list) – y coordinates for cross-validated models
training_sd (list) – standard deviation for the trained models
cv_sd (list) – standard deviation for the cross-validated models
save_pdf (bool) –
path (str) –
- Returns
- Return type
Learning curve
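The five list arguments can be derived from per-partition scores. The sketch below assumes one score per fold for each partition size and summarizes them with the standard-library `statistics` module. All numbers are made up, and the `draw` helper and its `path` value are hypothetical.

```python
import statistics

# Made-up error scores: for each partition size, one score per fold.
training_scores = {100: [4.0, 5.0, 6.0], 200: [3.0, 4.0, 5.0]}
cv_scores = {100: [8.0, 10.0, 12.0], 200: [6.0, 7.0, 8.0]}

x_partitioned_data = sorted(training_scores)   # partition sizes on the x axis
y_training = [statistics.mean(training_scores[n]) for n in x_partitioned_data]
y_cv = [statistics.mean(cv_scores[n]) for n in x_partitioned_data]
training_sd = [statistics.stdev(training_scores[n]) for n in x_partitioned_data]
cv_sd = [statistics.stdev(cv_scores[n]) for n in x_partitioned_data]


def draw(path="learning_curve"):
    """Pass the summaries to teemi (requires teemi to be installed)."""
    from teemi.learn.plotting import plot_ml_learning_curve
    plot_ml_learning_curve(x_partitioned_data, y_training, y_cv,
                           training_sd, cv_sd, path=path)
```

All five lists share the order of `x_partitioned_data`, so each mean and standard deviation lines up with its partition size.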
- teemi.learn.plotting.plot_phylo_tree(alignment_file, save_pdf=True, path='', height=10, wideness=8)[source]
Plots a phylogenetic tree.
- Parameters
alignment_file (Bio.Align.MultipleSeqAlignment) – A multiple sequence alignment as a Bio.Align.MultipleSeqAlignment object
save_pdf (bool) –
path (str) – path to folder with name of the file
- Returns
- Return type
plot_phylo_tree
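Despite its name, `alignment_file` is documented as an alignment object rather than a file path. One way to obtain such an object is Biopython's `AlignIO.read`. The sketch below writes a tiny made-up aligned FASTA file and defers the Biopython and teemi imports to a hypothetical `draw` helper; the file name, sequences, and `path` value are invented.

```python
import os
import tempfile

# Made-up aligned sequences; all records in an alignment must have equal length.
records = {"seq1": "ATG-CA", "seq2": "ATGGCA", "seq3": "ATG-CT"}

fasta_path = os.path.join(tempfile.mkdtemp(), "aligned.fasta")
with open(fasta_path, "w") as handle:
    for name, sequence in records.items():
        handle.write(f">{name}\n{sequence}\n")


def draw(path="phylo_tree"):
    """Load the alignment and plot it (requires Biopython and teemi)."""
    from Bio import AlignIO
    from teemi.learn.plotting import plot_phylo_tree
    alignment = AlignIO.read(fasta_path, "fasta")  # a Bio.Align.MultipleSeqAlignment
    plot_phylo_tree(alignment, path=path)
```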
- teemi.learn.plotting.plot_stacked_barplot_with_labels(df: pandas.core.frame.DataFrame, colors: list, title='', y_label='', x_label='', path='', size_length: int = 20, size_heigth: int = 10)[source]
Plots a stacked barplot from a dataframe.
- Parameters
df (pd.DataFrame) – The dataframe containing the data to be plotted.
colors (list) – A list of colors for the different bars in the plot.
title (str, optional) – The title of the plot. Default is an empty string.
y_label (str, optional) – The label for the y-axis of the plot. Default is an empty string.
x_label (str, optional) – The label for the x-axis of the plot. Default is an empty string.
path (str, optional) – The path to the directory where the plot will be saved. Default is an empty string.