Learn module

auto_ml module

This module focuses on machine learning and on simplifying machine learning workflows.

teemi.learn.auto_ml.autoML_on_partitioned_data(feature_cols: list, training_column: str, df_input_for_ml, path='', partitions=5, training_time=5, nfold=10) → None

Runs over a pandas dataframe and trains ML models according to the specified partition length.

Parameters
  • feature_cols (list) – the columns you want to train on, e.g. feature_cols = ['0', '1', '2', '3']

  • training_column (str) – the column you want the models to be able to predict

  • df_input_for_ml (pd.DataFrame) – A pandas dataframe with your data

  • training_time (int) – the amount of time you want to train the models. If set to 0, there is no time limit and training continues until it reaches saturation, i.e. the best models

Returns

CSV files written to the specified path

Return type

None
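
As a rough illustration of what training on partitions could look like, the sketch below turns a row count into growing subset sizes using only the standard library. How teemi actually partitions the dataframe is not stated above, so the equal-step, cumulative scheme and the helper name cumulative_partition_sizes are assumptions, as are the 'target' column and 'results/' path in the commented call.

```python
def cumulative_partition_sizes(n_rows: int, partitions: int = 5) -> list:
    """Return growing subset sizes, ending at the full dataset.

    Hypothetical helper: assumes equal steps, which may differ from teemi.
    """
    if partitions < 1 or n_rows < partitions:
        raise ValueError("need at least one row per partition")
    # Integer arithmetic keeps the last size exactly equal to n_rows.
    return [n_rows * i // partitions for i in range(1, partitions + 1)]


sizes = cumulative_partition_sizes(100, partitions=5)
print(sizes)

# With teemi installed, a call following the documented signature might be:
# from teemi.learn.auto_ml import autoML_on_partitioned_data
# autoML_on_partitioned_data(
#     feature_cols=['0', '1', '2', '3'],
#     training_column='target',      # hypothetical column name
#     df_input_for_ml=df,            # your pandas dataframe
#     path='results/',               # hypothetical output folder
#     partitions=5,
# )
```

Each size would correspond to one round of training, so a learning curve can later be drawn from the per-partition results.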

plotting module

Module used for producing/reproducing plots

teemi.learn.plotting.bar_plot(x: list, y: list, error_bar: Optional[list] = None, horisontal_line=True, save_pdf=True, color='white', path='', title=None, x_label=None, y_label=None, size_height: int = 25, size_length: int = 15) → None

Plots a bar plot.

Parameters
  • x (list) – x coordinates

  • y (list) – y coordinates

  • error_bar (list, optional) – list of error bars matching the x coordinates

  • horisontal_line (bool) –

  • save_pdf (bool) –

  • path (str) –

  • color (str) – can be matplotlib color names or hex color codes

  • title (str) –

  • x_label (str) –

  • y_label (str) –

Returns

Return type

bar_plot
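
The x, y and error_bar lists must line up entry by entry. The sketch below builds them from replicate measurements with the standard library. The strain names and values are made up, and using the sample standard deviation as the error bar is an assumption, since the text above does not say which statistic is expected.

```python
from statistics import mean, stdev

# Hypothetical replicate measurements per category.
replicates = {
    "strain_A": [1.0, 2.0, 3.0],
    "strain_B": [2.0, 4.0, 6.0],
}

x = list(replicates)                                  # bar labels
y = [mean(v) for v in replicates.values()]            # bar heights
error_bar = [stdev(v) for v in replicates.values()]   # one error per x entry

# All three lists need the same length before plotting.
assert len(x) == len(y) == len(error_bar)

# With teemi installed, following the documented signature:
# from teemi.learn.plotting import bar_plot
# bar_plot(x, y, error_bar=error_bar, save_pdf=False)
```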

teemi.learn.plotting.bar_plot_w_hue(dataframe, x: str, y: str, save_pdf=True, path='', hue: str = 'category', palette='dark', title='', x_label='', y_label='', horisontal_line: bool = True, size_height: int = 10, size_length: int = 10) → None

Plots a bar plot with hue.

Parameters
  • dataframe (pd.DataFrame) – Dataframe with categories in a separate column from x and y

  • x (str) – x coordinates

  • y (str) – y coordinates

  • save_pdf (bool) –

  • path (str) – path to folder with name of the file

  • title (str) –

  • x_label (str) –

  • y_label (str) –

Returns

Return type

bar_plot_w_hue

teemi.learn.plotting.carpet_barplot(pd_dataframe_cross_tab_prop, colorDict: dict, save_pdf=True, path='', xlabel='', ylabel='', size_height: int = 10, size_length: int = 20, bar_width=1.0, ylim=[0, 25]) → None

Plots stacked bar plots from a pandas crosstab dataframe.

Parameters
  • pd_dataframe_cross_tab_prop (pd.DataFrame) – a cross-tabulation, as produced by pd.crosstab

  • colorDict (dict) – keys are column names, values are hex color codes

  • save_pdf (bool) –

  • path (string) –

Returns

Return type

stacked barplot
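
Because colorDict maps crosstab column names to hex color codes, it can help to check the mapping before plotting. The sketch below is one way to do that with the standard library. The helper check_color_dict and the column names are hypothetical, and accepting only six-digit hex codes is a simplifying assumption of this sketch, not a documented teemi requirement.

```python
import re

# Six-digit hex codes only, for simplicity (an assumption of this sketch).
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def check_color_dict(columns, color_dict):
    """Return (columns without a color, keys whose value is not a hex code)."""
    missing = [c for c in columns if c not in color_dict]
    invalid = [k for k, v in color_dict.items() if not HEX_COLOR.fullmatch(v)]
    return missing, invalid


# Hypothetical crosstab column names and colors.
columns = ["promoter_1", "promoter_2", "promoter_3"]
colorDict = {
    "promoter_1": "#1F77B4",
    "promoter_2": "#FF7F0E",
    "promoter_3": "green",  # a color name, not a hex code
}

missing, invalid = check_color_dict(columns, colorDict)
print(missing, invalid)
```

Here every column has a color, but promoter_3 is flagged because its value is a color name rather than a hex code.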

teemi.learn.plotting.color_range_dict() → dict

Returns a dictionary of color ranges.

Returns

A dictionary of color ranges containing keys ‘yellow’, ‘orange’, ‘blue’ and ‘green’

Return type

dict

teemi.learn.plotting.correlation_plot(dataframe, x: str, y: str, save_pdf: bool = True, path: str = '', title: str = '', size_height: int = 10, size_length: int = 10, y_axis_range: list = [0, 1], x_axis_range: list = [0, 1]) → None

Plots a correlation plot.

Parameters
  • dataframe (pd.DataFrame) –

  • x (str) – x coordinates

  • y (str) – y coordinates

  • save_pdf (bool) –

  • path (str) –

  • size_height (int) –

  • size_length (int) –

  • x_axis_range (list) –

  • y_axis_range (list) –

Returns

Return type

correlation_plot

teemi.learn.plotting.grouped_bar_plot(x: list, y: list, colors: list, category_labels: list, title='', y_label='', x_label='', path='', axhline=True, size_height: int = 10, size_length: int = 10)

Creates a grouped bar plot from the input data and saves it as a PDF.

Parameters
  • x (list) – A list of x-coordinates of the bars in the plot.

  • y (list) – A list of y-coordinates of the bars in the plot.

  • colors (list) – A list of color codes for the bars in the plot.

  • category_labels (list) – A list of labels for the categories represented by the bars in the plot.

  • title (str, optional) – The title of the plot. Default is an empty string.

  • y_label (str, optional) – The label for the y-axis of the plot. Default is an empty string.

  • x_label (str, optional) – The label for the x-axis of the plot. Default is an empty string.

  • path (str, optional) – The path to the directory where the plot will be saved. Default is an empty string.

  • axhline (bool, optional) – Whether to show a horizontal line. Default is True.

teemi.learn.plotting.horisontal_bar_plot(x: list, y: list, vertical_line: bool = True, save_pdf: bool = True, path: str = '', color='white', title=None, x_label=None, y_label=None, size_height: int = 10, size_length: int = 8, legend=False) → None

Plots a horizontal bar plot.

Parameters
  • x (list) – x coordinates

  • y (list) – y coordinates

  • vertical_line (bool) –

  • save_pdf (bool) –

  • path (str) –

  • color (str) – can be matplotlib color names or hex color codes

  • title (str) –

  • x_label (str) –

  • y_label (str) –

Returns

Return type

horisontal_bar_plot

teemi.learn.plotting.plot_ml_learning_curve(x_partitioned_data: list, y_training: list, y_cv: list, training_sd, cv_sd: list, save_pdf=True, path='', size_height: int = 10, size_length: int = 10, title='', linewidth: int = 1.5, y_axis_range: list = [0, 25]) → None

Plots a learning curve from partitioned dataframes.

Parameters
  • x_partitioned_data (list) – x coordinates

  • y_training (list) – y coordinates for trained models

  • y_cv (list) – y coordinates for cross-validated models

  • training_sd (list) – standard deviation for the trained models

  • cv_sd (list) – standard deviation for the cross-validated models

  • save_pdf (bool) –

  • path (str) –

Returns

Return type

Learning curve
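
The five list arguments are parallel: one entry per partition size. The sketch below derives them from per-fold scores using the standard library. The scores are invented for illustration, and treating each y value as the mean across folds with the sample standard deviation as its spread is an assumption about how the inputs are meant to be prepared.

```python
from statistics import mean, stdev

# Hypothetical per-fold errors, keyed by partition size (number of rows).
training_scores = {20: [4.0, 6.0], 40: [3.0, 5.0], 60: [2.0, 4.0]}
cv_scores = {20: [9.0, 11.0], 40: [7.0, 9.0], 60: [5.0, 7.0]}

# Sort the partition sizes so the curve runs left to right.
x_partitioned_data = sorted(training_scores)

# One mean and one standard deviation per partition size.
y_training = [mean(training_scores[n]) for n in x_partitioned_data]
y_cv = [mean(cv_scores[n]) for n in x_partitioned_data]
training_sd = [stdev(training_scores[n]) for n in x_partitioned_data]
cv_sd = [stdev(cv_scores[n]) for n in x_partitioned_data]

# With teemi installed, following the documented signature:
# from teemi.learn.plotting import plot_ml_learning_curve
# plot_ml_learning_curve(x_partitioned_data, y_training, y_cv,
#                        training_sd, cv_sd, save_pdf=False)
```

With these numbers every mean lies inside the default y_axis_range of [0, 25]; data on a different scale would need that argument adjusted.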

teemi.learn.plotting.plot_phylo_tree(alignment_file, save_pdf=True, path='', height=10, wideness=8)

Plotting a phylogenetic tree.

Parameters
  • alignment_file (Bio.Align.MultipleSeqAlignment) – A multiple sequence alignment as a Bio.Align.MultipleSeqAlignment object

  • save_pdf (bool) –

  • path (str) – path to folder with name of the file

Returns

Return type

plot_phylo_tree

teemi.learn.plotting.plot_stacked_barplot_with_labels(df: pandas.core.frame.DataFrame, colors: list, title='', y_label='', x_label='', path='', size_length: int = 20, size_heigth: int = 10)

Plots a stacked barplot from a dataframe.

Parameters
  • df (pd.DataFrame) – The dataframe containing the data to be plotted.

  • colors (list) – A list of colors for the different bars in the plot.

  • title (str, optional) – The title of the plot. Default is an empty string.

  • y_label (str, optional) – The label for the y-axis of the plot. Default is an empty string.

  • x_label (str, optional) – The label for the x-axis of the plot. Default is an empty string.

  • path (str, optional) – The path to the directory where the plot will be saved. Default is an empty string.