Test module

Genotyping module

teemi.test.genotyping.pairwise_alignment_of_templates(reads: list, templates: list, primers: list) → pandas.core.frame.DataFrame

Infers which template each read corresponds to, based on the highest score from a pairwise alignment.

Parameters
  • reads (list of Bio.SeqRecord.SeqRecord) – Sequencing reads: .ab1 files converted into Bio.SeqRecord.SeqRecord objects

  • templates (list of Bio.SeqRecord.SeqRecord) – Templates to compare the reads against, for example plasmids

  • primers (list of Bio.SeqRecord.SeqRecord) – Primers used to find where each read should start

Returns

pd.DataFrame in the format shown in the example below

Return type

pandas.core.frame.DataFrame

Example

>>> df_alignment = pairwise_alignment_of_templates(reads, templates, primers_for_seq)
>>> df_alignment
                       Sample-Name inf_promoter_name  align_score  inf_promoter
132  yp53re_cpr_A10_A10-pad_cpr_fw            pCCW12        634.0             5
188  yp53re_cpr_A11_A11-pad_cpr_fw             pTPI1        904.0             6
247  yp53re_cpr_A12_A12-pad_cpr_fw             pTPI1        851.0             6
93    yp53re_cpr_A1_A01-pad_cpr_fw            pCCW12        543.0             5
41    yp53re_cpr_A2_A02-pad_cpr_fw            pCCW12        636.0             5

Notes

If you want an inf_part_number column, set the description of the template Bio.SeqRecord.SeqRecord as follows:

pCCW12.description = '1'
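The idea of assigning a read to the template with the highest alignment score can be sketched in plain Python. This is a minimal illustration, not teemi's implementation: `local_alignment_score` and `infer_template` are hypothetical helpers, the scoring values (match 1, mismatch -1, gap -1) are assumptions, and the sequences are toy data rather than real promoter sequences.

```python
def local_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment score (score only, no traceback)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def infer_template(read, templates):
    """Return (name, score) of the template with the highest score."""
    scores = {
        name: local_alignment_score(read, seq)
        for name, seq in templates.items()
    }
    best_name = max(scores, key=scores.get)
    return best_name, scores[best_name]


# Hypothetical toy sequences keyed by template name.
templates = {"pCCW12": "ACGTACGTTTGACA", "pTPI1": "GGCATTCAGGCTAA"}
read = "TTACGTACGTTTG"

name, score = infer_template(read, templates)
print(name, score)
```

A real workflow would typically rely on an established aligner rather than a hand-written scoring loop; the sketch only shows the "pick the highest score" step.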

Data wrangling module

teemi.test.data_wrangling.concatenating_list_of_dfs(list_of_dfs: list)

Concatenates a list of dataframes into one pd.DataFrame by rows.
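Row-wise concatenation of this kind can be sketched with `pd.concat`. The snippet below is an assumption about the general approach, not the function's actual body; the column names and the use of `ignore_index=True` are illustrative choices.

```python
import pandas as pd

# Hypothetical frames standing in for per-plate tables.
plate_1 = pd.DataFrame({"Sample-Name": ["s1", "s2"], "AvgQual": [55, 48]})
plate_2 = pd.DataFrame({"Sample-Name": ["s3"], "AvgQual": [61]})

# axis=0 stacks the frames by rows; ignore_index renumbers the rows.
combined = pd.concat([plate_1, plate_2], axis=0, ignore_index=True)
print(combined)
```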

teemi.test.data_wrangling.plat_seq_data_wrangler(sequencing_plates: list) → list

Converts the values in a list of Plate2Seq pd.DataFrames to numeric values and removes NaN values.

Parameters

sequencing_plates (list of pd.DataFrames) – Sliced Plate2seq pd.DataFrames

Returns

Plate2Seq pd.DataFrames with numeric values

Return type

list
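A conversion of this kind can be sketched with `pd.to_numeric` and `dropna`. This is an illustration under assumptions, not the function's actual code: the column names `AvgQual` and `used` are hypothetical, and coercing unparseable entries to NaN before dropping them is one possible policy.

```python
import pandas as pd

# Hypothetical plate table with values read in as strings.
plate = pd.DataFrame(
    {
        "Sample-Name": ["s1", "s2", "s3"],
        "AvgQual": ["55", "n/a", "61"],
        "used": ["700", "12", "650"],
    }
)

numeric_cols = ["AvgQual", "used"]  # hypothetical column names

cleaned = plate.copy()
# Unparseable entries such as "n/a" become NaN.
cleaned[numeric_cols] = cleaned[numeric_cols].apply(pd.to_numeric, errors="coerce")
# Drop every row that has NaN in any of the numeric columns.
cleaned = cleaned.dropna(subset=numeric_cols)
print(cleaned)
```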

teemi.test.data_wrangling.plate_AvgQual(list_of_dfs_numeric: list, Avg_qual=50, used_bases=25) → list

Filters out rows that do not meet the Avg_qual and used_bases criteria.

Parameters
  • list_of_dfs_numeric (list of pd.DataFrames) – Sliced Plate2seq pd.DataFrames with numeric values

  • Avg_qual (int) –

  • used_bases (int) –

Returns

Plate2Seq pd.DataFrames containing only the rows that meet the Avg_qual and used_bases criteria

Return type

list
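Threshold filtering like this can be sketched with a boolean mask. The snippet makes several assumptions that the documentation does not confirm: that both thresholds are lower bounds, that the comparison is inclusive, and that the relevant columns are named `AvgQual` and `used`. `filter_plate` is a hypothetical helper; only the default values 50 and 25 come from the signature above.

```python
import pandas as pd


def filter_plate(df, avg_qual=50, used_bases=25):
    """Keep rows meeting both thresholds (assumed to be lower bounds)."""
    keep = (df["AvgQual"] >= avg_qual) & (df["used"] >= used_bases)
    return df[keep]


# Hypothetical numeric plate table.
plates = [
    pd.DataFrame(
        {
            "Sample-Name": ["s1", "s2", "s3"],
            "AvgQual": [55, 48, 61],
            "used": [700, 800, 12],
        }
    )
]

# Apply the same filter to every plate in the list.
filtered = [filter_plate(df) for df in plates]
print(filtered[0])
```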

teemi.test.data_wrangling.slicing_and_naming_seq_plates(sequencing_plates, where_to_slice=7) → list

Slices rows from each dataframe in a list and changes the names. Used to ease pre-processing of Plate2seq Excel files.

Parameters
  • sequencing_plates (list of pd.DataFrames) – Plate2seq pd.DataFrames

  • where_to_slice (int) – Indicates where to slice each dataframe

Returns

Sliced plate pd.DataFrames

Return type

list

teemi.test.data_wrangling.split_df_names(df_names_column, which_column_to_split1=0, which_column_to_split2=2) → list

Splits sample names from Plate2seq pd.DataFrames into plate and well columns.
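Splitting a sample name into plate and well parts can be sketched with the pandas string accessor. The snippet assumes an underscore delimiter and that positions 0 and 2 (mirroring the defaults of which_column_to_split1 and which_column_to_split2) hold the plate and well; neither assumption is confirmed by the documentation. The sample names follow the pattern shown in the genotyping example.

```python
import pandas as pd

names = pd.Series(
    ["yp53re_cpr_A10_A10-pad_cpr_fw", "yp53re_cpr_A1_A01-pad_cpr_fw"],
    name="Sample-Name",
)

# expand=True returns one column per underscore-separated field.
parts = names.str.split("_", expand=True)

# Positions 0 and 2 are assumed to be the plate and well fields.
split = pd.DataFrame({"plate": parts[0], "well": parts[2]})
print(split)
```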