Welcome to rendseq’s documentation!

This is an in-development package for the analysis of end-enriched RNA sequencing data.

For more information, see our GitHub page: https://github.com/miraep8/rendseq

File Functions

Functions for fetching, creating, and opening raw and processed data files.

rendseq.file_funcs.make_new_dir(dir_parts)

Create a new directory and return valid path to it.

Parameters

dir_parts (list of strings) – the strings to be joined to make the directory

Returns

dir_str – the directory name
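The behaviour described above can be sketched with the standard library. This is a minimal illustration, not rendseq's own code: the name make_new_dir_sketch is hypothetical, and the use of os.path.join and os.makedirs is an assumption about how the parts are joined and the directory is created.

```python
import os
import tempfile


def make_new_dir_sketch(dir_parts):
    """Join dir_parts into a path, create the directory, and return the path."""
    dir_str = os.path.join(*dir_parts)
    # exist_ok=True makes the call safe to repeat on an existing directory.
    os.makedirs(dir_str, exist_ok=True)
    return dir_str


# Usage: build a nested directory inside a throwaway temporary location.
base = tempfile.mkdtemp()
new_dir = make_new_dir_sketch([base, "results", "wigs"])
```

The returned string can be passed directly to code that writes output files into the new directory.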

rendseq.file_funcs.open_wig(filename)

Open the provided wig file and return the contents into a 2xn array.

Parameters

filename (string, required) – the name of the wig file to open

Returns

reads (2xn array) – a 2xn array with the first column being position and the second column being the count at that position (raw read, z_score etc)
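To make the position/count layout concrete, here is a small dependency-free sketch of reading wig-style text. It assumes a variableStep-style file in which each data line holds a position and a value; the name parse_wig_sketch is hypothetical, and the sketch returns a plain list of rows and reads from a file-like object rather than a filename.

```python
import io


def parse_wig_sketch(handle):
    """Return [position, value] rows from a variableStep-style wig handle."""
    reads = []
    for line in handle:
        fields = line.split()
        # Skip blank lines and any line that does not have two fields.
        if len(fields) != 2:
            continue
        try:
            reads.append([int(fields[0]), float(fields[1])])
        except ValueError:
            # A two-word header line such as "variableStep chrom=chr1".
            continue
    return reads


wig_text = "track type=wiggle_0\nvariableStep chrom=chr1\n1 0\n2 5\n3 12.5\n"
reads = parse_wig_sketch(io.StringIO(wig_text))
```

Each row of reads pairs a position with the value recorded at that position, matching the two-column layout described above.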

rendseq.file_funcs.validate_reads(reads)

Make sure the given reads meet our format requirements.

Parameters

reads (2xn array) – an array with the first column being position and the second column being the count at that position (raw read, z_score etc)

Return type

NoneType

Raises

An exception if the reads are not correctly formatted
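The specific format checks are not listed here, so the following is only a sketch of what such a validator might look like. The function name, the checks themselves (non-empty, two numeric entries per row), and the choice of ValueError are all assumptions made for illustration.

```python
def validate_reads_sketch(reads):
    """Raise ValueError if reads is not a non-empty list of numeric pairs."""
    if len(reads) == 0:
        raise ValueError("reads is empty")
    for row in reads:
        if len(row) != 2:
            raise ValueError(f"expected 2 columns per row, got {len(row)}")
        if not all(isinstance(v, (int, float)) for v in row):
            raise ValueError(f"non-numeric entry in row {row!r}")
    return None


# A well-formed input passes silently; a malformed one raises.
validate_reads_sketch([[1, 0.0], [2, 5.0]])
try:
    validate_reads_sketch([[1, 0.0, 3]])
except ValueError as err:
    message = str(err)
```

Returning None on success and raising on failure mirrors the NoneType return and the Raises entry documented above.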

rendseq.file_funcs.write_wig(wig_track, wig_file_name, chrom_name)

Write provided data to the wig file.

Parameters
  • wig_track (2xn array, required) – the wig data you wish to write

  • wig_file_name (string) – the new file you will write to
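As an illustration of writing position/value rows in a wig-like layout, here is a minimal sketch. It assumes a variableStep header carrying the chromosome name and tab-separated data lines; the exact header and separator rendseq writes are not documented here. The hypothetical write_wig_sketch also takes an open handle instead of a filename so the example can run without touching disk.

```python
import io


def write_wig_sketch(wig_track, handle, chrom_name):
    """Write [position, value] rows to handle under a variableStep header."""
    # Assumed header format; the real output may differ.
    handle.write(f"variableStep chrom={chrom_name}\n")
    for position, value in wig_track:
        handle.write(f"{int(position)}\t{value}\n")


buffer = io.StringIO()
write_wig_sketch([[1, 0.0], [2, 5.0]], buffer, "chr1")
wig_text = buffer.getvalue()
```

Passing a handle opened with open(wig_file_name, "w") in place of the StringIO buffer would write the same text to a file.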

Functions for Calculating Z-Scores

Functions needed for z-score transforming raw rendSeq data.

rendseq.zscores.main_zscores()

Run Z-score calculations.

Effect: Writes messages to standard out. If the --save-file flag is given, also writes output to disk.

rendseq.zscores.parse_args_zscores(args)

Parse command line arguments.

rendseq.zscores.score_helper(start, stop, min_r, reads, i)

Find the z-score of reads[i] relative to the subsection of reads.

The subsection goes from start to stop, with a read cutoff of min_r.

rendseq.zscores.validate_gap_window(gap, w_sz)

Check that gap and window size are reasonable in r/l_score_helper.

rendseq.zscores.z_score(val, v_mean, v_std)

Calculate a z-score given a value, mean, and standard deviation.

NOTE: The z_score() of a constant vector is 0
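The calculation and the constant-vector note can be written out as a short sketch. The standard z-score is (val - v_mean) / v_std; returning 0 when v_std is 0 is how this sketch honours the note above. The name z_score_sketch is hypothetical.

```python
def z_score_sketch(val, v_mean, v_std):
    """Return (val - v_mean) / v_std, or 0 when v_std is 0."""
    if v_std == 0:
        # A constant vector has zero spread; define its z-score as 0
        # rather than dividing by zero.
        return 0.0
    return (val - v_mean) / v_std
```

For example, a value of 12 against a mean of 10 and standard deviation of 2 sits one standard deviation above the mean.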

rendseq.zscores.z_scores(reads, gap=5, w_sz=50, min_r=20)

Perform modified z-score transformation of reads.

Parameters
  • reads (2xn array) – raw rendseq reads.

  • gap (integer) – the region around the position of interest that should be excluded in the z_score calculation.

  • w_sz (integer) – the window of reads that one should include in the z_score calculation.

  • min_r (integer) – the minimum number of reads going into the z_score calculation for a point; below this, that point is excluded. Note that this is the sum of reads in the window.

  • file_name (string) – the message printed (this name does not appear in the signature above).

Returns

z_score (2xn array) – a 2xn array with the first column being position and the second column being the z_score.
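The roles of gap, w_sz, and min_r can be illustrated with a dependency-free sketch. Several details here are assumptions rather than documented behaviour: the windows are taken on both sides of each position, distances are counted in rows rather than nucleotides, the score of smaller magnitude is kept, and positions with no usable window are dropped. All function names are hypothetical.

```python
import statistics


def window_score(window, value, min_r):
    """Z-score of value against window, or None if the window has too few reads."""
    if not window or sum(window) < min_r:
        return None
    mean = statistics.mean(window)
    std = statistics.pstdev(window)
    return 0.0 if std == 0 else (value - mean) / std


def z_scores_sketch(reads, gap=5, w_sz=50, min_r=20):
    """Return [position, score] rows; positions lacking a valid window are dropped."""
    counts = [row[1] for row in reads]
    scored = []
    for i, (position, value) in enumerate(reads):
        # Left window: up to w_sz rows, skipping the gap rows just before i.
        left = counts[max(0, i - gap - w_sz):max(0, i - gap)]
        # Right window: up to w_sz rows, skipping the gap rows just after i.
        right = counts[i + gap + 1:i + gap + 1 + w_sz]
        candidates = [s for s in (window_score(left, value, min_r),
                                  window_score(right, value, min_r))
                      if s is not None]
        if candidates:
            # Assumption: keep the score with the smaller magnitude.
            scored.append([position, min(candidates, key=abs)])
    return scored


# A background alternating between 9 and 11 reads, with one spike of 100.
reads = [[i + 1, 9 if i % 2 == 0 else 11] for i in range(61)]
reads[30][1] = 100
scored = z_scores_sketch(reads, gap=2, w_sz=10, min_r=20)
by_position = {p: s for p, s in scored}
```

Excluding the gap keeps reads immediately adjacent to a candidate peak from inflating the local mean and standard deviation that the peak is compared against.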

Functions for calling peaks from Z-Scores

Take normalized raw data and find the peaks in it.

rendseq.make_peaks.hmm_peaks(z_scores, i_to_p=0.001, p_to_p=0.6666666666666666, peak_center=10, spread=2)

Fit peaks to the provided z_scores data set using the Viterbi algorithm.

Parameters
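  • z_scores (2xn array) – first column is position (location), second column is a modified z_score for that position.

  • i_to_p (float) – probability of transitioning from internal state to peak state. The default value is described as 1/2000, based on an assumption of geometrically distributed transcript lengths with mean length 2000; note that the signature above lists 0.001. Should be a robust parameter.

  • p_to_p (float) – probability of transitioning from peak state to peak state. The default value is 1/1.5.

  • peak_center (float) – the center of the distribution for the peak state.

  • spread (float) – the spread of the peak distribution.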

Returns

peaks – a 2xn array with the first column being position and the second column being a peak assignment.
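To show how the transition parameters enter a Viterbi decoding, here is a two-state sketch in plain Python. It is not rendseq's implementation: the emission model (internal state as a standard normal, peak state as a normal with mean peak_center and standard deviation spread), the starting probabilities, and the 0/1 state labels are all assumptions, and the function names are hypothetical.

```python
import math


def log_normal_pdf(x, mean, std):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def viterbi_peaks_sketch(z_scores, i_to_p=0.001, p_to_p=1 / 1.5,
                         peak_center=10, spread=2):
    """Return [position, state] rows, with state 1 for peak and 0 for internal."""
    # Log transition matrix, indexed [from_state][to_state]; 0 = internal, 1 = peak.
    log_trans = [[math.log(1 - i_to_p), math.log(i_to_p)],
                 [math.log(1 - p_to_p), math.log(p_to_p)]]
    # Assumed emissions: internal ~ N(0, 1), peak ~ N(peak_center, spread).
    params = [(0.0, 1.0), (peak_center, spread)]

    def emit(state, z):
        return log_normal_pdf(z, *params[state])

    # Assumption: the chain starts in the peak state with probability i_to_p.
    start = [math.log(1 - i_to_p), math.log(i_to_p)]
    scores = [start[s] + emit(s, z_scores[0][1]) for s in (0, 1)]
    back = []
    for _, z in z_scores[1:]:
        prev, scores, pointers = scores, [], []
        for s in (0, 1):
            # Best predecessor state for arriving in state s.
            best = max((0, 1), key=lambda p: prev[p] + log_trans[p][s])
            pointers.append(best)
            scores.append(prev[best] + log_trans[best][s] + emit(s, z))
        back.append(pointers)

    # Trace back from the most likely final state.
    state = max((0, 1), key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [[pos, st] for (pos, _), st in zip(z_scores, path)]


z = [[1, 0.1], [2, -0.3], [3, 11.0], [4, 0.2], [5, 0.0]]
peaks = viterbi_peaks_sketch(z)
```

A small i_to_p makes entering the peak state costly, so only positions whose z-scores are far better explained by the peak distribution are labelled as peaks.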

rendseq.make_peaks.main_make_peaks()

Run the main peak making from command line.

rendseq.make_peaks.parse_args_make_peaks(args)

Parse command line arguments.

rendseq.make_peaks.thresh_peaks(z_scores, thresh=None, method='kink')

Find peaks by calling z-scores above a threshold as a peak.

Parameters
  • z_scores – a 2xn array of nt positions and z-scores at that position.

  • thresh – the threshold value to use. If none is provided it will be automatically calculated.

  • method – the method to use to automatically calculate the z-score threshold if none is provided. Default method is “kink”.
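Thresholding with an explicit thresh value can be sketched as follows. The automatic "kink" calculation is not described here and is not reproduced; the strict greater-than comparison, the 0/1 output encoding, and the name thresh_peaks_sketch are assumptions for illustration.

```python
def thresh_peaks_sketch(z_scores, thresh):
    """Return [position, flag] rows, with flag 1 where the z-score exceeds thresh."""
    return [[position, 1 if z > thresh else 0] for position, z in z_scores]


z = [[1, 0.4], [2, 13.2], [3, -0.7], [4, 15.0]]
peaks = thresh_peaks_sketch(z, thresh=12)
```

Here positions 2 and 4 are flagged because their z-scores exceed the chosen threshold of 12.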
