Who am I?
My name is Pedro Sebe (he/him), and I am a bioinformatician/data scientist in Brazil. I studied Molecular Sciences at Universidade de São Paulo, and today I work with Bioinformatics and Data Science at Varsomics/Hospital Albert Einstein. I am also a grad student at Universidade de São Paulo, studying the mutational signatures of cancer and how to detect them in clinical samples.
✨ My main interests
Bioinformatics
Bioinformatics is the interdisciplinary area that applies methods from Computer Science to solve problems in Biology, especially in Molecular Biology. I have experience using Bioinformatics tools to analyse data from the clinical lab at Hospital Albert Einstein, in the areas of Oncology (detection of fusions in RNA-seq, identification of mutational signatures) and Metagenomics/Metatranscriptomics (identification of pathogens).
Bayesian models
Bayesian methods allow modelers to combine prior information with empirical data to generate better predictions and inferences. My favorite components for building Bayesian models are Gaussian Processes, Hierarchical Models and Sparsifying Priors. I believe Bayesian inference is also relevant for machine learning, especially when it informs high-stakes decisions, where an accurate description of uncertainty is crucial. My main tool for exploring Bayes is PyMC.
Machine learning
✔️ My skills
Programming languages
- Experienced in Python and Bash
- Learning R
- Notions of C, JS and PHP
- Data manipulation: Pandas, NumPy
- Data visualization: Matplotlib, Seaborn, basic Plotly
- Machine Learning: Scikit-learn, TensorFlow, JAX
- Miscellaneous: Jupyter Notebook/JupyterLab, PyMC
- NGS quality control and filtering with AfterQC and FastQC
- De novo assembly with SPAdes
- Variant calling with GATK tools and FreeBayes
- Variant annotation with SnpEff and ANNOVAR
- Taxonomic classification with Kraken
- Genetic analysis with PLINK
- Miscellaneous tasks with samtools and bedtools
- Version control with Git
- Basic relational database queries with SQL
- Basic queries to graph databases with Neo4j
- Learning Tidyverse tools for data analysis in R