vhfsantos

About me

Since 2017, I've been developing and applying data analysis and machine learning tools to answer questions across different fields of biology, such as Evolution, Biochemistry, Molecular Biology, Systems Biology, and Gene Regulation.
Currently, I am the bioinformatician of the Bas van Steensel lab, where we develop new technologies to better understand gene regulation. In this context, I'm mainly responsible for developing the AI tools, visualization methods, automated pipelines, and statistical frameworks that accompany these technologies.

Interests: Bioinformatics; Computational Biology; Deep Learning; Systems Biology; Evolution

Main computational tools

PARM: Promoter Activity Regulatory Model

AI Gene Regulation Python

PARM is a deep learning model that predicts promoter activity from the DNA sequence itself. We trained PARM on a specific type of MPRA data that allows it to predict cell-specific promoter activity in a lightweight and fast manner. In addition to developing PARM together with my colleagues, I was responsible for performing a plethora of computational experiments with PARM to test hypotheses on transcription factor biology. I was also responsible for publishing PARM as a Bioconda package and maintaining the codebase.

View on GitHub
Read paper

primetime: Automated pipeline for detection of TF activity from barcoded reporters

Gene Regulation Snakemake R Python

Primetime is a user-friendly pipeline for analyzing transcription factor (TF) prime reporter data. My colleague developed a robust method to quantitatively detect the activity of TFs, and I developed primetime to accompany this new technology. Primetime is a Snakemake pipeline that automates the processing of sequencing data, including barcode counting, clustering, annotation, and differential TF activity analysis across experimental conditions.

View on GitHub
Read paper

Domainogram

Gene Regulation R

A domainogram is a type of plot that shows statistical differences in nuclear lamina interactions between two conditions using pA-DamID data. I reworked the original code; my version is based on ggplot2, which makes it more flexible and easier to use. In the paper below, we also studied the mechanisms behind nuclear lamina interactions by rearranging the genome organization (deletions and inversions). In this context, I adapted the statistics and visualization of the domainograms to each new rearrangement of the genome.

View on GitHub
Read paper

CURE: automated and parallel pipeline for the Curation of UltraconseRved Elements (UCEs)

Phylogenomics Python Bash

CURE is a framework to curate UCE data for species-tree reconstruction. When dealing with UCE data, there are two main ways to curate the data (more info in our paper). Neither of them was automated or efficient enough for large datasets. Guided by my colleagues, I developed CURE to solve this problem. CURE is user-friendly and speeds up the curation process by parallelizing the steps.

View on GitHub
Read paper

Publications

Research papers

Trauernicht, M.*; Franceschini-Santos, V. H.*; Yücel, H.; van Steensel, B. (2025); Protocol for multiplexed transcription factor activity detection using optimized barcoded reporters and an automated computational pipeline. Submitted.
* Equal contribution

van Lieshout, T.; Urzúa-Traslaviña, C. G.; Barbadilla-Martínez, L.; Boi, M. C. L.; Westra, H.J.; Klaassen, N.; Franceschini-Santos, V. H.; Parra-Martínez, M.; de Ridder, J.; van Steensel, B.; Voest, E.; Franke, L. (2025); Identification of (ultra-)rare functional promoter mutations in cancer using sequence-based deep learning models. Submitted.

Picinato, B. A.; Franceschini-Santos, V. H.; Zaramela, L.; Vêncio, R. Z. N.; Koide, T. (2025); Archaea express circular isoforms of IS200/IS605-associated ωRNAs. Submitted.

Barbadilla-Martínez, L.; Klaassen, N.*; Franceschini-Santos, V. H.*; Breda, J.; Hernandez-Quiles, M.; van Lieshout, T.; Urzua Traslaviña, C.; Yücel, H.; Boi, M.; Hermana-Garcia-Agullo, C.; Gregoricchio, S.; Zwart, W.; Voest, E.; Franke, L.; Vermeulen, M.; de Ridder, J.; van Steensel, B. (2024). The regulatory grammar of human promoters uncovered by MPRA-trained deep learning. bioRxiv.
* Equal contribution

Dauban, L., Eder, M., de Haas, M., Franceschini-Santos, V. H., Yañez-Cuna, J.O., van Schaik, T., Leemans, C., Rademaker, K., Martinez-Ara, M., Martinovic, M., de Wit, E., van Steensel, B. (2023); Genome-nuclear lamina interactions are multivalent and cooperative. bioRxiv.

Manjón, A. G., Manzo, S. G., Prekovic, S., Potgeter, L., van Schaik, T., Liu, N. Q., Flach, K., Peric-Hupkes, D., Joosten, S., Teunissen, H., Friskes, A., Ilic, M., Hintzen, D., Franceschini-Santos, V. H., Zwart, W., de Wit, E., van Steensel, B., & Medema, R. H. (2023). Perturbations in 3D genome organization can promote acquired drug resistance. Cell Reports, 42(10), 113124. Advance online publication.

Freitas, F.V.; Branstetter, M.G.; Franceschini-Santos, V. H.; Dorchin, A.; Wright, K.; Lopez-Uribe, M.; Griswold, T.; Silveira, F.A.; Almeida, E.A.B. UCE phylogenomics, biogeography, and classification of long-horned bees (Hymenoptera: Apidae: Eucerini), with insights on using specimens with extremely degraded DNA. Insect Systematics and Diversity, Volume 7, Issue 4, July 2023, 3

Pessoa-Lima, C.; Tostes-Figueiredo, J.; Macedo-Ribeiro, N.; Hsiou, A.S.; Muniz, F.P.; Maulin, J.A.; Franceschini-Santos, V. H.; de Sousa, F. B.; Barbosa, F., Jr.; Line, S.R.P.; et al. Structure and Chemical Composition of ca. 10-Million-Year-Old (Late Miocene of Western Amazon) and Present-Day Teeth of Related Species. Biology 2022, 11, 1636