Bioinformatics, Computational Genomics

Centre of New Technologies, University of Warsaw

Warsaw, Poland

bayesian-networks bigdata database matlab

Number of employees: 100-249

www: http://www.cent.uw.edu.pl/en/research/lab/lgfs

Added 4/9/2016 6:38:55 PM

Position info

Seniority: Internship

Monthly gross salary range: 2000 - 10000 PLN

Team description:

In the Laboratory of Functional and Structural Genomics at the Centre of New Technologies, University of Warsaw, we perform computational studies whose main objective is to analyze and predict the three-dimensional structure of the human genome and its relation to the genomic diversity of human populations, both natural and pathological. In particular, we investigate structural variants and copy number variants observed in various sub-populations and patient groups, and their three-dimensional localization within the nucleus.

We also examine how the expression levels of selected genes relate to their location in three-dimensional space. In addition, we use structural information to enrich sequence-based genomic analysis in order to better define the function of selected genomic regions that are important in the context of personalized medicine.

To this end, we first develop a variety of large-scale computational tools for analyzing whole-genome sequences, identifying structural variants, and determining the statistical significance of the observed copy number of genomic regions in selected patient cohorts. Second, we evaluate the uniqueness of these variants by comparing the observed changes with typical, natural genomic diversity, as cataloged for example by the 1000 Genomes Project Consortium. Third, we infer the biological function of these genomic regions using publicly available databases. Fourth, we identify the unique local three-dimensional environment of selected sites, e.g. regulatory ones. Fifth, we analyze the impact of structural rearrangements of those local neighborhoods on gene expression profiles, which is related to the presence of transcription factories.

Position description:

We are looking for bioinformatics collaborators / freelancers for three positions, in collaboration with a commercial software development company:

1) Software Engineering
Ability to recommend the core technology that should be used to achieve the desired result. Skills should include some blend of the following:
● Statistical programming languages, like R or Python
● Deep knowledge of relational databases, such as Oracle
● Familiarity with NoSQL solutions, like MongoDB, and when it is appropriate to use this type of solution
● Knowledge of machine learning techniques. Experience with platforms like TensorFlow or IBM Watson is a plus
● Java and JavaScript are a plus

Basic Statistics
A strong understanding of when to apply, and when not to apply, different techniques to the business case. Stakeholders and the project team will depend upon this person to design and evaluate experiments and make decisions.

Machine Learning
Familiarity with machine learning techniques and the ability to at least implement them using R or Python libraries. It is not necessary to be fluent in how the algorithms work, but it is necessary to understand when each technique is appropriate.

Data
A demonstrated ability to work with unstructured data, such as social media posts, blog posts, and video. Additionally, an ability to cope with messy data. The project requires building a taxonomy of terms related to different aspects of the pharma business and tuning it over time as appropriate.

Problem Solving
Ability to absorb the overall strategy of the project team and recommend the right approach. There will be a heavy emphasis on experimentation and continuous learning.

Communication
A demonstrated ability to communicate findings to the project team. The project will require analysis of the data collected as input to the product roadmap beyond the initial minimum viable product (MVP).

2) In addition to the requirements of (1):
• A degree in bioinformatics, biology, computer science or related discipline.
• Good understanding of biological concepts and experimental methods commonly used in molecular biology.
• Basic shell scripting skills and experience with version control. 
• Solid statistical foundation, including background in probability theory and experimental design and analysis.
• Fluency in R programming (required) and knowledge of best practices for package development.
• Experience with NGS data analysis and familiarity with Bioconductor packages used in genomics.

3) We are also looking for a talented and highly motivated Computational Biologist.
The successful candidate will work with interdisciplinary teams (including biostatistics and bioinformatics scientists) and will carry out data analysis and integration across various domains.

Responsibilities:
● Work closely with the technology team to develop, maintain, and document a computational infrastructure that efficiently executes complex queries across large volumes of disparate data and knowledge on genomics, genetics, pharmaceuticals, and chemicals
● Work closely with biological, translational and clinical scientists to design studies for experimental validation of in silico-derived predictions and assist with the analysis and interpretation of study results
● Propose, develop, and evaluate the validity of a wide range of bioinformatics methods and techniques to model large volumes of disparate data towards the identification of new drug-indication hypotheses
● Develop data models, databases, and new software applications, or customize existing applications, to meet specific scientific project needs
● Compile data for use in activities such as gene expression profiling, genome annotation, and structural bioinformatics
● Provide statistical and computational tools for biologically based activities such as genetic analysis, measurement of gene expression, and gene function determination
● Test new and updated bioinformatics tools and software

Successful candidates will meet the following requirements:
● MS in bioinformatics, biostatistics, computational biology, or a similar field, combined with a very strong record of high-throughput data analysis
● A solid understanding of the relevant concepts in biology and genetics and enthusiasm for learning more
● An understanding of the statistical principles behind current best practices in the field. Experience with biomarker discovery and evaluation is a plus
● Strong experience in the use of a high-level programming language such as R, MATLAB, Python or Perl for complex data analysis
● Strong communication, data presentation and visualisation skills
● Ability to work both independently and collaboratively, and to handle several concurrent, fast-paced projects

Requirements

Must have:

  • Statistical programming languages, like R or Python
  • Deep knowledge of relational databases, such as Oracle
  • Familiarity with NoSQL solutions, like MongoDB, and when it is appropriate to use this type of solution
  • Fluency in R programming (required) and knowledge of best practices for package development (position 2)
  • Strong experience in the use of a high-level programming language such as R, MATLAB, Python or Perl for complex data analysis

Nice to have:

  • Knowledge of machine learning techniques. Experience with platforms like TensorFlow or IBM Watson is a plus
  • Java and JavaScript are a plus
  • Experience with NGS data analysis and familiarity with Bioconductor packages used in genomics
  • Strong communication, data presentation and visualisation skills

Languages: English
