Biomedical Informatics and Data Science (NLM)

National Library of Medicine (NLM) Training Program in Biomedical Informatics and Data Science

Cross-training researchers since 1992 at the interface between the computational/mathematical sciences and the biological sciences/biomedicine to face the biggest challenges in biomedical informatics and data science today.

We are pleased to announce that our training program received its sixth consecutive renewal of 5 years, beginning July 1, 2017. We are grateful for the support we have received over many years from the National Library of Medicine (NLM), part of the National Institutes of Health (NIH), on grant T15LM007093.

Program Director:
Dr. Lydia Kavraki
Rice University

Program Co-Director:
Dr. Elmer Bernstam
The University of Texas Health Science Center at Houston

Program Administrator:
Melissa Glueck
Gulf Coast Consortia

Digital information streaming from innumerable sensors, instruments and simulations is outrunning our capacity to accumulate, organize and analyze it for making healthcare decisions. We need fundamental progress in biomedical informatics to exploit the full wealth of knowledge embedded in genomic, proteomic, genetic, epidemiological, and clinical data and gain a full return on our substantial investments in health information technology.

Herein lie challenges to biomedical informatics, and opportunities for training as well. This program provides research training in Healthcare Informatics, Translational Bioinformatics, Clinical Research Informatics, and Public Health Informatics to PhD students and postdoctoral trainees across the member institutions of the Gulf Coast Consortia (GCC). This is one of only 16 institutionally-based NLM training programs in the United States.

This program serves the needs of trainees such as a young physician who wishes to expand her/his analytical and computational knowledge of computer-assisted analysis, simulation and multi-dimensional imaging; a biologist who wants to develop expertise in functional genomics; or a computer scientist who wants to prepare her/himself for a research career in translational bioinformatics.

The next Call for Applications will be in Spring 2020

See the sections below for information about mentor and trainee eligibility and fellowship requirements and benefits.

The “NLM application instructions” section below includes links to the application forms and the mentor recommendation forms.

Please pay close attention to the “NLM Curriculum Plan” section below when preparing your application.

The NLM Training Program addresses a central challenge facing biomedical informatics: digital information streaming from innumerable sensors, instruments and simulations is outrunning our capacity to accumulate, organize and analyze it for making healthcare decisions. The program aims:

  1. to make sense of datasets that may be massive, heterogeneous, or deficient, or that may even contain errors;
  2. to cull from them important insights into fundamental problems of biomedicine; and
  3. to convey information in ways readily understood by researchers and clinicians.

To do this, our training program focuses on what is loosely known as “big data”: data-driven discovery and decision-making tools. This means using computer programs to seek associations in databases whose complexity hides such relations from even expert humans, and making the discovered associations and patterns intelligible to humans. We have designed a curriculum that will develop the core competencies for biomedical informaticians as defined in the report by the American Medical Informatics Association (AMIA). The curriculum of foundation courses and electives will be customized to meet each trainee’s research interests, previous coursework, and knowledge gaps.

Our training program supports training and research primarily in three domains of informatics as defined by the NLM:

  1. healthcare/clinical informatics: applications of informatics principles and methods to direct patient care; examples include advanced clinical decision support systems, or multimedia electronic health records.
  2. translational bioinformatics: applications of informatics principles and methods to support “bench to bedside to practice” translational research; examples include genome-phenome relationships, pharmacogenomics, personalized medicine, or genome-wide association studies (GWAS).
  3. clinical research informatics: applications of informatics principles and methods to support basic clinical trials and comparative effectiveness research that use human versus animal models; examples include biostatistics, in-silico trials, or merging and mining large disparate data sets that mix images, text, and data.


Predoctoral fellowship:

  • 12-month fellowship
  • The opportunity to compete for reappointment (for up to a total of three years) if progress is satisfactory
  • Stipend of $24,816 per year in NIH’s FY19
  • Partial support for tuition, fees and health insurance (not fringe) and travel
  • Primary mentor/department is responsible for the fringe, and the remainder of the stipend, tuition and health insurance.


Postdoctoral fellowship:

  • 12-month fellowship
  • The opportunity to compete for reappointment (for a total of two years) if progress is satisfactory
  • NIH-level FY19 annual stipend of $50,004 and above, depending on the years of science-related experience since receipt of the doctoral degree
  • Partial support for tuition and fees for required courses, health insurance (not fringe) and travel
  • Primary mentor/department is responsible for the fringe, and any stipend supplementation needed to bring the trainee up to her/his former stipend level.

Both trainees and their mentors have responsibilities to fulfill in return for receiving support from a Keck Center training program.

By applying for a training fellowship, applicants and their mentors are agreeing to fulfill these responsibilities, as well as any additional responsibilities detailed in the application and curriculum pages.

For those responsibilities that fall to the trainee (e.g. curriculum, seminar attendance), the primary mentor agrees to support the trainee’s participation in all required didactic courses and NLM activities (seminars, conferences).

Furthermore, both mentors agree to attend the Keck Annual Research Conference each fall during the trainee’s fellowship (and serve as a poster judge at least one year), attend the original fellowship interview and subsequent annual progress reviews, and regularly interact and communicate to ensure effective supervision and coordination of the trainee’s project.

Finally, the trainee and both mentors agree to the following requirements:


  • completion of required and elective courses of trainee’s individual curriculum
  • completion within trainee’s first appointment year of an approved course in the Responsible Conduct of Research
  • completion within trainee’s first appointment year of UTHealth’s BMI 5310W course.


  • full time work on trainee’s NLM project, defined by the NIH as a minimum of 40 hours per week
  • attendance at the weekly Keck Seminars
  • attendance at the monthly NLM fellows meetings
  • attendance / poster presentation at the Keck Annual Research Conference
  • attendance and, if chosen, presentation at the NLM Annual Training Conference in June.

Communication / dissemination:

  • advance notification to the Program Director and program administrator of any major changes to the trainee’s project or mentors
  • trainee’s submission of an annual written progress report, and course transcripts (if applicable)
  • trainee’s presentation to the Steering Committee and in the presence of both mentors at an in-person annual progress review
  • submission of the final copy of any publications resulting from research conducted while the trainee holds this fellowship, and timely acquisition of an NIH PMCID number for these publications
  • acknowledgement of Gulf Coast Consortia and NLM support in posters and publications
  • trainee’s coordination of his/her project between the two mentors
  • trainee’s updating the Gulf Coast Consortia on publications, awards, and current position for 15 years after the fellowship ends (an NIH requirement for progress reports and grant renewals).

NLM Steering Committee

Lydia Kavraki, Program Director, Rice University

Elmer Bernstam, Program Co-Director, The University of Texas Health Science Center at Houston

Rui Chen, Baylor College of Medicine

Chris Tsz-Kwong Man, Baylor College of Medicine

Luay Nakhleh, Rice University

Nicholas Navin, The University of Texas MD Anderson Cancer Center

B. Montgomery Pettitt, The University of Texas Medical Branch at Galveston

Badri Roysam, University of Houston

Marina Vannucci, Rice University

NLM Training Faculty:

The NLM Training Program was renewed for another 5 years effective July 1, 2017, with a new training faculty. Faculty members who were on the previous training faculty, or prospective faculty, will find the instructions on how to request joining the renewal faculty on the Faculty Eligibility page.

Please note that requests to join the NLM training faculty are only considered when faculty are sponsoring a candidate in the current call for applications.

Current NLM Faculty

Faculty eligibility to mentor an NLM applicant:

  • Primary and secondary mentors must be tenure-track faculty, and members of the NLM Training Program faculty, or be provisionally approved in advance of their candidate’s application.
  • Please note that membership in the Gulf Coast Consortia, or in the training faculty of another GCC grant, does not automatically confer membership in the NLM Training Program. Faculty must request to join each training faculty, as decisions are made by individual Program Directors.
  • Requests to join the training faculty will be considered only when the faculty member is sponsoring a candidate in response to the current call for applications. Requests must be submitted no later than the due date of the Letter of Intent (LOI), preferably earlier.
  • A faculty member may be the primary mentor for only one NLM trainee at a time; they may sponsor another applicant in a call for applications shortly before the current trainee will end his/her fellowship.
  • Primary mentors are usually expected to have prior training experience, for example by having previously been a primary or secondary mentor in a training program.
  • However, if a young faculty member does not have previous training experience, a member of the NLM Steering Committee will be assigned to “mentor the mentor,” to counsel only on matters relating to trainee guidance and evaluation (i.e. “best practices”), and not on the research in question.
  • A faculty member may be the secondary mentor for more than one trainee.

How to request joining the NLM training faculty: 

Email the following information to Melissa Glueck by the LOI date specified in the Call for Applications:

  • a 5-page NIH Biosketch (in Word format) including peer-reviewed funding; faculty are expected to be PI on an NIH R01 grant or equivalent peer-reviewed federal funding, and to have a training history;
  • a few sentences about how the faculty member’s research interests relate to one of the domains of the NLM Training Program (see list in NLM Overview section);
  • whether the faculty wishes to sponsor a predoctoral/postdoctoral candidate as primary/secondary mentor; and
  • the name and institution of the proposed co-mentor.

This will be forwarded to the Program Director for consideration. The faculty member will be reviewed and notified of eligibility to join prior to the application deadline.

Should his/her applicant be appointed to an NLM fellowship, then the faculty member will be added to the NLM training faculty. If his/her applicant is not appointed, then the faculty member will not be added to the training faculty at that time. This is to ensure an active training faculty.

US citizens or Permanent Residents (who already have their “Green Card”) who are students enrolled in a PhD program, or are postdocs affiliated (or who will be affiliated) with the following Gulf Coast Consortia member institutions are eligible to apply:

  • Rice University
  • Baylor College of Medicine
  • The University of Texas Health Science Center at Houston
  • The University of Texas Medical Branch at Galveston
  • The University of Texas MD Anderson Cancer Center
  • University of Houston
  • Institute of Biosciences & Technology (IBT) – TAMHSC.

In addition: Predoctoral applicants:

  • must be currently enrolled at one of the 7 GCC member institutions, in a degree-granting program whose end point is a PhD.
  • Students enrolled in an MS program are not eligible to apply.
  • MD/PhD students are eligible for support during the PhD-period of their training.
  • Students must already have been accepted into the lab of a tenure-track faculty member, and develop a project involving a secondary mentor, also a tenure-track faculty member. Both faculty members must already be members of the NLM Training Program faculty.
  • Applicants enrolled in a bio-related degree program will be expected to take computational/informatics-oriented coursework, and vice versa. These additional course requirements will be discussed individually at the initial interview (and later in annual progress reviews) with the NLM Training Program Steering Committee to develop a customized curriculum for the trainee. See Curriculum page for more details.

Postdoctoral applicants:

  • must be currently employed by a GCC member institution, have a job offer from an eligible mentor (i.e. a tenure-track member of the NLM Training Program faculty, or a tenure-track faculty member who is applying for membership), or be in a position to accept an offer from such a mentor, and begin by the required start date named in the Call for Applications.
  • must be in possession of their PhD, MD, or other doctoral degree when they apply for a postdoctoral fellowship. Those who have completed the requirements for their PhD but will be awarded it at a later date must supply a letter from their institution verifying this.
  • whose doctorate is in informatics or a closely related field such as computer science may be required to take selected courses as determined by the NLM Steering Committee to support the interdisciplinary nature of their research project; see the Curriculum page for more details.
  • whose doctorate is in another subject area (e.g. medicine) must follow a program of approved courses in biomedical informatics methods and applications most relevant to their project (prior graduate work in informatics will be considered); see the Curriculum page for more details.
  • must have as their primary mentor the faculty member in whose lab they are employed; trainees must develop a project involving a secondary mentor, also a tenure-track member of the NLM Training Program faculty.
  • Please note that the NLM funding agency expects postdocs to commit to staying for two years of fellowship, in order to become truly trained in biomedical informatics.

Preparing to apply:

  • Choose 2 mentors from 2 different areas: one biomedical, one computational.
  • It is preferable that mentors come from different departments, and different institutions.
  • Both mentors must be tenure-track faculty.
  • Both mentors must be members of the NLM Training Faculty at the time of application, or be provisionally approved before the candidate’s application is due.

Letters of Intent and requests to join the NLM Training Faculty are both due by the date named in the Call for Applications above.

  • See Faculty Eligibility section for details of what to send to request joining the faculty. If you are a member of the GCC or of another training grant, this does not automatically make you a member of the NLM faculty.
  • Please email the Letter of Intent to Melissa Glueck and include:

1. name, title, and institutional affiliation of the applicant, primary mentor, and secondary mentor.

2. brief abstract of the proposed project (naming the domain of biomedical informatics to which it belongs; see Purpose of the NLM Training Program for the domains), and a few sentences about the mentoring plan, describing the skills and knowledge the trainee will receive from each mentor. If you choose, the mentoring plan can also be the longer one you intend to submit with the application, but it is not required to be.

You are required to contact your NLM Steering Committee member before submitting your application: Once the mentoring plan and project have been outlined in the LOI, applicants are required to discuss their proposed project with the NLM Steering Committee member at the applicant’s home institution before completing the application, to make sure that their proposed project is appropriate for the NLM Training Program. See NLM Steering Committee section for contact information.

Components of the application: 

From others:

1. Mentor Recommendation form from both of the mentors (link to this form). Mentors are to email their completed forms directly.

2. Two (2) Letters of Recommendation from people other than your mentors. The recommenders should email them directly; these letters must be received by the application due date. These recommendations are not required to be on letterhead or signed, as long as the name, title, department and institutional affiliation are included in the text. The recommenders may be former professors, employers, etc., and should describe qualities that they observed in the applicant which qualify him/her to be an NLM trainee.

From trainee applicant: Email all of the following items in one email.

1. Fellowship Application form (link to this form). This form asks for educational background, published papers, and:

  A. Project description (750 words): Describe the proposed research and explicitly connect it to one of the three domains of biomedical informatics that are supported by the NLMTP.
  B. Layman’s description of project (250 words)
  C. Mentoring and Training Plan (400 words): special emphasis should be placed on this section:
    1. Clearly describe how the two mentors will collaborate to train the applicant in both of their disciplines, and what the trainee will learn from each mentor (skills, knowledge, etc.).
    2. Include professional development training that the trainee will attend, and national conferences at which s/he will present.
    3. Include specific milestones and an estimated timeline for completion of the graduate degree or postdoctoral training.
  D. Career Goals (250 words): Describe your overall career goals and how they relate to this training program. Describe the education, training and other career development experiences you will need to achieve your career goals, and how this training program will help you achieve them better.
  E. Curriculum: Complete the curriculum section with your proposed courses. The NLM requires coursework from both predocs and postdocs. Together, these courses should define a coherent curriculum (not just a list of random courses; see the NLM Curriculum sections for details). You may request that previously taken graduate coursework be counted as one or more of your electives; however, you must also include 3 electives from the approved list of electives, in case your prior coursework is not approved.

2. CV or resume: Email a current resume or CV outlining your professional work experience and academic history, including the dates you received your degree(s) and dates of anticipated degree completion.

3. Statement of Relevance: A brief statement (maximum one page long, double spaced, 11 pt Arial) that:

  • describes in which of the above-named three domains your project will be, with a clear justification of why this is the case. Please start your statement with the sentence: “This project falls into the domain of translational (etc.) bioinformatics.”
  • explains the relationship of your proposed project to possible future applications in translational biomedicine/biomedical informatics, that is, what your contribution to the informatics field will be.

The applicant may consult with the mentors, and solicit their opinion about this statement, but the statement of relevance should be written solely by the trainee. This statement is an overview of your project. It is different from the project description in your application, which should include the specific aims/measurements you will take in your project.

4. Transcripts are required from predoctoral AND postdoctoral applicants: official copies are preferred, but unofficial copies will be accepted. Email the following as PDFs (not photos):

  • Undergraduate transcripts from all institutions you attended
  • Master’s degree transcript (if applicable)
  • PhD students: current transcript, including all courses since you entered graduate school through the current semester.


  • Postdocs: a transcript from your doctoral degree, which includes a listing of courses attended (not just a diploma); documentation must also include the official receipt date of your doctoral degree (e.g. “conferred May 9, 2018”). If the transcript is in a foreign language, you must submit an English translation for all degrees; you may use a copy of the transcripts that you provided to your institution.

5. Proof of US citizenship (birth certificate or passport; a driver’s license is not proof of citizenship) or Permanent Residency (“Green Card”): email a clear PDF scan (not a photo). You must already have received your Green Card to be eligible to apply. If a Permanent Resident is appointed to a fellowship, s/he must provide a notarized copy of the card to the GCC to submit to the agency. Inquire in your department about finding a Notary Public to do this.


Applications are reviewed and selected applicants are invited to interview with the Program Director and the NLM Steering Committee. All applicants will be notified via email whether or not they have been selected for an interview. Interviews are a total of 20 minutes long, and are held within one month of the application deadline. Interviews include a brief presentation of the research project, planned coursework, and mentoring plan.

Both mentors are expected to participate in the interview in support of their trainee.

See the section Instructions for Interviews for details about limits on time and number of slides.

The NLM requires a specific curriculum for both predoctoral and postdoctoral trainees.

Your curriculum plan must include all elements of training listed here, as well as a clear timeline for fulfilling them (e.g. first year Fall semester courses X and Y). The timeline should reflect the actual semesters when the courses are offered; see the separate attachment “NLM Courses” for links to course catalogs and schedules of approved electives at each institution. Trainees are encouraged to suggest courses that are not included in this attachment if these are relevant to the mission of the NLM Training Program and their own individual training.

Together, these courses should define a coherent curriculum (not just a series of random courses) that reflects both the NLM domain in which an applicant’s project falls, and our training grant’s central emphasis of training in data-driven discovery, data science, and machine learning for biomedical informatics.


  • All trainees are expected to complete at least one course per semester and must receive a grade of B or higher for the course to count towards their curriculum.
  • Courses in the Responsible Conduct of Research and Foundations of Health Information Sciences I (BMI 5310W, UTH) must be taken in the first semester possible, and definitely completed during the trainee’s first year of appointment.
  • Timely progress towards completion of curriculum requirements is part of the evaluation process during a trainee’s annual progress review.

Each trainee will receive advising and specific recommendations about coursework appropriate to their background and research interests during their initial interview and subsequent annual progress reviews with the Program Director and NLM Steering Committee. Changes made to the trainee’s curriculum are binding.

Note for postdocs: Applicants who have a doctorate in informatics or a closely related field such as computer science may be required to audit selected courses as determined by the NLM Steering Committee, in order to support the nature of the postdoc’s research project, or to get an introduction to the broad field of biomedical informatics and data science. Applicants whose doctorate is NOT in informatics or a closely related field such as computer science (e.g. a doctorate in biology or an M.D.) must audit a program of approved courses.

Required courses for predocs and postdocs:

1. Graduate Tools and Models – Data Science, COMP 543, 4 credits, Rice University. This graduate course requires prior sophisticated knowledge in computer science and statistics.

2. Foundations of Health Information Sciences I – must be taken in first year of appointment, BMI 5310W (web-based version), 3 credits, School of Biomedical Informatics, UT Health Science Center at Houston. Offered in the fall and spring semesters. This course should be taken in the first semester possible, but definitely must be completed within the first appointment year.

3. An approved Responsible Conduct of Research (RCR) course – must be completed during first year of appointment. An RCR course must be taken every 4 years at your current training level (as a PhD student or as a postdoc): e.g. a postdoc who took RCR 2 years ago as a grad student must now take a postdoc course. If you have completed one of the courses named below at your current level within the past four years, name this course in your curriculum plan and submit a transcript showing completion as part of your application. BCM students must be completing their Ethics Years 1-4 in a timely manner.

Approved courses are:

  1. Rice University: Training in the Responsible Conduct of Research, UNIV 594, 1 credit. Offered in the fall semester. For predocs and postdocs.
  2. Baylor: Predocs: Responsible Conduct of Research (4 credits, taken over 4 years), GS-GS-5101-5104; Postdocs: Ethics for Postdocs.
  3. UH: Responsible Conduct of Biological Research, BIOL 6120, 1 credit. For predocs and postdocs.
  4. UTHSC-H: The Ethical Dimensions of the Biomedical Sciences, GS21 1051, for predocs. Postdocs: Postdoctoral Ethics course that is part of the Postdoc Certificate series.
  5. MDACC: Predocs take UTH’s GS21 1051. Postdocs complete the MDA Postdoctoral Ethics course.
  6. UTMB: Ethics in Scientific Research, MEHU 6101, for predocs and postdocs. Usually offered over two consecutive days in May and November.

4. Professional Development for Biomedical Informatics Professionals, COMP 573, 1 credit, Rice University. Dr. Kavraki’s permission as instructor is required to enroll. This course is offered in the Spring semester of even-numbered years.

5. GCC Rigor and Reproducibility workshop. Half-day workshop offered in spring and fall. Additional optional hands-on workshops offered in summer.

6. Three electives. The electives can be used to build the trainee’s background for the Graduate Tools and Models – Data Science class COMP 543 that is required, and to enhance the trainee’s knowledge in the domain of biomedical informatics into which the trainee’s project fits: healthcare/clinical informatics, translational bioinformatics, and clinical research informatics. Advanced courses are expected in this category. Graduate courses that the trainee has taken in the past and are relevant to the training can be counted for the three electives at the discretion of the NLM Steering Committee. See the NLM Electives section for a list of approved electives.

7. Predocs only: during the first year of appointment, 9 doctoral research hours. Appointed trainees will receive more information. New applicants are not required to address this in their application.

Other requirements:

1. Attendance at the weekly Keck Seminars, held Fridays at 4:00 pm during the Fall/Spring semesters (officially Rice University course BIOS 592, Topics in Quantitative Biology and Biomedical Informatics). Enrollment for course credit is required for Rice predoctoral trainees. Trainees from other institutions should not enroll, because that would use up 1 of the 12 allotted inter-institutional course credits needed for required coursework.

2. Attendance and self-organization of monthly NLM trainee meetings, held the 2nd Friday of the month (Sept. – May) at 3:00 pm in the BRC.

3. Attendance and poster presentation on your fellowship project at the Keck Annual Research Conference (held in Oct. or Nov. in the BRC).

4. Attendance and, if selected, presentation of your fellowship project at the annual NLM Informatics Training Conference held in June at various locations. Travel funds are provided by this fellowship.

NLM Approved Electives

See the “NLM Curriculum Plan” section for required courses.

Inter-institutional Course Registration

Trainees whose training programs require them to attend courses offered at other institutions may register using an inter-institutional course registration form. Trainees will pay tuition to their home institution only; however, trainees must pay any required fees/lab fees to the institution they are “visiting.”

There is a cumulative limit of 12 credit hours that may be taken at other host institutions combined.

There is one form that has been agreed upon by all registrar’s offices at GCC institutions, which is available on the Rice registrar web page.

Although this form is located on the Rice web site, it was agreed upon by the registrars of all participating institutions, so it may be used by graduate students from one GCC institution to attend courses at another GCC institution.

However, please check with your institution’s registrar, as some prefer that you use their institution-specific form (e.g. BCM: link to web page with form).

Please note that these forms require several signatures (from advisors, instructors, and the registrar) in a certain order, from both institutions, so allow adequate time for procuring these signatures.

The registration period for students from other institutions and the due date for these registration forms vary by institution. It is the trainee’s responsibility to return the required form before the registration deadline. Late fees will not be reimbursed.

Selected applicants and both of their mentors will be invited for an interview before the NLM Program Director and Steering Committee. All interviews will take place on one day. All of the below should be in your presentation.

Prepare a brief oral presentation: 8 minutes maximum, with a maximum of 6 PPT slides + title slide + acknowledgement slide. Do not embed movies in your slides. The interview will last a total of 20 minutes, including your 8-minute oral presentation followed by questions from the reviewers. Should the committee interrupt your presentation with a question, the time clock will be paused, so you can be assured of the full 8 minutes of presentation time. Please practice so that you do not exceed this 8-minute time limit (hint: do not get bogged down in the details of the experiments).

Aim to keep your presentation simple; you will be assessed on how well you can explain things concisely to a broad audience, as the NLM Steering Committee includes faculty from various disciplines related to biomedical informatics.

Your presentation should provide an overview of your proposed research project and should address the following:

1. the interdisciplinary nature of the project and approaches you propose to use, emphasizing the biomedical informatics, data science, and/or computational aspects. Highlight the parts of your project that are novel.

2. the NLM domain to which your research relates, and the importance of your project to this domain (See NLM Overview section for the domains.). This need not be lengthy or highly detailed but it should be explicitly stated. Oftentimes this may be obvious to you and others in your specific area, but it may not be clear to a diverse, interdisciplinary review group.

3. the roles that both your primary mentor and co-mentor will play in the proposed research / mentoring plan (what you will learn from each of them), and especially how the two mentors together will enable you to conduct research that could not be accomplished with either alone.

4. your timeline for required and elective courses. This should be detailed in your slides and reflect the actual semesters in which the courses are offered (see the NLM Courses section for links to course schedules). If you request adding a new elective, you must list the approved elective you will take if your request is not approved. Be sure to include all required parts, e.g. Responsible Conduct of Research, COMP 573 Professional Development for Biomedical Informatics Professionals, etc.

Please note that this is an historical page describing an NLM administrative supplement-funded data science summer undergraduate program that took place in 2019. This program will not be offered in summer 2020. 

May 29 – August 2, 2019, in Houston, Texas. Funded by the National Library of Medicine, 3T15LM007093-27S1

The overall goal of our 10-week summer program is to familiarize rising junior or senior undergraduates with the fundamentals of biomedical informatics and data science, and the research tools and methods used in these areas.

The summer program begins with a mentored one-week boot camp on biomedical informatics (e.g. mining of biomedical literature, network-based analysis of biomedical data, machine learning for drug design and interactions), a second one-week boot camp on data science (e.g. elements of Python, elements of supervised and unsupervised learning, logistic and linear regression), and an assessment of Python knowledge and experience.

Participants will then be matched with advanced PhD students and postdoctoral fellows according to their interests and expertise and the needs of the research projects, and will work with them full time (~40 hours/week) for weeks 3-10. Students will also meet at least weekly with biomedical informatics and data science mentors, who will monitor their work and provide hands-on help throughout the summer. Participants will conclude by creating an abstract, scientific report, and poster on their research, and presenting in a multi-program poster session.

Activities include:

  • workshops on the responsible conduct of research;
  • workshops on science communication: how to write an abstract and brief scientific report, how to design and effectively present a poster;
  • workshops on how to apply for graduate school, medical school, or MD/PhD programs;
  • lab tours, including a tour and demonstration of a 200-diagonal-inch visualization wall that provides a real-world example of how big data can be accumulated and utilized.

Eligibility: Rising junior or senior undergraduate students who are US Citizens, or Permanent Residents who already have their Green Card. We particularly encourage applications from students who are members of groups under-represented in the biomedical sciences. Preference will be given to applicants interested in attending graduate school or medical school, who have relevant coursework. Strong preference will be given to candidates who know and have used Python and have a computational background.

Stipend: This program provides a Student Participation Allowance of $7,050. Participants are responsible for paying their own travel and housing expenses.

Application deadline: all materials, including letters, are due by February 1, 2019:

  • 2 letters of recommendation: One must come from an instructor in science or math/computer science. The other may be from an advisor, mentor, or employer with whom you worked for 2+ months within the past four years. Letters should address why you are suitable for this program, in terms of academic and personal Letters must be sent directly from the recommender to Melissa Glueck at

Submit your application materials in one email, including:

  • One-page statement of interest: why you are applying to this program, what you hope to gain from this summer research experience, your career goals. Format: 11 pt, single-spaced, 1” margins, name at the top.
  • Resumé: include your contact information (i.e. cell phone number), the name of your college or university, your year and major, and current GPA; list all previous jobs you have had (including jobs in a lab, if any); include a paragraph with all relevant biological and quantitative coursework and expertise, including any computer language / programming courses and/or experience you have.
  • Academic Transcripts from all undergraduate institutions attended, including course enrollment in the current semester. Unofficial electronic copies are acceptable.

All applicants with complete applications can expect to be contacted about their status on or before March 8, 2019.

10-year Trainee Outcomes:

During the period 2010-2019, 29 predoctoral and 32 postdoctoral trainees were supported on this training grant, including those currently appointed.

Of the 29 NLM predoctoral trainees (9, or 31% female) who began graduate school between 2010 and 2019:

  • 17 trainees have completed their PhDs. Of those,
    • 10 are in academia (1 Assistant Professor, 1 Assistant Research Professor, 2 Assistant Teaching Professors, 4 postdocs, 1 science teacher, 1 medical student, 1 Resident);
    • 3 are in biotech (e.g. data scientists, bioinformatics scientists);
    • 2 are in industry (e.g. data scientist, The Boston Consulting Group); and
    • 2 are in government (NASA, a national lab).
  • 12 trainees (9 current trainees, 3 former trainees) remain enrolled in good standing in their respective schools, and all are expected to complete their PhDs on a timely basis; the 3 former trainees and 1 current trainee anticipate completing their PhDs by the end of 2019.

Of the 32 NLM postdoctoral trainees (10, or 31% female) who began their postdoctoral positions between 2010 and 2019:

  • Of 26 former trainees,
    • 15 are employed in academia (1 Associate Professor, 4 Assistant Professors, 1 instructor, 6 postdoctoral researchers, 1 research associate, 2 clinical oncologists);
    • 9 are in biotech/industry (e.g., Philips Healthcare Research, Foundation Medicine);
    • 1 is an oncologist; and
    • 1 is seeking a position researching cancer biomarkers in the pharmaceutical industry.
  • 6 trainees are currently appointed.

Updated 7/22/2019

The Gulf Coast Consortia is committed to providing equal opportunity in training for individuals with disabilities and individuals from racial and ethnic groups who are currently under-represented in STEM fields. We welcome applications from all qualified trainees, regardless of ethnicity, race, or disability status. All GCC member institutions are ADAAA compliant and have offices of disability support services that provide accommodations and support services to trainees, faculty, staff, and visitors.

For any questions not answered by the information on this web page, please contact the administrator of the NLM Training Program, Melissa Glueck at

Last updated 07/18/2019