The Data Science Institute of Columbia University invites applications to fill positions of Staff Associate and/or Sr. Staff Associate.

The Data Science Institute of Columbia University’s mission involves advancing the state of the art in data science and transforming all fields, professions, and sectors through the application of data science, while ensuring the responsible use of data for the benefit of society.

We are looking for a research programmer to serve as a system engineer on a large research project involving search and summarization over multilingual text and speech documents. Columbia University is the lead organization and there are four other universities on the team. The candidate will be responsible for maintaining and modifying the system architecture, developing new system components as required, creating and maintaining web resources in the languages we are required to work in, overseeing evaluation of the system, and managing a set of GPU machines for the group.

The programmer would work with a team from multiple universities on a project called SCRIPTS. SCRIPTS (System for CRoss language Information Processing, Translation and Summarization) is a planned end-to-end system for retrieval of speech and text documents in low- to medium-resource languages (e.g., Uyghur or Arabic). In order for the system to determine relevance, speech documents must be automatically transcribed and both text and speech documents must be translated. There will also be a summarization component that will produce a short paragraph summarizing the relevant portion of each document, which an end user can use to determine whether the document is what s/he was looking for. The project is funded by IARPA and the team will participate in multiple evaluations to demonstrate how well the system works.

For a Staff Associate position, the candidate is required to have a Bachelor’s degree and 4 years of related experience or a Master’s degree and 2 years of experience.

For a Sr. Staff Associate position, the candidate must have a Bachelor’s degree and 8 years of related experience or a Master’s degree and 4 years of experience.

The following experience and skills are preferred for the successful candidate:
Experience managing large software projects.
Strong Java, C++, Python and Perl programming skills and a close familiarity with core and third-party libraries.
Familiarity with software engineering tools such as GitHub, Docker and one-click build and test.
Knowledge of deep learning and GPU machines preferable.
Skill with processing, analysis and storage of big data.
Experience with Linux system administration.
Knowledge of Human Language Technologies (NLP, IR, ASR).
Experience with Network Analysis preferable.
Good at switching between tasks and assisting project members.
Skilled at interacting with internal and external partners.
Devoted user of industry best practices and test-driven development.

Ability to work under deadline pressure and deliver required results on time.

Strong analytical and quantitative ability. 
Strong interpersonal, organizational and communication skills, and willingness to work with multiple researchers and PhD students. 

Screening of applicants will begin on October 30th, 2017. The search will remain open for no less than 30 days from the date of posting.

Applicants should apply through our online RAPS system:
academicjobs.columbia.edu/applicants/Central?quickFind=65271

Columbia University is an Equal Opportunity/Affirmative Action Employer.

  • Jonathan Stark
    Data Science Institute
    Columbia University in the City of New York
    550 West 120th Street
    NWC Building, Suite 1401
    New York, NY 10027
  • jrs2139@columbia.edu
