Hiring During COVID-19: Fannie Mae is hiring for all open positions as we deliver on our mission of providing stability, liquidity, and affordability to the housing market during this critical time. All interviews and onboarding are conducted virtually. We look forward to connecting with you.

Data Scientist (Pipeline)

Job Description

Fannie Mae provides reliable, large-scale access to affordable mortgage credit in communities across our nation. We are the leading source of funding for housing in America, which means more people can buy or rent a home. We are focused on sustaining the housing recovery, improving our company, and leading change to make housing better. 
We are looking to build our network of interested candidates with a Data Science background and invite you to share your information with us. Let us know if you would be interested in working on a high-performing team and helping make homes more accessible.
Fannie Mae has multiple Data Science opportunities available across our Enterprise Data, Modeling and Analytics division. For more information about Fannie Mae, visit http://www.fanniemae.com/progress
Please note that this invitation is NOT an active opening/posting. Submitting an application constitutes an expression of interest in current or future similar openings at Fannie Mae. A recruiter will review your qualifications and, if a position opens that aligns with your skill set, you may be contacted.
Data Scientist Job Description:
Fannie Mae is expanding its data science talent to further push the frontiers of modeling and advanced analytics. Are you passionate about advanced analytics algorithms and creating new data science tools and technologies? Do you have creative and innovative approaches to developing new analytics techniques? We’re seeking data scientists who have domain knowledge of or an interest in Big Data, machine learning, natural language processing, and image processing, and an interest in applying them to economic and financial applications. Are you looking to innovate the next generation of data analytics solutions with diverse data sets and leading-edge analytics use cases? If you are ready for an exciting opportunity working hands-on with the world’s most advanced data science technologies, and you thrive in a highly dynamic environment where you are counted on to develop advanced analytics products, this might be the role for you.
Minimum Qualifications:
  • Work or educational background in one or more of the following areas: operations research, computer science, mathematics, data science, business analytics, or knowledge management.
  • Demonstrated experience programming with R/Python, Linux, and Spark in an AWS cloud environment, or knowledge and algorithmic design experience in C#/C++ (3+ years)
  • Demonstrated experience with SQL and relational database technologies, such as Oracle, PostgreSQL, MySQL, RDS, Redshift, Hadoop EMR, Hive, etc.
  • Demonstrated experience processing structured and unstructured data sources, including data cleansing, data normalization, and preparation for analysis
  • Demonstrated experience with machine learning techniques including natural language processing.
  • Demonstrated experience with code repositories and build/deployment pipelines, specifically Jenkins and/or Git.
  • Demonstrated experience using Apache Hadoop and/or Apache Spark stack for big data processing, or comparable distributed computing platforms.
  • Demonstrated experience using data streaming technologies such as Kafka, RabbitMQ, NiFi, Kinesis, or comparable tools
  • Demonstrated experience using Tableau, Kibana, QuickSight, or other similar data visualization tools.
  • Ability to handle terabytes of time-series and cross-sectional data and extract well-defined alpha from the underlying relationships
  • Thorough understanding of statistical methods for optimizations (Linear / Non-linear / regressions / Neural Networks / ARIMA / VAR / SSPQN)
  • Very comfortable working with ambiguity (e.g. imperfect data, loosely defined concepts, ideas, or goals)  
Preferred Qualifications:
  • MS in Computer Science, Statistics, Math, Engineering, or related field, PhD preferred
  • 3+ years of relevant experience in building large scale machine learning or deep learning models and/or systems
  • 1+ year of experience specifically with deep learning (e.g., CNN, RNN, LSTM)
  • Demonstrated skills with Jupyter Notebook, AWS SageMaker, Domino Data Lab, or comparable environments
  • Passion for solving complex data problems and generating cross-functional solutions in a fast-paced environment
  • Knowledge of Python or C++ / C#, SQL, object-oriented programming, and service-oriented architectures
  • Strong scripting skills with Shell script and SQL
  • Strong coding skills and experience with Python (including SciPy, NumPy, and/or PySpark) and/or Scala.
  • Knowledge and implementation experience with statistical and machine learning models (regression, classification, clustering, graph models, etc.)
  • Hands-on experience building models with deep learning frameworks like MXNet, TensorFlow, Keras, Caffe, PyTorch, Theano, or similar
  • Experience with search architecture (e.g., Solr, Elasticsearch)
Data Scientist
Education Level Preferred
  • Master's, PhD, or Other Advanced Degree
  • 2-4 years of experience
Senior Data Scientist
Education Level Preferred
  • Master's, PhD, or Other Advanced Degree
  • 6-8 years of experience
Data Scientist Manager
Education Level Preferred
  • Master's, PhD, or Other Advanced Degree
  • 8-10 years of experience

Req ID: 59260