1:1 Research Projects

Check out some of the incredible projects our AI + X graduates have completed!

Entries headed by a journal or science fair name were published in research journals or presented at science fairs.
Journal of Emerging Investigators
Diagnosing Hypertrophic Cardiomyopathy Using Machine Learning Models on CMRs and EKGs of the Heart

In this project, we present a pair of models, a CNN and a Long Short-Term Memory (LSTM) model, that classify cardiac magnetic resonance (CMR) scans and electrocardiogram (EKG) recordings, respectively.
Surya K. | Summer 2022
Mentored by Sriram Hathwar
Generative Adversarial Fusion for Supercell Thunderstorm Forecasting

This paper proposes a novel machine learning model that attempts to address this problem by taking advantage of two different types of crucial data.
Yash J. | Fall 2022
Mentored by Sarthak Kanodia
T-RECSYS+: An Improved Music Recommendation System

In our research, we build a music recommendation system to predict users' listening preferences.
Zhou C. | Summer 2022
Mentored by Ross Greer
Exposing Undercounts in the Census through Regression Modelling

Although many community leaders have proposed that language barriers pose significant obstacles to Census outreach, this paper explores the viability of using predictive models to quantify the extent of the role language plays.
Tarun S. | Fall 2022
Mentored by Katie O'Nell
Predicting the Price of New York City Airbnbs

How can one predict the price of a New York City Airbnb? We are building a machine learning model that predicts the price of an NYC Airbnb with high accuracy from a set of listing factors.
Bobby B. | Summer 2022
Mentored by Tomer Arnon
Multi-Label Prediction of Protein Subcellular Localizations through Machine Learning and State-of-the-art Structural Embeddings

This research explores various machine learning models to predict protein subcellular localization using UniProt’s Swiss-Prot database and Meta’s high-performing ESM2 structural embeddings.
Alfred B. | Summer 2022
Mentored by Ayush Pandit
2nd Place at Contra Costa Science and Engineering Fair
Optimizing Prediction Accuracy Using Advanced Ensemble And Voting Classifier Methods

This project observes how various machine learning models, once tuned, can further be combined to create a complex model that uses NFL data from the past 18 years to predict the outcomes of matchups between any two competing teams.
Ashray P. | Summer 2022
Mentored by Christopher Mauck
Predicting Populations: Modeling Demographic Predictions for Nations Around the World Using Population Pyramids and Demographic Transition Models

In this research project, I used multiple machine learning models (a neural network and linear regression) to predict key demographic statistics over the next 5 years for each nation.
Michael Z. | Summer 2023
Mentored by Kasra Koushan
Examination of Dragonflies (Pantala Flavescens): Characteristics, Identification and Migration Paths

This study explores the Pantala flavescens species' anatomy and mechanisms for long-distance migrations.
Alya K. | Winter 2022
Mentored by West Foster
Predicting International Commodity Prices, Production, and Land Usage Towards Reducing Agricultural Emissions

Our long-term goal for this research project was to analyze food insecurity and determine if we could predict the emissions and pricing of different food commodities.
Sahir G. | Summer 2023
Mentored by Landon Butler
Unmasking Fraud: Applying Machine Learning to Detect Bank Drops

This paper evaluates eight machine learning (ML) models that are capable of flagging bank accounts as fraudulent at the time of bank account registration.
James S. | Fall 2023
Mentored by
Improving User Retention in Video Game Industry

This research paper delves into the intersection of player data analytics and affective computing to predict and adapt game difficulty levels for the purpose of enhancing player retention in the gaming industry.
Ethan V. | Summer 2023
Mentored by Anjali Singh
Journal of Student Research
Machine Learning Approaches to Detect Brain Tumors from Magnetic Resonance Imaging Scans

Our study utilized a comprehensive dataset of brain magnetic resonance imaging (MRI) scans to compare and assess the performance of different baseline AI models.
Cherry (. | Winter 2022
Mentored by Shreya Parchure
Predicting State-Wide Cotton Yield using Geospatial Data

The focus of this study was to use Geospatial Linear Regression to predict cotton yield.
Gregory G. | Summer 2023
Mentored by Ribhav Gupta
workspace_premium
Using Machine Learning to Detect Alzheimer’s Disease in MRI Scans

We aimed to determine whether magnetic resonance imaging (MRI) scans, which are often used to diagnose other neurological disorders, can be used to diagnose Alzheimer’s disease (AD) in patients.
Sam L. | Summer 2023
Mentored by Ivan Villa-Renteria
Using Neural Networks to Predict U.S. Corporate Profits on Electronic Goods

The goal of this project is to train two neural network AI models: a Multi-Layer Perceptron (MLP) neural network and a Long Short-Term Memory (LSTM) neural network, to predict U.S. corporate profits on electronic goods into the future.
Will K. | Summer 2023
Mentored by Ana Sofia Muñoz Valadez
Generating Instagram Captions with ViT-GPT2 and GPT3

This paper presents a novel approach for generating Instagram captions based on visual features and language models. Our caption generator combines Vision Transformer-GPT2 and GPT3 to generate descriptive and engaging captions in the style of an Instagram post.
Ariel M. | Summer 2022
Mentored by Roger Jin
Vibration Analysis

This paper compares the accuracies of an artificial intelligence model that classifies sets of sensor outputs (in this case, machinery shaft vibrations) into varying levels of weight on the shaft.
Akshay N. | Fall 2022
Mentored by Tim Gianitsos
Curieaux Academic Journal
Investigating Data Augmentation Strategies for Computer Vision Facial Expression Recognition

I aim to help people with autism better recognize emotions by developing improved artificial intelligence (AI) models to recognize facial expressions.
Jack L. | Fall 2022
Mentored by Peter Washington
A Methodology for Learning Airplane Descent Patterns

In this study, we trained a plane to land without human assistance.
Michael T. | Fall 2023
Mentored by Sean Konz
Texas High School Dropout Rates

Dropout rates in high schools throughout Texas have been steadily increasing, especially following the COVID-19 pandemic. This project aims to address this problem through an artificial intelligence algorithm that can predict the dropout rate of a Texas high school campus based on specific characteristics of the school.
Emily J. | Summer 2022
Mentored by Philip Bell
Generation of Research Paper Titles

Can NLP accurately and effectively generate research paper titles? This research paper evaluates various models and methods for title generation to identify an effective and accurate NLP model.
Christopher G. | Summer 2022
Mentored by Sean Konz
SoilingNet: Convolutional Neural Networks for Analysis of Soiling Defects in Photovoltaic Panels

In our study, we present a two-step, fully supervised deep learning approach for analyzing solar panel soiling on a per-panel level.
Thomas G. | Fall 2022
Mentored by Ivan Felipe Rodriguez
Massachusetts Science and Engineering Fair (MSEF)
Applications of AI in Microfinance

I hope to explore how to use AI in the field of microfinance to help reduce income inequality. Microfinance has greatly helped decrease rural poverty rates in Bangladesh. The leader of this effort won the Nobel Peace Prize for his work. These micro-loans give opportunity to those who are not otherwise able to obtain financing for their entrepreneurial ideas. This can be applied in the U.S. too, in places where the population cannot otherwise obtain loans to start small businesses and climb their way out of poverty. It can be very difficult to start from rock bottom in the US, especially for those without access to resources. If people could obtain small loans and start small businesses, they could work their way out of poverty. The microloans have to be viable for banks as well. The problem I’d like to explore is whether AI/ML can be used to determine how to deploy microloans efficiently to address income inequality in the U.S.
Alex M. | Summer 2022
Mentored by Odysseas Drosis
Journal of Student Research
Predicting Running Injuries with Machine Learning Models

Is it possible to predict running injuries with only a dataset and machine learning models? This paper explores this question by using classification models, including the Logistic Regression model and the Random Forest Classifier model.
Elgin V. | Summer 2022
Mentored by Joseph Vincent
Is GPT-3 smarter than a sixth-grader?

Question answering (QA) and large language models (LLMs) have been a major research focus in Artificial Intelligence for several years. In 2017, a task called Textbook Question Answering (TQA) was introduced. The task included lessons from a middle school science textbook consisting of texts, diagrams, and natural questions. Many people attempted to create question-answering models but reported sub-par accuracies.
Anitej S. | Summer 2022
Mentored by Eric Bradford
Journal of Student Research
Evaluating Machine Learning Models on Predicting Change in Enzyme Thermostability

Our research problem is finding the best machine learning model to predict the change in enzyme thermostability after a single point mutation in the amino acid sequence.
Avnith V. | Fall 2022
Mentored by Jacklyn Luu
A new style of teaching: Exploring the benefits of visual language learning

This project was designed to investigate how efficiently and how likely people are to learn a new language by means of an AI that translates objects in day-to-day life into the language the user wants to learn.
Yuna S. | Summer 2022
Mentored by Shreya Parchure
Analysis of Trending YouTube Videos: Finding Patterns in Viral Content

As the digital world continues to grow, content creators frequently have trouble building a community and producing videos that will interest their audience. Because creators look to the internet for both recreational and monetary reasons, finding techniques to build a community is important. This paper analyzes video performance, revealing the patterns that make a video successful and viral. By training different models and testing different datasets, we were able to find a correlation between a video’s content and its chances of becoming popular. Using the most accurate model, the Random Forest model, content creators can see whether or not they are likely to do well based on patterns found in trending videos.
Vincent P. | Summer 2022
Mentored by Amanda Wang
Using Linear Regression to Detect the Binding Efficiency of Ligands for Effective p53-MDM2 Inhibition

This research project targets the interaction between the MDM2 and p53 proteins to find out the most efficient ligands, or small molecules, that can bind to MDM2 and prevent the inhibition of p53 so as to stimulate the opportunity for p53 to signal for cell repair/death.
Hoshita U. | Summer 2022
Mentored by Ayush Pandit
Allez Go: AI Fencing Referee

Technology in fencing is generally an underdeveloped field and automated referees present potentially significant benefits to the sport. Automated referees will offer a more consistent call compared to a group of human referees with slightly different interpretations of the fencing rules.
Jason M. | Spring 2022
Mentored by Anna Orosz
DeepSolar Bangladesh: A Novel Convolutional Neural Network (CNN) Architecture for the Detection of Solar Panels from Low Resolution Satellite Imagery in Developing Countries

Due to its environmental benefits and decreasing costs, the supply of solar energy is growing at an accelerating pace globally. However, the decentralised nature of solar makes it difficult to keep track of the different photovoltaic (PV) systems deployed across a country. There is a critical need for highly accurate, comprehensive national databases of solar systems, which would allow policymakers, researchers, and the government to study socioeconomic trends in solar deployment. Manual surveys have been shown to be inaccurate. The 2018 DeepSolar study by Yang et al. developed a deep-learning framework and national solar deployment database for the US using high-quality satellite imagery, which proved to be a much more efficient and accurate approach. However, satellite imagery in developing countries such as Bangladesh is of much lower resolution and quality, and performed poorly with the original DeepSolar model by Yang et al. Our study highlights the implementation of a novel convolutional neural network (CNN) in detecting solar panels through low resolution Google Static Maps API satellite imagery data.
Khondoker F. | Summer 2022
Mentored by Barbie Duckworth
Synopsys Science Fair
AI-Based Image Classification Used to Accurately Distinguish Recyclable Material Versus Non-Recyclable Material

One cause of this improper disposal of materials is that it can be difficult to tell if a material is able to be recycled. In response, I created a machine learning model that can distinguish recyclable materials from trash through image classification.
Katarina A. | Summer 2022
Mentored by Ayush Pandit
Molecular Analysis of Stellar Clusters Final Paper

By testing a multitude of models from the Python package scikit-learn on the APOGEE dataset, we produced a model that can predict a star’s iron concentration from its surface gravity (LOGG) or effective temperature (TEFF) value.
James W. | Summer 2022
Mentored by Aidan Donaghey
Building an Optimized algorithm that provides summaries of legal documents

The legal industry is built around documents as they provide evidence and reduce doubt in the court. Due to the large volume of documentation in the legal industry, the processing and summarization of these documents is important to a number of individuals. We were able to create a user interface that allows for the input of documents and makes use of the algorithm we created to output a summary of the document which can be copied by the user.
Aman B. | Summer 2022
Mentored by Eric Bradford
Approaches to fraud detection on credit card transactions using artificial intelligence methods

In this paper, we study the problem of detecting fraudulent credit card transactions. We select the most relevant features using a heuristic approach, and fit three different model classes to a simulated dataset: Logistic Regression, Random Forests and Gradient Boosting Machines. We find that hyperparameter tuning has a big impact on the precision and recall of our classifiers. We also find that of the three classes, Gradient Boosting Machines were the best-performing model class, achieving 83% precision and 64% recall on unseen data.
Aryaman R. | Summer 2022
Mentored by Yuan Lee
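As a brief illustration of the two metrics quoted in the abstract above, the sketch below computes precision and recall from confusion-matrix counts. The function name and the counts are hypothetical, chosen only so the output lands near the reported 83% and 64%; they are not taken from the paper.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from confusion-matrix counts.

    precision = TP / (TP + FP): share of flagged transactions that are fraud.
    recall    = TP / (TP + FN): share of fraud cases that were flagged.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for illustration only (not data from the paper).
tp, fp, fn = 64, 13, 36
precision, recall = precision_recall(tp, fp, fn)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these illustrative counts, most flagged transactions are truly fraudulent, while roughly a third of fraud cases are missed, which is the kind of trade-off that tuning tends to shift.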
AI Detection of Emotions through EEG

We proceeded to develop a recurrent neural network (RNN) model structure that could analyze an individual’s brain waves and predict their emotional condition.
Srinivas S. | Summer 2022
Mentored by Nima Leclerc
Curieaux Academic Journal
Impact of Class Weights and Feature Importance in Automated Stroke Detection

In this research paper, we do a parametric study of class weighting as a way to tackle imbalance during training. We then infer the most important features that should be taken into consideration for stroke prediction.
Avyukth H. | Summer 2022
Mentored by Maxime Bassenne
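For readers unfamiliar with class weighting, the sketch below shows one common way to derive weights for an imbalanced dataset: the inverse-frequency ("balanced") heuristic, where each class gets n_samples / (n_classes * class_count). This is an assumption for illustration; the abstract above does not say which weighting scheme the authors studied, and the labels are made up.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def balanced_class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Inverse-frequency weights: n_samples / (n_classes * count_of_class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 95 non-stroke (0) and 5 stroke (1) cases.
labels = [0] * 95 + [1] * 5
weights = balanced_class_weights(labels)
print(weights)
```

The rare class receives a much larger weight, so each of its errors contributes more to the training loss than an error on the common class.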
Sign Language Recognition In Deep Learning: A Comparative Study of Custom CNN Model and Pre-Trained Architectures

The goal of this research is to successfully train different Convolutional Neural Network (CNN) models to identify sign language images, compare the performance of each model, and figure out the best model for image recognition and classification.
Vanessa H. | Summer 2023
Mentored by Matan Gans
Prediction of Nitrogen Dioxide Level using Machine Learning Models

As factories and vehicles release more pollution into the atmosphere, acid rain forms, constantly damaging the environment and harming wildlife below and above the ocean. If the occurrence of acid rain can't be limited, this man-made disaster will threaten the health and safety of billions. By predicting the potential development of acid rain from levels of nitrogen dioxide (NO2), one of its major contributors, we could help prevent it by limiting the amount of specific gases produced.
Audrey W. | Summer 2022
Mentored by Matthew Radzihovsky
Dental Segmentation of The Mandible Using AI

Dental segmentation can help with decision-support issues in medical diagnosis, such as human identification and maxillofacial surgery, as well as orthodontic therapy, implant planning, and other issues. This leads to my research question: “How do I automate the task required to diagnose a dental problem during a dental visit?”
Khushi G. | Fall 2022
Mentored by Joseph Vincent
SaShiMi: Adapted for Google Colab

In this project, I convert SaShiMi, a music generation software, into something more resource-friendly.
Leo R. | Spring 2022
Mentored by Roger Jin
medRxiv Medical Ethics preprint
Differences in predicted rates of vaginal births after cesarean across racial groups in a ‘race-neutral’ model

A large body of work in machine learning has highlighted that supposedly de-biased systems often re-code sensitive variables like race in terms of proxy variables. In order to determine if this was the case in this calculator, we replicated their formula, then found base-rate statistics of all the input variables for three different racial groups: Black, White, and Asian.
Anjali S. | Summer 2022
Mentored by Katie O'Nell
Modeling The Impact of Electric Vehicle Adoption on NO2 Levels Using Machine Learning: A Predictive Analysis

Air pollution, notably nitrogen dioxide (NO2), poses severe health and environmental risks. The research question explores whether increasing electric vehicle (EV) adoption can visibly reduce NO2 levels.
Arnav G. | Summer 2023
Mentored by
Journal of Student Research
Diversified AI Techniques for Augmenting Brain Tumor Diagnosis

This research explores the application of AI technology to expedite the diagnosis of brain tumors.
Dhruv M. | Fall 2022
Mentored by Odysseas Drosis
Facial Intoxication Detector with AI

The purpose of this study is to analyze whether Artificial Intelligence can be used to more effectively detect and prevent drunk driving and whether machine learning models such as logistic regression and decision trees can be more accurate than a human police officer.
Rayyaan O. | Summer 2022
Mentored by Linda Banh
Brightness helps CNN classify a subset of the images from Google Quick Draw

We made a CNN that learned to recognize and classify sketches from Google’s Quick Draw dataset and implemented a novel brightness feature to test if it increased the accuracy.
Serena F. | Summer 2022
Mentored by Clayton Greenberg
Revolutionizing Football: Using Machine Learning to Predict Future Performances for Quarterbacks

It is important to have a reliable application that can help betting and fantasy football users decide which players to bet on or choose for their fantasy teams. My motivation behind this project was to create something that could further betting and fantasy football, and even increase traction.
Anya N. | Winter 2023
Mentored by Eric Bradford
Using Machine Learning to Classify Stars, Quasars, and Galaxies

In this project, we looked at data from stars, quasars, and galaxies from the sixteenth data release of the Sloan Digital Sky Survey Telescope. The aim of the project was to accurately and quickly classify these three types of objects using machine learning. We used three machine learning algorithms, namely logistic regression, multi-layer perceptron, and decision tree classifier.
Chinmay R. | Summer 2022
Mentored by Amanda Wang
Cardiac Auscultation: Metrics of Smartphones and Digital Stethoscopes

The objective of our research is to determine how feasible and accurate a mobile device solution to cardiac auscultation is compared to a digital stethoscope.
Nicholas T. | Summer 2022
Mentored by Sophia Barton
Santa Clara ISEF Qualifier
Combating Climate Fake News Using NLP

As fake news becomes more prevalent across the US, important issues become harder to solve. One such issue is climate change, where climate misinformation has worsened viewers’ ability to distinguish between fake information and real information. This project’s objective is to tackle climate misinformation using an artificial intelligence model.
Rayyan M. | Summer 2022
Mentored by Philip Bell
AI Models Necessary to Reduce Racism in Criminal Justice In The United States of America

The goal was to ensure the AI model is accurate and free of skewed/biased data. Throughout this research project, three different regression models are looked at in order to narrow down the most accurate models.
Yudhiishbala V. | Summer 2022
Mentored by Ana Sofia Muñoz Valadez
Can A Person’s MBTI Type Be Determined By A Sample Of Their Writing?

This project aims to lessen the reliance on self-reporting for personality typology through an artificial intelligence algorithm that can type people as one of the 16 MBTI types using an unedited writing sample by that person.
Parinita K. | Summer 2022
Mentored by Philip Bell
Predicting Dropouts Using Machine Learning Models

High dropout rates in high schools and universities have become a major complication in many countries following the upsurge of the COVID-19 pandemic. With the accuracy provided, this model should be held in high regard as it serves as an efficient method for predicting possible university dropouts.
Viksar D. | Summer 2022
Mentored by Eric Bradford
Stroke Susceptibility Prediction Using Artificial Neural Networks

Our research question is centered around predicting stroke susceptibility and determining which demographic and medical factors significantly increase or decrease that risk. Our results indicate the potential of AI for determining stroke risk based solely on collected medical and lifestyle data.
Vinati P. | Summer 2022
Mentored by Samuel Kwong
Journal of High School Research
Diagnosing Brain Tumors from MRI Images Using Deep Transfer Learning

This study aims to utilize a transfer learning method in which the prior knowledge of a pretrained model is used to aid in a new classification problem.
Armita K. | Fall 2022
Mentored by Oscar O'Rahilly
Optimizing Stroke Detection with Machine Learning Techniques

This paper addresses the critical question of how various medical factors influence vulnerability to stroke in patients, aiming to develop an effective machine-learning model for stroke detection.
Neil S. | Summer 2023
Mentored by Ye Wang
Predicting Repeat Purchases in E-Commerce

The underlying problem here is identifying more efficient upselling strategies for retailer companies. These findings may potentially help a retail company determine the likelihood of a customer purchasing another item based on their first purchase, and then decide how to upsell based on that information.
Ayrton S. | Summer 2022
Mentored by Bryce Johnson
Application of AI to Tennis Match Footage Transcription

One of the best ways for tennis players to improve their game is to record and watch their own match footage, find patterns in the points they win and lose, and practice based on these realizations. However, watching match footage and documenting each point shot by shot is a very time-consuming process. This paper investigates an AI approach to transcribing tennis match footage, combining a deep convolution neural network (YOLOv4), a pose estimation model (Movenet), and a long short-term memory (LSTM) deep neural network. Looking at a transcript of each point will be far more efficient than watching entire match footage for a player to understand how they are losing and winning and analyze patterns in their game. The LSTM model in this project achieved accuracies of 73.33% and 79.31% when classifying shot type (forehand, forehand volley, forehand slice, backhand, backhand volley, backhand slice, overhead/smash, and serve) for players on the close side and opposite side of the net, respectively, and 55.17% and 60.00% when classifying the direction of a shot (cross-court, down the line, down the middle, inside in, inside out, out wide, down the t, and body) for players on the close side and opposite side of the net, respectively.
Marco Y. | Summer 2022
Mentored by Eric Bradford
Height Prediction Using Basic Data

I used basic data sets to see if some learning models have a chance of predicting height based on country and age.
Daniel S. | Summer 2023
Mentored by Jonathan Delgadillo Lorenzo
Comparing Machine Learning Models to Determine Which is Most Effective at Detecting Brain Tumors

By using machine learning techniques to analyze various brain tumor scans, the goal was to determine which techniques are the most efficient and accurate in determining the presence of brain tumors.
Samya C. | Summer 2022
Mentored by Shreya Parchure
Predicting NH4 Levels for Corn Crop in Wisc

Ammonium (NH4), an organic matter that accumulates in the top portion of soil, can pose a serious risk to biodiversity. Using machine learning to construct regression models, NH4 levels can be predicted and therefore mitigated. In this paper we used linear, ridge, and lasso regressions. Through the evaluation of crop farming factors that contribute to the NH4 levels, it was concluded that NO3 and N2O have the most direct correlation to NH4. These factors yielded the best accuracy for regression models with the best performing model being a multiple feature linear regression which resulted in 60% accuracy. While certain measures did improve the model’s performance, outliers continuously worsened the results.
Julia S. | Summer 2022
Mentored by Barbie Duckworth
The Impacts of Child-Mentor Relationships on Child Mental Health

In this paper, we use a dataset from the Substance Abuse and Mental Health Data Archive to analyze the impacts of child-mentor relationships on child mental health. Previous studies have come to similar conclusions: positive parent-student and teacher-student relationships often lead to signs of positive mental health in children. To expand upon these past findings, we used different measures of mental health and relationship qualities to create a predictive model. We conducted classification and coefficient weight analyses to see how strong of an impact different variables representing indicators of healthy relationships had on different aspects of child mental health. This information was used to create predictions of future cases. Like previous studies, we found that there was a general pattern that showed that good child-mentor relationships, defined specifically by the frequency of praise and fights, had an overall positive impact on child mental health, specifically when it came to symptoms of depression. Furthermore, parents seemed to have stronger impacts on child mental health than teachers did. Limitations include possibly biased respondents who may not be representative of the greater population, as well as the specificity of the variables that were chosen. Going forward, further steps to analyze different datasets and deepen the scope of the research will be helpful in finding patterns and developing more detailed conclusions.
Samantha K. | Summer 2022
Mentored by Akshay Jagadeesh
One Class Classification for Overdose Death Detection

In the past year alone there were an estimated 107,622 drug overdose deaths in the United States. With so many deaths from this cause alone, I thought it would be beneficial to create a model to predict who is at risk of a drug overdose death.
Ram N. | Spring 2022
Mentored by Eric Bradford
Identifying EOL software

Throughout our project, we have been exploring ways to identify EOL and malicious websites using Python. By creating a new variable from the keywords, we were able to correctly predict 99% of the EOL websites using the data given to us.
Neal D. | Summer 2022
Mentored by Clayton Greenberg
2nd Place at Orange County Science Fair, California Science and Engineering Fair (CSEF)
The Differentiation of Viral and Bacterial Pneumonia using Deep Learning

This project aims to find out whether a Convolutional Neural Network can be used to classify X-ray scans as showing either bacterial or viral pneumonia.
Arnav D. | Fall 2022
Mentored by Nic Thibodeaux
Using Machine Learning Algorithms to Predict Property Prices

I decided to investigate the connection between AI and real estate, and how machine learning algorithms can be effective in determining property prices, analyzing a dataset on Airbnb listings in New York City from 2016.
Armon S. | Summer 2022
Mentored by Bryce Johnson
AI in Recycling

By using a machine learning algorithm, we have been able to create a tool that can detect what type of material an item is and determine whether it is recyclable or not.
Nicholas K. | Summer 2022
Mentored by Shreyas Muralidharan
Fake News Detection with BERT

In this paper, we propose a fake news detection model using a fine-tuned BERT (Bidirectional Encoder Representations from Transformers) model.
David S. | Summer 2022
Mentored by Roger Jin
Brain Cancer Detection

Current methods for determining the presence and type of brain tumor in a given patient’s MRI scan can oftentimes be inefficient and are prone to error. By using a machine learning algorithm, the error in these classifications is reduced significantly, and the process is made much more efficient.
Rohan T. | Fall 2022
Mentored by Nic Thibodeaux
Using EEG to Detect Eye Movement

In this paper, we show that it is possible to use EEG data to detect eye movement using machine learning. By recognizing eye movement through EEG results, our goal is to help individuals with disabilities better control object movement and perform daily activities independently.
Aishwaroopa N. | Summer 2022
Mentored by Tomer Arnon
Sketch Recognition using Artificial Intelligence

This paper is about an AI project that helps people learn about animals.
Joseph N. | Fall 2022
Mentored by Oscar O'Rahilly
Journal of Emerging Investigators
Comparison of Different Approaches for Stock Price Prediction

To prove the hypothesis, stock prices of Tesla, Apple, and Papa Johns during the past five years were used to train LR and NN models to predict a given stock price.
David A. | Fall 2022
Mentored by Odysseas Drosis
Findings Provided by a Machine Learning Clustering Model of Stellar Kinematics in Star Clusters Can be Used to Identify Runaway Stars

The need for developing a machine learning model that can detect these runaway stars is established in this research as it attempts to expand the domain of astrophysics to help astrophysicists understand tidal tail formation and the theoretical dark matter subhalos’ effects on celestial bodies.
Roneet D. | Summer 2022
Mentored by Tony Rodriguez
Predicting Precipitation and Other Weather Conditions With Logistic Regression and Random Forest Classifiers

An issue we have with our current algorithms is the inability to predict weather accurately for any time range further than ten days. Our intentions with this project were to create a model that could get an accuracy within 5% of our current weather prediction algorithms.
Vadim Y. | Fall 2022
Mentored by Darnell Granberry
Standardized Testing: Is It Really Effective?

How effective is standardized testing in assessing students' knowledge? With data from previous students on SAT scores and GPAs from college and high school, I can fit different models to determine how well these SAT scores predict GPA.
Rohan S. | Summer 2022
Mentored by Christina Cheng
Predicting Future Phonological Changes Of Mandarin Chinese

We created a system for translating IPA representations into vectors that capture characteristics of phonemes such as their articulatory location, and experimented with several machine learning models to capture existing trends from Old and Middle to Modern Chinese (Mandarin).
Peijie G. | Summer 2022
Mentored by Kush Khosla
Fake news detection, methods and data processing

In this project we studied several models to discern fake news articles from real news articles to find the best method.
Soham P. | Summer 2023
Mentored by Erick Ruiz
Classification of Exon and Intron Boundaries

The goal was to accurately classify exon and intron boundaries based on DNA sequences. Scientists can learn more about proteins if divisions between exons and introns are clear. We used multiple machine learning approaches, including a logistic regression model, a multilayer perceptron, an LSTM, and a model that combined an LSTM, an autoencoder, and a multilayer perceptron. LSTMs performed well, pointing to the idea that the order of nucleotides is important when classifying DNA sequences.
Milo B. | Summer 2022
Mentored by Samuel Kwong
Using Machine Learning to identify the Habitability of Exoplanets from TESS Transit Data

With so many new tools and methods to detect exoplanets, it is time to start determining which are deserving of more research in the hunt for other habitable planets.
Eve B. | Summer 2022
Mentored by Tomer Arnon
Predicting Drug-Drug Interaction Severity Using Network Characteristics

Drug-drug interactions (DDIs), which can add to or diminish the effect of one drug or impact the metabolism of one drug, can harm the health of patients who take multiple drugs. Testing for DDIs is slow and costly, so computational models have recently been used to predict them. Network information is useful in describing a drug’s known interactions and mechanisms to determine whether two drugs could be interacting. This research explores various machine learning models to predict the severity (minor, moderate, or major) of unknown interactions between existing drugs, using the DDInter database.
Deetya N. | Summer 2022
Mentored by Linda Banh
What Factors Correlate with the Relationship Between Gender and Race and Pursuing STEM?

In this paper, I sought to answer the question, “How do races and genders differ in the way they pursue STEM, and what factors correlate with these differences?” Identifying which factors cause this lack of representation is the most important step in fixing the diversity problem.
Saket R. | Summer 2022
Mentored by Bradley Yam
Insect Identification Project for Agricultural Advancement

Using image data, we developed an artificial intelligence system that can predict an insect’s species from a photo of the insect.
Archith S. | Summer 2022
Mentored by Barbie Duckworth
What Data is Needed to Accurately Determine Someone's Mental Health?

If we determine which data is actually necessary, we can build better user trust while maintaining the efficacy of a chatbot.
Claire L. | Summer 2022
Mentored by Katie O'Nell
Predicting Solar Array Output Using Weather Sensor Data

With the recent push for renewable energy sources, solar energy is one that is readily available. This paper will explore how to predict the power output of a solar array based on weather data, collected from sensors throughout each day.
Maximilian P. | Summer 2022
Mentored by
Plant Toxicity Classification by Image

Since differentiating between dangerous and safe plants is a complex task for a human brain, this study approaches the issue through machine learning models starting with a convolutional neural network (CNN) and discovering that a logistic regression model—trained on a dataset with manually designed features—has the best performance with the particular dataset used.
Eera B. | Summer 2022
Mentored by Clayton Greenberg
Application of Machine Learning to Distinguish Premature Leukemia Cells from Healthy Blood Cells

Convolutional neural networks (CNNs) created to solve this problem have achieved classification validation accuracy rates as high as 96.15%.
Noor E. | Summer 2022
Mentored by Yiran Li
World Academy of Science, Engineering and Technology (WASET)
Stock Prediction Project

As of 2022, 60%-73% of the trading volume in the U.S. stock market is executed by algorithms, illustrating the dependency on and importance of these types of algorithms in stock predictions. Additionally, a recent survey by Gallup shows that 145 million Americans (roughly 56% of the population) invest in and own stocks, showing the population’s keen interest in the stock market. I endeavored to create a model that takes into account opening, closing, high, and low prices from the last 3 days to predict the opening price on the fourth day. Of course, the model is not limited to approximating the opening price on the following day; for example, it will also be able to predict the opening price 5 days later.
Sofia S. | Summer 2022
Mentored by Odysseas Drosis
Attention LSTMs in Multimodal Models: A holistic approach to predicting COVID infection trends

In our search for a way to simultaneously predict all state-level COVID infection rates in the United States with COVID heat maps and domestic flight graphs, we propose novel methods of processing graph and image sequences with attention-based LSTM layers and evaluate the effectiveness of different multimodal fusion techniques.
Nuo W. | Spring 2022
Mentored by Eric Bradford
Predicting emotion ratings from color statistics of images

We utilized this data to see how visual elements affect our everyday lives by engineering features of images and running them through a variety of neural networks.
Claire M. | Summer 2022
Mentored by Clayton Greenberg
Is AI Necessary In Deciding Whether an Offender Is Likely To Recidivate, With & Without the Effect of Protected Characteristics

Our research addresses whether an AI algorithm can be developed to forecast whether an offender will recidivate once protected characteristics are removed from the models, judged both by accuracy, precision, and recall and by how equal those metrics are across different groups, as well as whether such a tool is necessary.
Thalia S. | Summer 2022
Mentored by
Brain Tumor Classification

Early classification and diagnosis of brain tumors are essential for providing the right treatment to a patient. It is crucial to get treatment as soon as possible because the survival time for someone with an untreated brain tumor can range from as low as 3 months to as high as 5 years. In this project, we classified brain tumor images into 4 categories: glioma, meningioma, pituitary, and no tumor. We compared baseline and deep learning models, and the deep learning models demonstrated significantly higher performance due to their ability to analyze images. The model with the highest accuracy was MobileNet, a pre-trained transfer learning model trained on 5,608 images. This model yielded a validation accuracy of 98.24%. Using metrics including Cohen’s kappa score, precision, and recall, we validated the machine learning model's performance. We deployed the MobileNet model to a web app using Streamlit, where users submit MRI images and receive diagnoses of tumor class. We found that the model performed very well in the web app, indicating that it is safe to be used. However, since we only have 4 classes and there are over 150 total types of brain tumors, the model could easily get a diagnosis wrong if the tumor is not in one of these 4 classes.
Rohan S. | Summer 2022
Mentored by Sriram Hathwar
Biased News Detection Using Artificial Intelligence

Biased news has become especially problematic in modern times. In this paper, we investigate the abilities of various machine learning models to correctly identify biased news. Our results indicate that artificial intelligence is very capable of this detection. In particular, both logistic regression and the K-nearest neighbor classifier performed rather well in cross-validation.
Arianna H. | Summer 2022
Mentored by Matteo Santamaria
The Motions of Forest Fires

Can I make an AI model that can predict how forest fires will move? I am trying to take data about the environment and a current fire and predict where the fire will go and whether it will grow or shrink.
Giuliano T. | Summer 2022
Mentored by Sophia Barton
Skin Cancer Detection

The goal of this research project is to predict whether or not a patient has skin cancer through a machine learning model that is developed from an image dataset. Skin cancer is extremely dangerous, as over 9,500 people in the US are diagnosed with it daily. If it is detected early, patients have a better chance of survival. I tested an MLP Classifier, a Decision Tree Regressor, a Logistic Regression Model, and a KNN Model to compare results and determine which had the best accuracy. On the test set, the MLP Classifier had a 74.5% accuracy, the Decision Tree Regressor had a 74.1% accuracy, the Logistic Regression Model had a 68.8% accuracy, and the KNN Model had a 74.6% accuracy. The MLP Classifier, Decision Tree Regressor, and KNN Model had around the same accuracy and outperformed the Logistic Regression Model. However, comparison with the training results suggests a large overfitting problem with most of the models.
Jaida G. | Summer 2022
Mentored by Odysseas Drosis
Journal of Student Research
Stellar Classification based on Numerous Characteristics using Machine Learning

The task of stellar classification can be tedious and lengthy when done manually. One can expedite stellar classification by creating an artificial intelligence model to automate the process. The current stellar classification model serves to effectively categorize stars for research purposes regarding their distribution around the universe, so automating the development of this resource would allow professionals to allocate more time to explore the bounds of our current understanding of space and the universe. After finding and analyzing a dataset containing numerical and categorical features, a supervised learning approach was then used to train and test different models on their ability to classify the stars in the given test set. A Decision Tree Classifier, Random Forest Classifier, Ridge Classifier, and Support Vector Classifier were trained and tested using the data.
Roberto T. | Fall 2022
Mentored by Sophia Barton
Machine Learning Algorithms for Consumer Plastics Identification and Sorting

Seeking a solution to maintain plastic recycling costs while increasing the output of reusable material, this paper poses the question “What type of machine learning algorithm is most suitable for a plastic identification system in consumers’ homes?” It investigates five machine learning algorithms to determine which one best balances accuracy, management of resources, and time efficiency, ultimately concluding that a Support Vector Machine with a polynomial kernel is the best algorithm. It also demonstrates that algorithms such as the ones analyzed have become advanced enough for more efficient, AI-driven systems to replace those of the status quo.
Alexander K. | Summer 2022
Mentored by Eric Bradford
A Comprehensive Study of Machine Learning Models for the Prediction of pH in Fairfax County

Water quality is drastically declining as bodies of water become more and more acidic due to global warming. Because of this, it is crucial to understand what factors determine pH, and how to predict pH values. In this study, we determine which features are most important in determining pH, and also what combination of features and machine learning models give us the best accuracy in the prediction of pH values. We demonstrate that pH values from the previous year and the percentage of rehabilitated sanitary sewer lines are the most important features in determining pH. We also demonstrate that Decision Tree Models and Random Forest Models perform the best across all features and feature subsets.
Matthew N. | Spring 2023
Mentored by Sarthak Kanodia
Predicting the ask price in options using different machine learning methods

The goal of this project is to determine how to predict important aspects of options, including ask price. We want to compare different machine learning models to learn the best model and the best hyperparameters for that model for this purpose and dataset.
Krishang S. | Summer 2022
Mentored by Matan Gans
Most Important Soil Properties to Consider for High Crop Yield

With Earth’s population growing rapidly, there is a real risk in the near future of not having enough food to feed the planet. One of the best ways to solve this upcoming problem is to improve the quality of the soil because that will help increase Earth’s overall crop yield in a sustainable way.
Arpan A. | Summer 2022
Mentored by Mirna Kheir Gouda
The Effect of the News on the Stock Market

I remember opening my investing account one day and looking at my stocks, noticing that the majority of them were declining pretty severely. Ever since the war between Russia and Ukraine began, this had been the case. I saw many news headlines talking about the market and how it was affected by this war. It was then I realized, why isn’t there a way for us to predict these changes in the market given sufficient news headlines?
Pranav B. | Summer 2022
Mentored by Eric Bradford
An Investigation Into Applications of Machine Learning Algorithms on Solar Flare Data and Distance Prediction

The current method that NASA uses for flare prediction involves studying various solar cycles that range from 11 days to 80 years. However, there are far too many factors to consider using this method of prediction and the forecasts are often wrong.
Isaac A. | Summer 2022
Mentored by Aidan Donaghey
Journal of Emerging Investigators
The Utilization of Artificial Intelligence in Enabling the Early Detection of Brain Tumors

This research aims to investigate the application of machine learning to enhance diagnosis.
Shanzeh H. | Summer 2022
Mentored by Odysseas Drosis
Does Physiology Really Help a Volleyball Player Succeed?

As a child, I always went to my sister’s volleyball games. From club games to school games, I went to them all. Having watched so many games, I kept noticing the same trend in every team.
Justin H. | Summer 2022
Mentored by Christina Cheng
The Impact of the Covid-19 Pandemic on the Test Scores of Various Demographics

The research problem that this paper attempts to address is: how did Covid affect the test scores of different demographics of students in various schools in New York? The approach used statistical analysis to examine how factors such as being economically disadvantaged or being Hispanic correlated with student proficiency scores on New York State Regents exams in 2018 and 2021 in different schools.
Deniz G. | Summer 2022
Mentored by Kush Khosla
Training a Machine Learning Model to Recognize Signs of Depression

Many people have not had a professional diagnosis and could have depression without being aware of it. To address this, I used a person’s daily habits and how they change to attempt to predict whether they will experience a depressive episode in the future.
Bwohan W. | Spring 2022
Mentored by Barbie Duckworth
Predicting Recidivism in the United States Criminal Justice System

There are real-life effects of machine bias in courtrooms. American lives are completely changed based on criminal sentencing, which can mean the difference between rehabilitation and recidivism. This issue is urgent and finding a solution to ensure equality is imperative.
Alexander F. | Summer 2022
Mentored by
Gamma Ray Classification

This research paper is in this domain of physics, specifically using artificial intelligence (AI) to classify high-energy gamma particles as background or signal. I created an AI model that classifies these gamma particles with high accuracy, constructing it from a wide range of data in the MAGIC Gamma Telescope Dataset, which consists of key features.
Eesha S. | Summer 2022
Mentored by Victoria Lloyd
Stock Price Prediction Project

Can an algorithm be created to accurately predict the stock price of any company on a given day? Within my project, I'm trying to write code that can accurately predict any stock's price on a given day.
Jackson D. | Summer 2022
Mentored by Odysseas Drosis
Predicting Skin Cancer using Machine Learning

Skin cancer affects an increasing number of people, which is the main motivation for my research. The goal is to develop a model that is accurate enough for real-world application and that could be distributed to more rural areas, which have decreased access to medical technology.
Keithan P. | Spring 2022
Mentored by Eric Bradford
Using Logistic Regression in the Early Detection of Tsunamis caused by Earthquakes

Many tsunamis have enough energy to cause a large number of human deaths, as well as trillions of dollars in damage repair. The problem is how to better prepare for these natural disasters and limit their damage, especially in countries such as Japan that are frequently hit by tsunamis.
Ian C. | Summer 2022
Mentored by
Optimizing Skin Cancer Classifiers By Applying Multiplicative Weight Update Into A Mobile Application

Skin cancer is slowly becoming the most common type of cancer in the world. In the United States alone, research suggests that skin cancer is the most common cancer and that one in every five people will acquire skin cancer at some point in their lifetime.
Animish J. | Summer 2022
Mentored by Odysseas Drosis
Journal of Student Research
A Hybrid CNN-LSTM Model For Predicting Solar Cycle 25

The goal of this study is to predict Solar Cycle 25 through a deep learning approach, and to determine what parameters affect prediction accuracy and what number of historical solar cycles is optimal for reliably and accurately predicting the upcoming solar cycle. The solar cycle predictions will help us prepare ahead of time for future solar activity.
Alice H. | Summer 2022
Mentored by Tony Rodriguez
Using Machine Learning Architecture to Detect Brain Tumors

Not all tumors in the brain are cancerous, so irregular cell growths are often deemed benign and ignored completely. However, many of these low-danger cell growths can develop into higher-danger cancerous tumors. Identifying these low-danger tumors early is key to preventing cancer in the future.
Mike S. | Summer 2022
Mentored by Matthew Radzihovsky
Stock Price Prediction

The most advanced stock prediction models take into account fundamental and technical analysis. Fundamental analysis involves analyzing the stock’s intrinsic value, financial statements, tangible assets, management effectiveness, consumer behavior, and overall company outlook.
Harrison S. | Summer 2022
Mentored by Odysseas Drosis
Detecting Facial Expressions

This paper shows that machine learning algorithms can be used to classify emotions into different categories. The most promising results were from the K-nearest neighbor model, with 61.5% accuracy on the training set and 32.1% accuracy on the testing set.
Bertran M. | Summer 2022
Mentored by Odysseas Drosis
Using Machine Learning Models to Determine Medical Features Most Indicative of Gestational Diabetes

With the recent overturning of Roe v. Wade, pregnancies have become especially dangerous, since many women who previously could have obtained an abortion no longer can. Through this research project, I can help determine which medical factors may lead to gestational diabetes, and hopefully the results can then be used to counteract those factors.
Fiona B. | Summer 2022
Mentored by
Identifying Cancer Types in Microscope Images of Lung and Colon Cells

Lung adenocarcinoma makes up a large proportion of lung cancer cases, and colon cancer is one of the most prevalent cancers in the United States. Although many different medicines have been recently developed to attack these cancers, the most effective way to stop them is early detection.
Adam S. | Summer 2022
Mentored by Odysseas Drosis
2nd Place at San Diego BROADCOM Science Fair (Senior Division)
A Machine Learning Approach to Understanding the Determining Factors of the Gender Wage Gap

By studying the effect of different attributes on the gender wage gap, we can better understand both the scale of this issue and its possible solutions. So, we explore the question: how does a worker’s marital status, along with other variables, impact the gap in hourly wage between male and female workers? We seek to create a model able to predict the gender wage gap given a set of variables: age, years of education, race, state, and marital status.
Sophia G. | Summer 2022
Mentored by
Developing an Accurate AI Algorithm for Histopathologic Cancer Detection

In this research project, we focus on the lymph node scans of women with breast cancer, which is the most common cancer for women residing in the US, other than skin cancer. Research statistics show that about 1 in 8 women in the United States will develop invasive breast cancer during their lifetime.
Leah N. | Summer 2022
Mentored by Odysseas Drosis
Identifying Parameters in Water Potability Analysis Through Machine Learning

Predicting whether water is potable can help people who rely on bodies of water and redirect them to safer options. It will also be beneficial to apply the algorithm to other places where it is expensive and inefficient to send people out to collect water samples. Over the past couple of decades, researchers have often cited the lack of funding as a source of error in data analysis and in the accuracy of the research.
Molly H. | Summer 2022
Mentored by Sharon Chen
Stock Price Prediction

In my project, I used two different methods, the linear model and the neural network. Both were central to my project, as they capture patterns in past values to predict future stock prices. The linear model uses the relationship between the data points to fit a straight line that follows them as closely as possible. This line can be used to predict future values.
Sai P. | Summer 2022
Mentored by Odysseas Drosis
Fake News Classification

As fake news becomes more of a problem across the US, its consequences are becoming more and more damaging. This project aims to combat this disinformation through an artificial intelligence algorithm that can classify real articles versus fake ones.
Roshni K. | Summer 2022
Mentored by Philip Bell
How Does One’s Background Determine Their Mental State?

Mental health issues have become very prevalent in recent times, and significant progress has been made in treatment through counseling, medication, and other methods. However, to truly find the root cause of many mental health issues, a correlation has to be drawn between one’s mental state and another factor, such as family history, age, or work environment. It is also beneficial to correlate one’s mental state with a multitude of factors to see their combined effect.
Sohum T. | Summer 2022
Mentored by Akshay Jagadeesh
3rd Place in the Alameda County Science Fair
A Novel Approach to Promote Equity in Skin Disease Diagnosis by AI Models

The goal is to increase the accuracy of existing models in diagnosing skin disease across various skin tones to within 10% of the accuracy obtained for fairer skin tones, which is about 95.8%.
Varsha N. | Summer 2022
Mentored by Roger Jin