Stellar Classification based on Numerous Characteristics using Machine Learning
Roberto T. | Fall 2022


Stellar classification is tedious and slow when done manually, and an artificial intelligence model can automate it. As we as a species continue to explore the frontier of the observable universe, we should seek to automate time-intensive problems like this one. The current stellar classification model categorizes stars for research on their distribution around the universe, so automating the development of this resource would free professionals to spend more time exploring the bounds of our current understanding of space. After finding and analyzing a dataset containing numerical and categorical features, a supervised learning approach was used to train different models and test their ability to classify the stars in a held-out test set. Four models were trained and tested on the data: a Decision Tree Classifier, a Random Forest Classifier, a Ridge Classifier, and a Support Vector Classifier. The most successful were the Decision Tree Classifier and the Random Forest Classifier, each scoring about 94 percent across several accuracy metrics on the test data. Despite some limitations in the availability of usable data, two of the four models proved consistently accurate. Future attempts at developing models for stellar classification should concentrate on gathering more data so that the models can be trained more thoroughly.
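
The comparison described above can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the code used in the project: the synthetic data from `make_classification`, the 75/25 split, the default hyperparameters, the feature scaling step, and the helper name `compare_classifiers` are all assumptions made for the example. The real dataset also contained categorical features, which would need encoding before models like these could use them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(X, y, seed=0):
    """Fit four classifier families on one split and score each on the held-out part.

    Hypothetical helper for illustration; split size and settings are assumptions.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    models = {
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        # Ridge and SVC are sensitive to feature scale, so they are standardized first.
        "ridge": make_pipeline(StandardScaler(), RidgeClassifier()),
        "svc": make_pipeline(StandardScaler(), SVC()),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        scores[name] = {
            "accuracy": accuracy_score(y_test, predictions),
            "macro_f1": f1_score(y_test, predictions, average="macro"),
        }
    return scores


# Synthetic stand-in for a star dataset (assumption: 8 numeric features, 3 classes).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, n_classes=3, random_state=0
)
results = compare_classifiers(X, y)
for name, metrics in results.items():
    print(f"{name}: accuracy={metrics['accuracy']:.3f}, macro_f1={metrics['macro_f1']:.3f}")
```

Scoring every model on the same held-out split with more than one metric mirrors the comparison in the abstract; the numbers printed for synthetic data say nothing about the results reported for the real star data.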

Explore More!

Published Paper
Roberto T.
Sophia Barton
Computer Science MS from Stanford

Related Projects

An Investigation Into Applications of Machine Learning Algorithms on Solar Flare Data and Distance Prediction

The current method that NASA uses for flare prediction involves studying various solar cycles that range from 11 days to 80 years. However, there are far too many factors to consider using this method of prediction and the forecasts are often wrong.
Isaac A. | Summer 2022
Mentored by Aidan Donaghey
Stock Prediction Project

As of 2022, 60%-73% of the trading volume in the U.S. stock market comes from algorithms, illustrating how much stock prediction depends on them. Additionally, a recent Gallup survey shows that 145 million Americans (roughly 56% of the population) invest in and own stocks, showing the population’s keen interest in the stock market. I endeavored to create a model that takes into account opening, closing, high, and low prices from the last 3 days to predict the opening price on the fourth day. The model is not limited to approximating the next day's opening price; for example, it can also predict the opening price 5 days later.
Sofia S. | Summer 2022
Mentored by Odysseas Drosis
Findings Provided by a Machine Learning Clustering Model of Stellar Kinematics in Star Clusters Can be Used to Identify Runaway Stars

The need for developing a machine learning model that can detect these runaway stars is established in this research as it attempts to expand the domain of astrophysics to help astrophysicists understand tidal tail formation and the theoretical dark matter subhalos’ effects on celestial bodies.
Roneet D. | Summer 2022
Mentored by Tony Rodriguez