
🚢 Learn Machine Learning with the Titanic
The Titanic dataset is a classic for beginners because its data are very easy to understand. 🎯
The goal: predict whether a passenger survived the disaster.
We use features like age, gender, and passenger class to make these predictions.
This project teaches you essential steps such as:
- 🧹 Data cleaning and how to handle missing values
- ✂️ Splitting the dataset into training and test sets
- 📈 Applying simple algorithms:
  - logistic regression for a binary outcome
  - decision trees that answer sequential questions
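
As a rough illustration of the steps in this list, the sketch below fills in a missing value, encodes a text column as numbers, and fits a small decision tree. The eight rows are invented for the example (they are not real Titanic records), and `max_depth=2` is an arbitrary choice to keep the printed tree short.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Made-up toy rows for illustration only; NOT real Titanic records.
toy = pd.DataFrame({
    "Pclass": [1, 1, 2, 2, 3, 3, 3, 1],
    "Sex": ["female", "male", "female", "male", "female", "male", "male", "female"],
    "Age": [29.0, 40.0, None, 35.0, 22.0, None, 19.0, 50.0],
    "Survived": [1, 0, 1, 0, 1, 0, 0, 1],
})

X = toy[["Pclass", "Sex", "Age"]].copy()
X["Sex"] = X["Sex"].map({"male": 0, "female": 1})   # text -> numbers
X["Age"] = X["Age"].fillna(X["Age"].median())       # fill missing ages
y = toy["Survived"]

# A shallow tree; with so few rows it is fitted on all of them,
# only to show what the learned questions look like.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)

# Print the tree as a series of if/else questions on the features.
rules = export_text(tree, feature_names=list(X.columns))
print(rules)
```

`export_text` prints the fitted tree as nested questions about the features, which is the "sequential questions" idea made visible. On real data you would first split into training and test sets before fitting.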
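Once the model is trained, it is evaluated with metrics like accuracy (the share of all predictions that are correct) or precision (the share of predicted survivors who actually survived).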
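
To make the two metrics concrete, here is a small sketch on made-up labels (the ten values below are hypothetical, not results from the Titanic data). It counts the outcomes by hand and compares them with scikit-learn's `accuracy_score` and `precision_score`.

```python
from sklearn.metrics import accuracy_score, precision_score

# Hypothetical labels for ten passengers (1 = survived, 0 = did not survive).
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0, 1, 1]

# Count outcomes by hand.
pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # predicted survived, did survive
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted survived, did not
correct = sum(1 for t, p in pairs if t == p)

manual_accuracy = correct / len(pairs)   # correct predictions / all predictions
manual_precision = tp / (tp + fp)        # true positives / all predicted positives

# The library functions compute the same quantities.
accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)

print("Accuracy:", accuracy, "Precision:", precision)
```

The two numbers answer different questions, so a model can score well on one and poorly on the other.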
It’s a great introduction to working with real data and learning basic evaluation techniques.
To put it into practice, you can use Python and libraries like pandas and scikit-learn. Here’s a brief example:
```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load the data (e.g. downloaded from Kaggle)
df = pd.read_csv('titanic.csv')

# Keep three features and encode Sex as a number
X = df[['Pclass', 'Sex', 'Age']].copy()
X['Sex'] = X['Sex'].map({'male': 0, 'female': 1})

# Fill missing values (here, missing ages) with the column mean
X = X.fillna(X.mean())
y = df['Survived']

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model and evaluate it on the held-out rows
model = LogisticRegression()
model.fit(X_train, y_train)
pred = model.predict(X_test)
print('Accuracy:', accuracy_score(y_test, pred))
```

This code shows the basic steps: loading, cleaning, splitting, training, and evaluation.
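
One possible refinement, sketched under assumptions: the example above fills missing ages with the mean of the whole table before splitting, so the test rows influence the fill value. Putting the imputation and the model in a scikit-learn pipeline means the fill value is learned from the training rows only. The data here is randomly generated as a stand-in for `titanic.csv` (the column names mirror the example, the values and the labeling rule are invented), and a most-frequent-class `DummyClassifier` is included as a reference point for the accuracy.

```python
import numpy as np
import pandas as pd
from sklearn.dummy import DummyClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for titanic.csv: random values, NOT real passenger data.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "Pclass": rng.integers(1, 4, size=n),
    "Sex": rng.integers(0, 2, size=n),      # already encoded: 0 = male, 1 = female
    "Age": rng.uniform(1, 80, size=n),
})
# Invented rule so there is a pattern to learn: label follows Sex, with ~10% flipped.
y = np.where(rng.random(n) < 0.1, 1 - X["Sex"], X["Sex"])
# Remove roughly 20% of the ages to mimic missing values.
X.loc[rng.random(n) < 0.2, "Age"] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The imputer learns the column means from X_train only, then the model is fitted.
model = make_pipeline(SimpleImputer(strategy="mean"), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Reference point: always predict the most frequent class seen in training.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

model_acc = model.score(X_test, y_test)
baseline_acc = baseline.score(X_test, y_test)
print("Model accuracy:", model_acc)
print("Baseline accuracy:", baseline_acc)
```

Comparing against the baseline gives the accuracy figure some context: a model is only useful if it beats always guessing the most common outcome.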
📝 Explanation in a few words
Imagine you have a table with passengers and their attributes.
The goal is for a machine to read those data and, based on patterns, tell you if someone might have survived.
It’s like teaching a friend to guess the end of a story knowing only a few clues.
🔑 The trick is to prepare the data well and then check if the “guess” works.

