Before/After Data Diff

Data Transformation Diff

Comparing the raw data (Raw) with the data after cleaning (Cleaned)

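Trim: surrounding whitespace removed (ID 1, Name)
Imputed (NA): missing value filled in (ID 2, Age; ID 7, Income)
Type Cast: string converted to a number (ID 5, Age "45" to 45)
Outlier Dropped: row removed for an extreme value (ID 3, Income 99999999)
Duplicate: repeated row removed (second ID 5 row)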

Raw Data

ID | Name         | Age  | Income   | Status
1  | " John Doe " | 25   | 15000    | Active
2  | Alice Smith  | NaN  | 22000    | Active
3  | Bob          | 30   | 99999999 | Inactive
4  | Eve          | 22   | 18000    | ACTIVE
5  | Charlie      | "45" | 35000    | Active
5  | Charlie      | "45" | 35000    | Active
7  | Dave         | 38   |          | Pending

Cleaned Data

ID | Name        | Age | Income | Status
1  | John Doe    | 25  | 15000  | Active
2  | Alice Smith | 32  | 22000  | Active
--- Row dropped: Outlier detected ---
4  | Eve         | 22  | 18000  | Active
5  | Charlie     | 45  | 35000  | Active
--- Row dropped: Duplicate detected ---
7  | Dave        | 38  | 28500  | Pending
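
The cleaning steps implied by the two tables can be sketched in plain Python. This is a minimal sketch, not the pipeline that produced the cleaned table: the source does not state the outlier rule or the imputation method, so the income cap `INCOME_CAP` and the median fill are assumptions, and the helper names `normalize` and `clean` are hypothetical.

```python
from statistics import median

# Raw rows as shown in the raw table; None stands for NaN / an empty cell.
RAW = [
    {"ID": 1, "Name": " John Doe ", "Age": 25, "Income": 15000, "Status": "Active"},
    {"ID": 2, "Name": "Alice Smith", "Age": None, "Income": 22000, "Status": "Active"},
    {"ID": 3, "Name": "Bob", "Age": 30, "Income": 99999999, "Status": "Inactive"},
    {"ID": 4, "Name": "Eve", "Age": 22, "Income": 18000, "Status": "ACTIVE"},
    {"ID": 5, "Name": "Charlie", "Age": "45", "Income": 35000, "Status": "Active"},
    {"ID": 5, "Name": "Charlie", "Age": "45", "Income": 35000, "Status": "Active"},
    {"ID": 7, "Name": "Dave", "Age": 38, "Income": None, "Status": "Pending"},
]

INCOME_CAP = 1_000_000  # assumed outlier threshold; not given in the source


def normalize(row):
    """Trim the name, cast Age to int, and normalize the Status casing."""
    age = row["Age"]
    return {
        "ID": row["ID"],
        "Name": row["Name"].strip(),
        "Age": int(age) if age is not None else None,
        "Income": row["Income"],
        "Status": row["Status"].capitalize(),
    }


def clean(rows, income_cap=INCOME_CAP):
    """Return cleaned copies of rows; the input list is left unchanged."""
    seen, kept = set(), []
    for row in map(normalize, rows):
        key = tuple(row.values())
        if key in seen:
            continue  # exact duplicate of an earlier row
        seen.add(key)
        if row["Income"] is not None and row["Income"] > income_cap:
            continue  # income above the assumed cap is treated as an outlier
        kept.append(row)

    # Fill remaining gaps with the column median of the kept rows (assumed rule).
    for col in ("Age", "Income"):
        fill = median(r[col] for r in kept if r[col] is not None)
        for r in kept:
            if r[col] is None:
                r[col] = fill
    return kept


if __name__ == "__main__":
    for r in clean(RAW):
        print(r)
```

Under these assumptions, `clean(RAW)` keeps IDs 1, 2, 4, 5 and 7 with the same trimmed, cast and normalized values as the cleaned table. The imputed cells come out as 31.5 (Age) and 20000 (Income) rather than 32 and 28500, because the fill rule used for the table is not stated.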