Here are some keypoints of what I did and discovered the last few days:
Here is a very good book of its kind https://www.edwardtufte.com/tufte/books_vdqi
From the description “The classic book on statistical graphics, charts, tables. Theory and practice in the design of data graphics, 250 illustrations of the best (and a few of the worst) statistical graphics, with detailed analysis of how to display data for precise, effective, quick analysis”.
Seaborn uses a technique called kernel density estimation , or KDE for short, to create a smoothed line chart over the histogram.
When we use the concat()
function to combine dataframes with the same shape and index, we can think of the function as “gluing” dataframes together.
Unlike the concat
function, the merge
function only combines dataframes horizontally (axis=1) and can only combine two dataframes at a time.
An inner join returns only the intersection of the keys, or the elements that appear in both dataframes with a common key.
Outer join: includes all data from both dataframes
Left join: includes all of the rows from the “left” dataframe along with any rows from the “right” dataframe with a common key; the result retains all columns from both of the original dataframes.
I also started a project on kaggle, the dataset “Red wine quality” which I think it will help me understand and apply some of the concepts of linear regression and classification methods. I also want to read again the concepts of regression in the book “Introduction to statistical learning” to better understand them through practical applications.