I have been making progress with the seminar on Business Intelligence, SQL and Power BI, and it is very interesting. Power BI is a powerful visualization tool that is quite easy to use, and I want to apply it to real data. The cool thing is that graphs change automatically if you change … Continue reading “20-12-2020”
Author Archives: Angelos88
14-11-2020
Today I completed the third chapter of the book An Introduction to Statistical Learning and did the exercises in R. It’s a really good book, and I am reading it for the second time. I want to read the whole book again because on the second pass I understand some concepts better. In combination … Continue reading “14-11-2020”
31-10-2020
These days I spent many hours trying to understand how scraping works, and I now understand the basics. It’s difficult to write scraping code that works for every webpage, and if the HTML is not well written, things get difficult. However, I picked up some basics by watching tutorials … Continue reading “31-10-2020”
03-10-2020
These days I completed the NY schools project and put together a summary of my data analysis projects on GitHub Pages, and it looks very nice. It is something I will keep updating with more and better projects. I also worked more on my CV and decided to take an online seminar in … Continue reading “03-10-2020”
28-09-2020
These days I worked with Dataquest and the command line, and also with histograms in Python and distributions like the Normal and the Uniform. An interesting observation about boxplots: a value is an outlier if it’s larger than the upper quartile by more than 1.5 times the difference between the upper quartile and the lower quartile (the difference is also called the interquartile … Continue reading “28-09-2020”
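The boxplot rule in the excerpt above can be sketched in a few lines of Python. This is only an illustration, not the code from the post: the function names `iqr_bounds` and `find_outliers` and the sample data are made up here, and the lower bound is the usual symmetric convention (lower quartile minus 1.5 times the interquartile range), which the truncated excerpt does not show.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) cut-offs of the 1.5 * IQR rule.

    Hypothetical helper for illustration; k=1.5 is the multiplier
    mentioned in the post.
    """
    # "inclusive" treats the data as the whole population, which matches
    # the default linear interpolation used by many plotting libraries.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return the values that fall outside the IQR-based bounds."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if v < lower or v > upper]


# Made-up sample data: nine ordinary values and one extreme one.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
bounds = iqr_bounds(sample)
outliers = find_outliers(sample)
print(bounds)
print(outliers)
```

Different tools compute quartiles slightly differently, so the exact cut-offs can vary a little between libraries.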
23-09-2020
I’m really excited today because I used the Basemap library in Python as part of a project on Dataquest. It will be really cool to be able to plot statistics on actual maps in my own projects. Here is a tutorial about using the library, which I found very helpful. I will also check this notebook … Continue reading “23-09-2020”
21-09-2020
Some interesting things I did over the last few days:
List Comprehensions
The loop below can be written with a single line of code:
    ints = [1, 2, 3, 4]
    times_ten = []
    for i in ints:
        times_ten.append(i * 10)
    print(times_ten)
    [10, 20, 30, 40]
It can be written like this: times_ten = [(i * 10) … Continue reading “21-09-2020”
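The excerpt is cut off before the one-line version, so the sketch below is an assumption about what a comprehension for this loop typically looks like, not a quote from the full post. The names `times_ten_loop` and `evens_times_ten` are added here only for comparison.

```python
ints = [1, 2, 3, 4]

# Loop version, as in the excerpt.
times_ten_loop = []
for i in ints:
    times_ten_loop.append(i * 10)

# Equivalent list comprehension: expression first, then the loop clause.
times_ten = [i * 10 for i in ints]

# An optional trailing "if" filters items (hypothetical extra example).
evens_times_ten = [i * 10 for i in ints if i % 2 == 0]

print(times_ten)
print(evens_times_ten)
```

The comprehension builds the list in one expression, so there is no need to create an empty list and append to it.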
18-09-2020
This is a very useful page for practicing regular expressions. It takes a lot of practice to get comfortable with them. I did some practice on Dataquest, but that is just an introductory step. I revisited some statistics concepts like simple random, stratified and cluster sampling on Dataquest, which does a pretty good job … Continue reading “18-09-2020”
15-09-2020
Here are some key points of what I did and discovered over the last few days: Here is a very good book of its kind: https://www.edwardtufte.com/tufte/books_vdqi From the description: “The classic book on statistical graphics, charts, tables. Theory and practice in the design of data graphics, 250 illustrations of the best (and a few of the worst) … Continue reading “15-09-2020”
11-09-2020
Today I continued the intermediate Python course on Dataquest, and I will probably do it along with the statistics course. This explains the difference between .iloc and .loc in pandas very clearly: https://stackoverflow.com/questions/31593201/how-are-iloc-and-loc-different#:~:text=loc%20gets%20rows%20(or%20columns,not%20present%20in%20the%20index. Uncomment in Python: Ctrl + / Also, I learned about frequency distributions in Python and the differences between histograms and barplots (I refreshed … Continue reading “11-09-2020”
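As a small illustration of the .loc versus .iloc difference mentioned above, here is a minimal sketch. The DataFrame, its index labels and the `score` column are invented for the example; the only assumption about pandas is the standard behaviour that .loc selects by label and .iloc selects by integer position.

```python
import pandas as pd

# Made-up data: the index labels (10, 20, 30) deliberately differ
# from the row positions (0, 1, 2).
df = pd.DataFrame({"score": [90, 75, 60]}, index=[10, 20, 30])

# .loc selects by label: the row whose index label is 10.
by_label = df.loc[10, "score"]

# .iloc selects by integer position: the first row, first column.
by_position = df.iloc[0, 0]

# Label slices include both endpoints; position slices exclude the end.
label_slice = df.loc[10:20]   # rows labelled 10 and 20
position_slice = df.iloc[0:1]  # only the first row

print(by_label, by_position)
print(len(label_slice), len(position_slice))
```

The slicing difference is the part that tends to surprise: `df.loc[10:20]` returns two rows here, while `df.iloc[0:1]` returns one.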