Bookmarks

A collection of helpful resources that I might want to find again in the future or point other people towards.

Advice: Faculty

  • Thread by Sarah Sheffield (@sarahsheffield) with advice for new faculty.
  • Course Free 10-week course on Google Drive about writing an academic syllabus. Course by @DLabree, recommended by Elena Aydarova (@aydarova).
  • Tweet by Gero Grams (@GeroGrams) on how to say no to things
  • Thread by Mine Dogucu (@MineDogucu) on learning/teaching (with) R. See also:
  • Thread by Wes Kao (@wes_kao) on managing up
  • Thread by Vrinda Nair (@VnVrinda) on 22 tools for your PhD Journey
  • Managing research careers tool by Edinburgh University: Thread and Webpage
  • Talk by Laura Albert (@lauraalbertphd) on time management: Do less, Do it faster, Do it at the right time.

Advice: Graduate Students

  • Thread by Matt Betts on 8 questions to consider before starting a PhD.
  • Thread by Maram Duncan (@MCDuncanLab) on the hidden curriculum for new grad students
  • Thread by @LifeAfterMyPhD on 5 low-stakes steps to set yourself up for an industry job search
  • Thread by @mathladyhazel on the best math books for self-learners
  • Thread by Alex Eble (@alexeble) on advice for thriving in a PhD. Full document here. Note: has US and economics focus, but translates well.
  • Website Stats Notes in the British Medical Journal. Like a dictionary but for stats words and methods.
  • Book Essential Math for Data Science by Thomas Nield. Recommended by Vicki Boykis for those looking for an intro/refresher on linear algebra, probability and statistics.

Agent-based modelling

  • Paper Review of agent-based models (preprint of JEL - chunky at 90 pages!)

Analysis and Asymptotics

  • Video Lectures by Steven Strogatz (@stevenstrogatz) on asymptotics and perturbation methods.

Causal Inference

  • Lecture Notes by Matt Blackwell, “Causal Inference with Applications”
  • Paper The taboo against explicit causal inference in nonexperimental psychology. Suggested by Brian Nosek (@BrianNosek)
  • Thread by Volodymyr Kuleshov (@volokuleshov) about the ICML 2022 tutorial on Causality and Fairness
  • Video Science before statistics: Causal Inference by Richard McElreath (3 hour crash course in causal inference)
  • Video Series Statistical Rethinking (2022) by Richard McElreath on Youtube

Coding (General)

Short-form

  • Article on setting up a private .gitignore to keep a clean codebase
  • Book / website on package development in Python. (Think Hadley & Bryan’s R packages but for Python) Recommended by @EmilyRiederer
  • Book The missing readme by Chris Riccomini and Dmitriy Ryaboy
  • Paper Ten Simple Rules for Taking Advantage of Git and GitHub
  • Slides by Ariel Muldoon (@aosmith16) on “More git and github - collaborators, merge conflicts and pull requests”
  • Docs for {renv}

Long-form

Docker

  • Thread by Alex Gold (@alexkgold) on getting started with Docker. Free online book

Databases

  • Article by architecture notes (@arcnotes) on “Things you should know about databases”

Datasets

  • Thread by R Ladies - Sources of Messy(ish) data

Data Ethics

  • The Verge - AI drug development makes chemical weapons

  • Nature - AI drug development makes chemical weapons

  • Richard McElreath recommendation of paper by Xiao-Li Meng on how data quality influences effective sample size

  • Imperial Explainable AI Seminars

  • Thread by Santiago (@svpino) on imbalanced datasets

  • Tweet by Adam Kruchten (@AdamKruchten) on when we care about marginal or conditional effects.

  • ASA article on the 2020 work and salary survey, showing that women tend to earn less in base salary and total income, but that in a regression gender is not a significant predictor of total income.

  • NPR story where Google finds that men are underpaid

  • Thread by Paul Hunermund (@PHunermund) on the above Google article.

  • Preprint on reconstructing large portions of training data from a trained neural network

  • Paper Do Datasets Have Politics? Disciplinary Values in Computer Vision Dataset Development. Suggested by Abeba Birhane (@Abebab)

  • Paper Big data loses to good data. Unrepresentative big surveys significantly overestimated US vaccine uptake.

  • Paper On extending LinearSHAP, TreeSHAP and DeepSHAP to RKHS-SHAP. Found through tweet by Siu Lun Chau (@Chau9991).

  • Chapter 33 on interpretability of the book Probabilistic Machine Learning: Advanced Topics by Kevin Murphy (@sirbayes), Been Kim (@_beenkim) and others.

  • Book by Claire McKay Bowen “Protecting your privacy in a data-driven world”

  • Tweet by Rasha Shrain (@rashaben) requesting reading materials on p-values and p-hacking

  • Course 12-week reading course on Ethics and Data Science by Rohan Alexander

Data Visualisation

  • Thread by Indrajeet Patil (@patilindrajeets) on the effective use of colours in data vis.
  • R Package {performance} for aesthetically pleasing ggplot-style diagnostic plots. (The qq plot even has tolerance intervals!)
  • Blog Post by Thomas Mock on Creating and using custom ggplot2 themes
  • Blog Posts by Amelia McNamara (@AmeliaMN) about Histograms and Kernel Density Estimation

History of Statistics

  • Course 10-week reading list on “History of Statistics and Data Sciences” by Rohan Alexander

Markdown

  • Tweet by Steve Bauman (@realstevebauman) pointing out that GitHub’s markdown now supports note and warning blockquotes
  • Tweet by Zev Ross on using the character ├ to get aesthetically pleasing subsections in RStudio
  • Tutorial on embedding Mentimeter into webpages (try with HTML slides???)

Memes

Science Communication

  • Thread by Carl Bergstrom (@CT_Bergstrom) and Ryan McGee (@RS_McGee) telling the story of a paper using a comic strip and stick-figure Darwin.
  • Thread by Tessa Davis on slide design to keep your audience engaged
  • Thread by Dorsa Amir (@DorsaAmir) on slide design
  • Website OpenPeeps - Open Source hand drawn individual characters
  • Blog Post by Kate Jolly (@katejolly6) on designing slides in xaringan with xaringanthemer and css.

Machine Learning

  • Blog Post on double descent in neural network performance

Optimisation

  • Video by Trefor Bazett (@TreforBazett) on using Lagrange multipliers to solve constrained optimisation problems
  • Video Series by @3blue1brown on constrained optimisation (hosted on Khan Academy)
  • Course Notes Advanced Data Analysis from an Elementary Point of View

Parquet

  • Thread by Pau Labarta Bajo (@paulabartabajo_)

Point processes

  • Course Material by Rick Schoenberg on point process models
  • Blog Post by Benjamin Cretois on fitting point process models in stan.

Professional Development

  • Tweet by Francisco Yirá (@francisco_yira) about designing a personal learning plan.

Quarto

SQL

  • Thread by Tom Carpenter (@tcarpenter216) on translating dplyr skills to SQL

  • Online resources for learning SQL, as recommended by Ijeoma Okereafor (@MeetIjeoma)

    • http://sqlbolt.com
    • http://w3schools.com/sql
    • http://mode.com/sql-tutorial
    • http://sqlteaching.com
    • http://SQLZoo.net
    • http://selectstarsql.com
    • https://pgexercises.com
  • SQL games recommended by Vikas Rajputin (@vikasrajputin)

    • SQL Island: https://sql-island.informatik.uni-kl.de/ (in German, but Chrome translation is pretty good)
    • SQL Murder Mystery: https://mystery.knightlab.com/
    • SQL Police Department: https://sqlpd.com/

Workflow

  • Jenny Bryan - Naming Things (Slides)
  • Talk by David Robinson on “The unreasonable effectiveness of public work”

Writing

  • Books on writing suggested by Helen Sword, author of Stylish Academic Writing.