Derrick Roberts robertsd

## normcore-llm.md

      
veekaybee / normcore-llm.md (1 file, 311 forks, 55 comments, 3261 stars; last active January 9, 2025 15:56)

Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.
Foundational Concepts


Pre-Transformer Models


## isolation_forest.md

      
veekaybee / isolation_forest.md (1 file, 1 fork, 0 comments, 1 star; created February 2, 2023 02:49)

Isolation forests versus decision trees

Isolation forest paper


Isolated (anomalous) points should have shorter paths, so they end up closer to the root of the tree
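
To make the note above concrete, here is a minimal sketch of the idea, not the paper's full algorithm: it uses one-dimensional data, random split points, and no subsampling. The function names (`isolation_depth`, `mean_depth`) and the toy data are illustrative assumptions.

```python
import random


def isolation_depth(points, target, rng, depth=0, max_depth=50):
    """Count the random splits needed before `target` sits alone in its partition."""
    if len(points) <= 1 or depth >= max_depth:
        return depth
    lo, hi = min(points), max(points)
    if lo == hi:
        return depth
    split = rng.uniform(lo, hi)
    # Follow only the side of the split that still contains the target.
    side = [p for p in points if (p < split) == (target < split)]
    return isolation_depth(side, target, rng, depth + 1, max_depth)


def mean_depth(points, target, n_trees=200, seed=0):
    """Average isolation depth of `target` over many random trees."""
    rng = random.Random(seed)
    total = sum(isolation_depth(points, target, rng) for _ in range(n_trees))
    return total / n_trees


# Toy data (made up for illustration): a tight cluster plus one far-away point.
data = [9.8, 9.9, 9.95, 10.0, 10.05, 10.1, 10.2, 50.0]
outlier_depth = mean_depth(data, 50.0)
inlier_depth = mean_depth(data, 10.0)
print(outlier_depth, inlier_depth)
```

Because the far-away point is usually cut off by the very first split, its average depth stays well below that of a point inside the cluster, which is the property an isolation forest scores on.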


## chatgpt.md

      
veekaybee / chatgpt.md (1 file, 39 forks, 4 comments, 350 stars; last active December 24, 2024 20:23)

Everything I understand about chatgpt

ChatGPT Resources

Context

ChatGPT appeared like an explosion on all my social media timelines in early December 2022. While I keep up with machine learning as an industry, I wasn't focused so much on this particular corner, and all the screenshots seemed like they came out of nowhere. What was this model? How did the chat prompting work? What was the context of OpenAI doing this work and collecting my prompts for training data?
I decided to do a quick investigation. Here's all the information I've found so far. I'm aggregating and synthesizing it as I go, so it's currently changing pretty frequently.
Model Architecture


## BuildingDataProducts.md

      
aglove2189 / BuildingDataProducts.md (1 file, 0 forks, 0 comments, 1 star; last active November 3, 2023 16:24)

How to Build Resilient Data Products

Every aspect of your product should contribute to one of these 5 principles:

Small
Fast
Reproducible
Transparent
Frictionless


## simple_dash.py
# ========== (c) JP Hwang 27/7/20  ==========
import dash
# Dash 2.0+ ships the core and HTML components inside the dash package.
from dash import dcc, html
from dash.dependencies import Input, Output

# shared_funcs is a project-local module (not shown here).
from shared_funcs import load_fig

fig = load_fig()

# External CSS for the app.
external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']

## td-in-r.r
### x is either a vector of numbers or a data frame with sums and weights. digest is a data frame.
merge = function(x, digest, compression=100) {
    ## Force the digest to be a data.frame, possibly empty
    if (!is.data.frame(digest) && is.na(digest)) {
        ## numeric(0) keeps the sum and weight columns; c() is NULL and would drop them
        digest = data.frame(sum=numeric(0), weight=numeric(0))
    }
    ## and coerce the incoming data likewise ... a vector of points has a default weight of 1
    if (!is.data.frame(x)) {
        x = data.frame(sum=x, weight=1)
    }

## iowa-liquor-sales-dataset.readme.md

      
dannguyen / iowa-liquor-sales-dataset.readme.md (2 files, 2 forks, 0 comments, 41 stars; last active October 30, 2024 19:04)

Cleaning, summing up the State of Iowa Liquor Sales dataset

Iowa Liquor Sales dataset via Socrata/data.iowa.gov

(preliminary exploration)

The state of Iowa has released an 800MB+ dataset of more than 3 million rows showing weekly liquor sales, broken down by liquor category, vendor, and product name (for example: STRAIGHT BOURBON WHISKIES, Jim Beam Brands, Maker's Mark).

This dataset contains the spirits purchase information of Iowa Class “E” liquor licensees by product and date of purchase from January 1, 2014 to current. The dataset can be used to analyze total spirits sales in Iowa of individual products at the store level.

You can view the dataset via Socrata
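
As a rough sketch of the store-level analysis described above, the snippet below totals sales per store and product using only the Python standard library. The column names (`store`, `category`, `vendor`, `product`, `sale_dollars`), the store numbers, and the dollar amounts are hypothetical placeholders, not the dataset's real schema or values; adjust them to match the actual export.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows: column names, store numbers, and amounts are
# made up for illustration and do not come from the real dataset.
SAMPLE_CSV = """store,category,vendor,product,sale_dollars
2633,STRAIGHT BOURBON WHISKIES,Jim Beam Brands,Maker's Mark,120.00
2633,STRAIGHT BOURBON WHISKIES,Jim Beam Brands,Maker's Mark,60.00
4829,STRAIGHT BOURBON WHISKIES,Jim Beam Brands,Maker's Mark,30.50
"""


def total_sales_by_store_and_product(csv_file):
    """Sum the sale_dollars column for each (store, product) pair."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        totals[(row["store"], row["product"])] += float(row["sale_dollars"])
    return dict(totals)


totals = total_sales_by_store_and_product(io.StringIO(SAMPLE_CSV))
for (store, product), dollars in sorted(totals.items()):
    print(store, product, dollars)
```

For the full 800MB+ file, the same function can be given an open file handle instead of the in-memory sample, since `csv.DictReader` streams rows one at a time.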