Michael J Clark wassname

🙃
View GitHub Profile
@wassname
wassname / symphypothesis.py
Last active August 2, 2024 02:55
fhypothesis: an easy way to display hypotheses in Python, kind of like assert
import sympy as sp
from typing import Dict, Any
from IPython.display import display
from sympy import init_printing
init_printing()
def shypothesis(hypothesis: str, variables: Dict[str, Any] = None, round=3, verbose=False):
    """
    Evaluate a hypothesis using SymPy, showing simplified equation and result.
@wassname
wassname / choice_tree.py
Last active June 3, 2024 13:44
For Hugging Face transformers, sometimes you want to constrain output to a JSON schema and record the probabilities of the choices/enums. I use it for rating and judging. It's much more efficient than sampling multiple times.
from jaxtyping import Float, Int
import torch
from torch.nn import functional as F
from torch import Tensor
from typing import List, Callable, Tuple, Dict, Optional
import pandas as pd
from transformers import AutoModelForCausalLM, AutoTokenizer
def get_valid_next_choices(choices_tokens, current_tokens):
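A rough illustration of the idea in the description above, not the gist's actual implementation (the preview is truncated): given a set of tokenized choices and the tokens generated so far, the valid next tokens are those that continue at least one choice, and the next-token scores can be renormalized over just those tokens. The body of `get_valid_next_choices` and the helper `choice_probs` are assumptions, and plain Python lists and dicts stand in for tensors.

```python
import math
from typing import Dict, List, Sequence


def get_valid_next_choices(choices_tokens: Sequence[Sequence[int]],
                           current_tokens: Sequence[int]) -> List[int]:
    """Return token ids that extend current_tokens toward at least one choice.

    Hypothetical body: the original gist's implementation is not shown.
    """
    n = len(current_tokens)
    valid = set()
    for choice in choices_tokens:
        # A choice is still reachable if it starts with the tokens emitted so
        # far and has at least one more token left to emit.
        if len(choice) > n and list(choice[:n]) == list(current_tokens):
            valid.add(choice[n])
    return sorted(valid)


def choice_probs(logits: Dict[int, float],
                 valid_tokens: Sequence[int]) -> Dict[int, float]:
    """Softmax restricted to valid_tokens (hypothetical helper)."""
    if not valid_tokens:
        return {}
    # Subtract the max logit for numerical stability before exponentiating.
    top = max(logits[t] for t in valid_tokens)
    exps = {t: math.exp(logits[t] - top) for t in valid_tokens}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}
```

Reading the probabilities off a single forward pass at each step is what makes this cheaper than sampling the same prompt many times and counting outcomes.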
@wassname
wassname / hf_perplexity.py
Last active December 7, 2024 05:42
Simple perplexity for Hugging Face models, similar to llama.cpp
# Directly taken from https://huggingface.co/spaces/evaluate-measurement/perplexity/blob/main/perplexity.py
# TODO replace with a strided version https://github.com/huggingface/transformers/issues/9648#issuecomment-812981524
import numpy as np
import torch
import itertools
from torch.nn import CrossEntropyLoss
from tqdm.auto import tqdm
import torch.nn.functional as F
from datasets import load_dataset, Dataset
from transformers import AutoTokenizer, AutoModelForCausalLM
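For reference, perplexity is the exponential of the mean negative log-likelihood per token. The sketch below shows only that final step, assuming the per-token natural-log probabilities have already been computed by a model; the function name `perplexity` and its input format are illustrative and not taken from the gist.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity from per-token natural-log probabilities.

    Illustrative helper: assumes log-probs were already gathered from a model.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Mean negative log-likelihood over the scored tokens.
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)
```

A model that assigns every token probability 1/4 has a perplexity of about 4, which matches the intuition of choosing uniformly among four options at each step.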
# Add e5-instruct-mistral layers, since their naming is different from the
# original mistral instruct one
from __future__ import annotations
from typing import Sequence
from .constants import MODEL_ARCH, MODEL_TENSOR, MODEL_TENSORS, TENSOR_NAMES
@0xdevalias
0xdevalias / _gh-cli-copilot-api.md
Last active December 9, 2024 07:39
Some notes and references while exploring the GitHub CLI extension for the GitHub Copilot CLI
@wassname
wassname / STOP_DOING_MATH.md
Last active November 7, 2024 00:32
It turns out LLMs can generate the STOP DOING MATH meme https://knowyourmeme.com/memes/stop-doing-math

STOP DOING MATH

  • NUMBERS WERE NOT SUPPOSED TO BE GIVEN NAMES
  • YEARS OF COUNTING yet NO REAL-WORLD USE FOUND for going higher than your FINGERS
  • Wanted to go higher anyway for a laugh? We had a tool for that: It was called "GUESSING"
  • "Yes please give me ZERO of something. Please give me INFINITE of it" - Statements dreamed up by the utterly Deranged

LOOK at what Mathematicians have been demanding your Respect for all this time, with all the calculators & abacus we built for them

@wassname
wassname / cuda_11.8_installation_on_Ubuntu_22.04
Last active October 6, 2023 04:20 — forked from MihailCosmin/cuda_11.8_installation_on_Ubuntu_22.04
Instructions for CUDA v11.8 and cuDNN 8.7 installation on Ubuntu 22.04 for PyTorch 2.0.0
#!/bin/bash
### steps ####
# verify the system has a cuda-capable gpu
# download and install the nvidia cuda toolkit and cudnn
# setup environmental variables
# verify the installation
###
### to verify your gpu is cuda-enabled, check
@mrtysn
mrtysn / openai-custom-instructions.txt
Last active October 6, 2023 10:53
ChatGPT Custom Instructions
>>> What would you like ChatGPT to know about you?
I have been using computers since 1997. I know the ins and outs of them.
Furthermore, I want to get a lot of high quality work done in a very short amount of time.
I try to keep my search time at O(1) and I expect the same from others.
Finding a way to solve a problem should be done very quickly with a very high accuracy.
>>> How would you like ChatGPT to respond?
— Be highly organized
@izikeros
izikeros / README.md
Last active March 26, 2024 18:11
[split text fixed tokens] Split text into parts with limited length in tokens #llm #tokens #python

Text Splitter

Code style: black MIT license

A Python script for splitting text into parts with controlled (limited) length in tokens. This script utilizes the tiktoken library for encoding and decoding text.
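A minimal sketch of token-limited splitting, assuming a simple fixed-size chunking strategy (the script itself may split differently). The function `split_by_tokens` and the toy whitespace tokenizer are hypothetical names used for illustration; the encode and decode callables are passed in so the example runs without any third-party dependency.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def split_by_tokens(text: str,
                    max_tokens: int,
                    encode: Callable[[str], Sequence[T]],
                    decode: Callable[[Sequence[T]], str]) -> List[str]:
    """Split text into parts of at most max_tokens tokens each (hypothetical helper)."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    tokens = encode(text)
    # Walk the token sequence in fixed-size windows and decode each window.
    return [decode(tokens[i:i + max_tokens])
            for i in range(0, len(tokens), max_tokens)]


# Toy whitespace tokenizer, used only so the example is self-contained.
def toy_encode(text: str) -> List[str]:
    return text.split()


def toy_decode(tokens: Sequence[str]) -> str:
    return " ".join(tokens)


parts = split_by_tokens("one two three four five", 2, toy_encode, toy_decode)
```

With tiktoken installed, the same function could be called with `enc = tiktoken.get_encoding("cl100k_base")` and `enc.encode` / `enc.decode` in place of the toy functions; the choice of encoding name here is an assumption.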

@qxcv
qxcv / from_gymnasium.py
Last active August 21, 2024 11:54
Gymnasium envs with Dreamer v3, along with a Minigrid example
# from_gym.py adapted to work with Gymnasium. Differences:
#
# - gym.* -> gymnasium.*
# - Deals with .step() returning a tuple of (obs, reward, terminated, truncated,
# info) rather than (obs, reward, done, info).
# - Also deals with .reset() returning a tuple of (obs, info) rather than just
# obs.
# - Passes render_mode='rgb_array' to gymnasium.make() rather than .render().
# - A bunch of minor/irrelevant type checking changes that stopped pyright from
# complaining (these have no functional purpose, I'm just a completionist who