u/adssidhu86

Andrej Karpathy is joining Anthropic. Anthropic is on a hiring + acquisition spree.

Andrej Karpathy is joining Anthropic and going back into core AI research. He has been instrumental in creating great learning courses throughout his career. His computer vision lecture was what got me into AI, and his build GPT-2 from scratch video remains the most goated lesson. He was planning to solve learning and education using AI, so this news is a bit of a surprise. What do you think of these moves from Anthropic?

u/adssidhu86 — 17 hours ago

New UI Preview feature on Claude Code is really great.

I noticed a small but interesting Claude Code behavior.

I gave it a screenshot and asked it to make a navbar prettier.

Instead of immediately editing CSS, it first asked me to choose a direction:

  1. Refined gold pill
  2. Sparkle prefix
  3. Glow halo around text

That is the part I found useful.

For frontend work, “make it prettier” is not a coding instruction. It is a taste decision.

Claude Code did not jump straight from prompt to diff. It stopped at the subjective layer first.

The flow felt like:

visual context → design options → human choice → code edit

u/adssidhu86 — 3 days ago

I made a free 50-min lesson on Hugging Face model repos, datasets, and practical AI engineering skills

I’m building a free open AI cohort, and I just published Lesson 01.

The lesson is called Hugging Face Beyond Upload.

Most beginner tutorials treat Hugging Face like this:

download model → run notebook → move on

I wanted to teach it more like an AI engineering skill.

The lesson covers:

- how to navigate Hugging Face model repos properly

- how model files are structured

- how config.json connects to the actual model class

- how to move from a model page to the relevant Transformers code

- how to understand model files instead of treating them as magic blobs

- why small models like Qwen3-0.6B are useful for learning

- why Markdown matters in AI workflows: model cards, README files, GitHub issues, Discord, Cursor/Claude Code planning files

- how to think of open models as infrastructure / supply chain
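A minimal sketch of the config.json point above, assuming the two fields most model repos expose (`architectures` and `model_type`) and assuming the usual Transformers source layout of `models/<model_type>/modeling_<model_type>.py`. The config text is an illustrative excerpt, not a copy of any real file, and `locate_model_code` is a hypothetical helper name.

```python
import json

# Illustrative excerpt only; a real config.json carries many more fields.
SAMPLE_CONFIG = """
{
  "architectures": ["Qwen3ForCausalLM"],
  "model_type": "qwen3"
}
"""

def locate_model_code(config_text: str) -> dict:
    """Map config.json fields to a class name and a likely source file."""
    config = json.loads(config_text)
    model_type = config["model_type"]
    # "architectures" names the model class the checkpoint was saved from.
    class_name = config["architectures"][0]
    # Assumed repo layout; verify against the Transformers source tree.
    source_path = f"src/transformers/models/{model_type}/modeling_{model_type}.py"
    return {"class_name": class_name, "source_path": source_path}

info = locate_model_code(SAMPLE_CONFIG)
print(info["class_name"])
print(info["source_path"])
```

Reading the two fields by hand like this is only for learning; in practice the library resolves the class for you when it loads a model.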

The biggest section is on datasets.

I show 3 ways to inspect Hugging Face datasets:

  1. Croissant metadata endpoint

  2. Data Studio / browser dataset viewer

  3. load_dataset with Python, pandas, and plots

We inspect columns, categories, response lengths, short examples, long examples, distributions, and how to make an early judgment about dataset quality before using it for training or fine-tuning.
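A rough sketch of that kind of first-pass inspection, using only the standard library so it runs anywhere. The column names `response` and `category` and the three rows are made up for illustration; with the `datasets` library, the rows could come from iterating over the result of `load_dataset("<repo_id>", split="train")`, which yields one dict per example.

```python
from collections import Counter
from statistics import median

def summarize(rows, text_key="response", label_key="category"):
    """Collect the quick facts worth checking before any training run."""
    lengths = [len(row[text_key]) for row in rows]
    shortest = min(rows, key=lambda row: len(row[text_key]))
    longest = max(rows, key=lambda row: len(row[text_key]))
    return {
        "columns": sorted(rows[0].keys()),
        "category_counts": dict(Counter(row[label_key] for row in rows)),
        "median_chars": median(lengths),
        "shortest": shortest[text_key],
        "longest": longest[text_key],
    }

# Hypothetical stand-in rows; replace with rows from a real dataset.
rows = [
    {"instruction": "Capital of France?", "response": "Paris.", "category": "qa"},
    {"instruction": "Sum a list in Python.",
     "response": "Loop over the list and add each item to a running total.",
     "category": "coding"},
    {"instruction": "Is 7 prime?", "response": "Yes.", "category": "qa"},
]

summary = summarize(rows)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Very short responses, a skewed category count, or a few extremely long outliers are the kind of early signals this is meant to surface.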

The lesson also sets up the next part, where we run Qwen3 directly in C, so learners can understand what libraries like Transformers are doing behind the scenes.

I think this is an important skill for people trying to get into AI/ML jobs.

Not just “I used an LLM API”.

But:

Can you open a model repo and understand what is inside?

Can you inspect a dataset before training?

Can you connect model files to actual source code?

Can you reason about the quality of data before fine-tuning?

Video:

https://youtu.be/MjZio-A9oUY

Lesson page:

https://cohort.bubblnet.com/lessons/lesson-1-huggingface-beyond-upload

I’d genuinely appreciate feedback from people here:

- Is this the right level for learners trying to move from “using AI tools” to understanding models/datasets?

- What would you expect juniors applying for AI/ML roles to know about Hugging Face?

- Should I go deeper into model internals first, or datasets/training pipelines first?

u/adssidhu86 — 3 days ago

This MTP Pull Request merge is getting more attention than model drops!

This MTP pull request merge is getting more attention than many model drops.

I first noticed MTP while looking at Qwen3.5-0.8B, and now llama.cpp support makes the whole thing more interesting.

My current understanding is that MTP mainly improves token generation, not prompt processing.

So it helps when the model is writing a lot:

chat, coding, long answers, agents, synthetic data, local assistants.

But if the workload is mostly huge prompt + short answer, then prompt processing is still the bottleneck.

People are mentioning around 1.5x to 1.8x faster token generation in some setups.
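A back-of-the-envelope way to see the difference, assuming total latency is simply prompt time plus generation time and that MTP only speeds up the generation part. The throughput numbers (1000 tok/s prompt processing, 20 tok/s generation) are hypothetical, and 1.5x is the low end of the range people are quoting.

```python
def end_to_end_speedup(prompt_tokens, output_tokens,
                       pp_tok_per_s, tg_tok_per_s, tg_speedup):
    """Overall speedup when only token generation gets faster."""
    prompt_time = prompt_tokens / pp_tok_per_s      # unchanged by MTP
    base_gen_time = output_tokens / tg_tok_per_s
    fast_gen_time = base_gen_time / tg_speedup      # generation sped up
    return (prompt_time + base_gen_time) / (prompt_time + fast_gen_time)

# Short prompt, long answer (chat, coding, agents).
chatty = end_to_end_speedup(200, 2000, 1000.0, 20.0, 1.5)
# Huge prompt, short answer (long-context lookup).
prompt_heavy = end_to_end_speedup(32000, 100, 1000.0, 20.0, 1.5)

print(f"generation-heavy: {chatty:.2f}x")        # about 1.50x
print(f"prompt-heavy:     {prompt_heavy:.2f}x")  # about 1.05x
```

Under these assumed numbers the generation-heavy case gets almost the full 1.5x, while the prompt-heavy case barely moves.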

My question is: how useful is this overall in real local AI workflows?

Is MTP going to matter mainly for long generation and agent loops, or will it become a default feature people expect in small local models?

u/adssidhu86 — 4 days ago
▲ 7 r/AILearningHub+1 crossposts

I made a free 50-min lesson on how to navigate Hugging Face beyond just downloading models

I’m building a free open AI cohort, and I just published Lesson 01.

The lesson is called Hugging Face Beyond Upload.

Most beginner tutorials treat Hugging Face like this:

download model → run notebook → move on

I wanted to teach it more like an AI engineering skill.

The lesson covers:

- how to navigate Hugging Face model repos properly
- how model files are structured
- how config.json connects to the actual model class
- how to move from a model page to the relevant Transformers code
- how to understand model files instead of treating them as magic blobs
- why small models like Qwen3-0.6B are useful for learning
- why Markdown matters in AI workflows: model cards, README files, GitHub issues, Discord, Cursor/Claude Code planning files
- how to think of open models as infrastructure/supply chain

The biggest section is on datasets.

I show 3 ways to inspect Hugging Face datasets:

  1. Croissant metadata endpoint
  2. Data Studio / browser dataset viewer
  3. load_dataset with Python, pandas, and plots

We inspect columns, categories, response lengths, short examples, long examples, distributions, and how to make an early judgment about dataset quality before using it for training or fine-tuning.

The lesson also sets up the next part, where we run Qwen3 directly in C, so learners can understand what libraries like Transformers are doing behind the scenes.

Video:
https://youtu.be/MjZio-A9oUY

Cohort page:
https://cohort.bubblnet.com/lessons/lesson-1-huggingface-beyond-upload

I’d genuinely appreciate feedback from people here:

- Is this the right level for learners trying to move from “using AI tools” to understanding models/datasets?
- What would you add to a beginner-friendly Hugging Face lesson?
- Should I go deeper into model internals first, or datasets/training pipelines first?

u/adssidhu86 — 6 days ago

Where are small models like Qwen3 0.6B and Qwen3.5 0.8B used? Hugging Face shows 2.88 million downloads this month. [D]

I can see 2.88 million downloads per month for the small Qwen3.5 model. I tried using the earlier 0.6B model in a deep research workflow, and it was very difficult to get anything done with it.

  • First, they have a very surface-level understanding of concepts. Poor semantic understanding means they can get confused about the topic or the task.
  • JSON outputs are often broken. Adding a layer of checks on top took up much of my time while working with these models.
  • Slow responses. This depends on a lot of factors and can actually be improved, but a slow response is still a buzzkill most of the time.
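For the JSON point, the layer of checks can be fairly small. This is a minimal sketch, not what I actually ran: `parse_model_json` and `generate_with_retries` are hypothetical helper names, `generate` stands for whatever function calls the model, and the canned replies below only simulate a small model's output.

```python
import json

def parse_model_json(raw, required_keys):
    """Return a dict if `raw` contains a usable JSON object, else None."""
    # Small models often wrap JSON in prose, so cut to the outermost braces.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not set(required_keys).issubset(data):
        return None
    return data

def generate_with_retries(generate, prompt, required_keys, max_attempts=3):
    """Call the model again until its output passes the checks."""
    for _ in range(max_attempts):
        parsed = parse_model_json(generate(prompt), required_keys)
        if parsed is not None:
            return parsed
    return None

# Simulated model: first reply is truncated, second is valid but wrapped in prose.
replies = iter([
    'Sure! {"topic": "MTP"',
    'Here you go: {"topic": "MTP", "summary": "faster decoding"} Hope it helps',
])
result = generate_with_retries(lambda prompt: next(replies),
                               "Summarize as JSON", {"topic", "summary"})
print(result)
```

Each retry costs another full generation, so with a slow model the checks and the latency problem compound.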

I am very curious how the community is using these models.

reddit.com
u/adssidhu86 — 9 days ago

u/adssidhu86 — 9 days ago

I was able to create very good posters using a prompt I came across on X.
Using these images, I want to create 3D mockups. I am using Claude to create a mockup. Is there a better tool for this?

u/adssidhu86 — 12 days ago

Anthropic just announced they’re doubling Claude Code’s 5-hour limits for Pro/Max/Team plans and removing peak-hour reductions. They also said Opus API rate limits are being increased.

But I’m confused about the practical effect of this.

From what I understand, Claude Code already has:

a rolling 5-hour/session-style limit

plus a separate weekly quota/shared usage pool

So if the 5-hour allowance is now larger, doesn’t that just let heavy users burn through their weekly allocation faster?

Example:

before: maybe you’d get throttled by the 5-hour cap first

now: you can do much more in one burst

but the weekly bucket is still finite

Wouldn’t this shift the bottleneck from “session limits” → “weekly exhaustion”?

Or did Anthropic also silently raise the weekly quotas alongside the 5-hour increase?

Trying to understand whether this is:

a real net increase in usable compute, or

mostly a smoother short-term experience for bursty workflows.

From the docs/help pages it still looks like weekly limits exist separately from session limits.
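A toy model of the question, assuming usage is measured in abstract units, that the weekly bucket did not change, and that a heavy user runs every session up to its cap. All numbers are made up; the real limits are not published in this form.

```python
def first_bottleneck(session_cap, weekly_cap, sessions_per_week):
    """Return (usable units per week, which limit is hit first)."""
    demand = session_cap * sessions_per_week  # every session run to its cap
    usable = min(demand, weekly_cap)
    limiter = "weekly" if demand > weekly_cap else "session"
    return usable, limiter

# Hypothetical heavy user: 10 full sessions per week, weekly bucket of 120 units.
before = first_bottleneck(session_cap=10, weekly_cap=120, sessions_per_week=10)
after = first_bottleneck(session_cap=20, weekly_cap=120, sessions_per_week=10)

print("before:", before)  # (100, 'session')
print("after: ", after)   # (120, 'weekly')
```

In this toy case doubling the session cap raises usable weekly work from 100 to 120 units rather than to 200, and the limit that bites moves from the session cap to the weekly bucket.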

u/adssidhu86 — 14 days ago

u/adssidhu86 — 14 days ago
▲ 13 r/learnmachinelearning+1 crossposts

Hey everyone,

I’m building a free, open AI engineering cohort called First Break AI and wanted to share it here for feedback.

Link: https://cohort.bubblnet.com/

The idea is simple: help beginners and early builders move beyond passive tutorial-watching and actually build proof of work.

The cohort is structured around a practical journey:

  1. Ship something real first

    Start with GitHub, Quarto, a public learning/blog site, and AI coding tools.

  2. See inside the machine

    Run a small model locally and understand what happens from tokenization to generation.

  3. Learn inference properly

    KV cache, sampling, chat templates, quantization, serving, batching, vLLM/TGI/llama.cpp-style concepts.

  4. Learn training fundamentals

    PyTorch, training loops, data pipelines, LoRA/QLoRA, DDP/FSDP, W&B, validation loss, and how to read training curves.

  5. Build an AI product

    APIs, RAG, agents, frontend/backend integration, deployment, monitoring, and iteration.

  6. Prove it

    End with either a capstone project or a meaningful open-source contribution.

Why I’m making this:

A lot of AI learning material is either too shallow (“just use this API”) or too abstract (“read the paper and good luck”). I wanted something in the middle: practical, systems-oriented, and portfolio-driven.

It’s not a paid course. No certificate. No guarantee of a job. The goal is to help people build enough real work that their GitHub, blog, project, or PR speaks for them.

Who it’s for:

- students

- career switchers

- software engineers moving into AI

- people who know some Python but feel lost around real AI systems

- people who want to understand inference/training instead of only prompting models

I’d love feedback from this subreddit:

- Is the roadmap too ambitious for beginners?

- What would you remove?

- What would you add?

- What kind of capstone project would make this most useful for someone trying to break into AI?

Again, the cohort is free/open. I’m sharing it mainly to get feedback and hopefully make it useful for learners.

Link: https://cohort.bubblnet.com/

u/adssidhu86 — 9 days ago
▲ 3 r/Agentic_Marketing+1 crossposts

Migrating from a free or default domain offered by Wix, GitHub, or Cloudflare to your own custom domain sounds like a great idea. However, this step can completely kill traffic to your site.

This happened to my site, and I am wondering if anyone is looking into this. It sounds like a perfect job for AI agents: fixing SEO and indexing issues.

I created a site using the free domain provided by a no-code platform and started getting traffic. Then I moved the site to a custom domain, and that killed the traffic.

reddit.com
u/ShinchanBoo08 — 18 days ago