
What would be the set of skills that would put you in the category of LLM talent that is in extreme short supply?

Just curious what the current bar is here and which of the LLM-related skills might be worth building.



Being able to train base LLMs. This is currently an alchemical skill since you can't learn it at school. This can be further split into infrastructure engineering (managing GPU clusters ain't easy), data gathering and cleaning (at terabyte scale), the training itself, etc.

Being very good at fine-tuning for a particular goal. It's much easier to learn fine-tuning, so standards are higher to stand out.

Being able to come up with architectural improvements for LLMs, aka the researcher path.

Wages start at $250k for grads at the big AI companies.


Funny, you sort of describe me.

1. For a BERT-scale model, all you need is a good codebase from GitHub (I had some luck with this one [0]) and a few weeks of trial and error. I want to try training T5 or LLaMA, but don't have the resources needed. Of course, training models with more than 100B parameters is another level of labyrinth.

2. Fine-tuning mostly comes down to how well you understand the task and the data you are dealing with. Since the BERT paper focuses on the GLUE benchmark, I've become very proficient at fine-tuning on GLUE and eventually got sick of it.

3. Made some architectural improvements to BERT, got decent results so I wrote a paper, and got rejected because the reviewers wanted a head-on evaluation against some well-funded papers from Google.

4. Not in my country. Damn, I am envious.

[0] https://github.com/IntelLabs/academic-budget-bert



