TLDR: Extreme Summarization of Scientific Documents

ultrastito · on Nov 28, 2020

That's what abstracts are for.

justarandomq · on Nov 28, 2020

After some thought I agree with you that this is the wrong problem to solve.

I took a narrative detour I wanted to share:

Suppose we treat a scientific paper as analogous to a piece of mineral ore (in terms of raw content, and ignoring the written symbols for the sake of the analogy) extracted from some mine or quarry. This ore is somehow useful to someone, even if its value is structural: the shingles on an academic roof or a heavyweight desk. What a summarizer attempts to do is use a generic refinement process that will grind up the ore and then separate the components of interest, such as iron, uranium, or gold.

Anyone who thinks that all of metallurgy reduces to throwing the slab into a machine and having it spew out the precious metals will instead find more complexity than they bargained for, and more questions about machines and methods to resolve. Gold, iron, and uranium all have different extraction processes.

I believe this analogy may give some insight into which problems to solve with AI instead: focus on the kind of discoveries that have advanced "metallurgy", namely discovering and understanding the structure of the ore and its contents (scientific papers) and their relation to the technologies of the time, rather than on the philosopher's stone of a 'summarizing' process, which is more akin to a hammer that makes everything look like a nail.

kreeben · on Nov 28, 2020

>> this is the wrong problem to solve

Highly intelligent human beings have a natural ability to summarize big ideas into TLDRs. Are humans basically a bunch of "summarizers"? Probably not. Is this ability to summarize or compress big ideas into smaller, more condensed pieces of information important to the human race? Yes, I would say that it is. So to me, this is certainly one of those problems that we correctly attempt to solve.

R0b0t1 · on Nov 28, 2020

Some abstract too big, author word smaller good

capableweb · on Nov 28, 2020

Speaking of abstracts, did you read the abstract of the paper you're commenting on? I don't think you did, because it outlines how this approach is different, and maybe if there was something better than the abstract, you would have read it and not assumed it's the same as what we're using abstracts for today.

In short, here are the major differences:

> SciTLDR contains both author-written and expert-derived TLDRs

> CATTS improves upon strong baselines under both automated metrics and human evaluations

jessfyi · on Nov 28, 2020

Abstracts are important (and clearly key in generating these TLDRs), but when it comes to ranking and recommending other papers (not to mention noting whether a new paper has content that can actually push a field forward) an abstract just isn't enough.

mschuetz · on Nov 28, 2020

Abstracts nowadays are way too large, bordering on a full-blown introduction. I agree that's what they should be for, but in practice they are not.

austinjp · on Nov 28, 2020

Everybody * scans papers for the piece of info they happen to be searching for, before reading them in any detail. The abstract should contain that, but might not. And nobody * reads the entire abstract anyway.

A clinician scanning a medical paper is looking for patient relevance: should they use the approach described? The statistical details are too intimidating, the preamble is irrelevant, they know the scope of the problem already.

This is not what "should" happen, but it is what actually happens.

The gap between published findings and clinical practice is several years. The peer review and publication processes are way out of touch with clinical reality.

On top of this, people find articles using Google and read them on their phones. (In reality, they read summarised opinion pieces found via Google.)

A systematic reviewer may read papers in full. But even they scan papers for inclusion/exclusion criteria first. The deeper the information is buried, the greater the risk of misclassification. I'm not suggesting that TLDRs will fix this, it's just another data point in why we're seeing TLDRs being created.

* "Everybody" and "nobody" here excludes researchers :)

KineticLensman · on Nov 28, 2020

In some situations abstracts serve as bibliographic metadata rather than a summary of the content. Examples include cases where the content is hidden behind a paywall or, in defence, when a paper's content is classified in some way but the existence of the paper itself is not. In both cases, the abstract may help you decide whether it is worth accessing the full paper, but on its own won't give you an answer. E.g., "we studied X" but not "and concluded Y".

Obviously abstracts can include a content summary as well as bibliographic metadata, but not all do.

pcrh · on Nov 28, 2020

This kind of effort serves the function of helping people to _approximately_ "know what is known", but it's really not very useful to the more important part of research efforts, which is to know what is not known.

thereisnospork · on Nov 28, 2020

A large part of research is spent on the understanding of what is known; parsing papers is part and parcel for professors, grad students, and corporate R+D alike.

No idea if their approach is useful, but they are tackling a worthwhile problem.

visarga · on Nov 28, 2020

> No idea if their approach is useful, but they are tackling a worthwhile problem.

When there are 1000+ papers every week in your field you need some advanced tools. It's hard to read everything, it's O(N).

thorncorona · on Nov 28, 2020

Very cool.

Consider the paper's abstract: "We introduce TLDR generation, a new form of extreme summarization, for scientific papers. TLDR generation involves high source compression and requires expert background knowledge and understanding of complex domain-specific language. To facilitate study on this task, we introduce SciTLDR, a new multi-target dataset of 5.4K TLDRs over 3.2K papers. SciTLDR contains both author-written and expert-derived TLDRs, where the latter are collected using a novel annotation protocol that produces high-quality summaries while minimizing annotation burden. We propose CATTS, a simple yet effective learning strategy for generating TLDRs that exploits titles as an auxiliary training signal. CATTS improves upon strong baselines under both automated metrics and human evaluations. Data and code are publicly available at this https URL."

The algorithm summarizes it as:

“We introduce TLDR generation, a new form of extreme summarization, for scientific papers that produces high-quality summaries while minimizing annotation burden.”

Der_Einzige · on Nov 27, 2020

This is very neat work, and I would never have predicted that abstractive summarization would end up advancing so much more quickly in general than extractive summarization did from transformers being introduced. Makes me wish that simple highlighting of a document at the word level was actually a sorta "solved" (gives compelling output more often than not) problem like condensed abstractive summarization is...

justarandomq · on Nov 28, 2020

What you get from applying TLDR to their paper:

We introduce SCITLDR, a new multi-target data set of 5.4KTLDRs over 3.2Kpapers.

Keeping pdf's copy-paste artifacts:

We introduceTLDRgeneration, a new formof extreme extreme summarization, for scientific pa-pers.

Adding intro and conclusion (optional):

We introduce SCITLDR, a new data set of 5.4KTLDRs over 3.2Kpapers.

[0] https://scitldr.apps.allenai.org/

austinjp · on Nov 28, 2020

Yeah. I happen to have been looking at this problem in my spare time recently. I tried a bunch of abstractive AIs and approaches, and none produce consistently usable results.

I'm sticking with extractive approaches plus a bunch of hard-coded general and domain-specific rules for now.
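For readers wondering what "extractive plus hard-coded rules" could look like, here is a minimal sketch, not the commenter's actual code. It scores each sentence by average word frequency, boosts sentences that open with a cue phrase, and skips boilerplate. The stopword list, cue phrases, skip patterns, and the 1.5 boost are all hypothetical values chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical rule sets, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "for", "that", "is", "we", "on", "this"}
CUE_PHRASES = ("we introduce", "we propose", "we show", "we present")  # "general" rule
SKIP_PATTERNS = (re.compile(r"publicly available", re.I),)  # boilerplate rule


def split_sentences(text):
    # Naive splitter: break on whitespace that follows ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased alphanumeric tokens with stopwords removed.
    return [w for w in re.findall(r"[a-z0-9]+", sentence.lower()) if w not in STOPWORDS]


def extract_tldr(text):
    """Return the single highest-scoring sentence, copied verbatim from the text."""
    sentences = split_sentences(text)
    freqs = Counter(w for s in sentences for w in tokenize(s))
    best, best_score = "", float("-inf")
    for s in sentences:
        if any(p.search(s) for p in SKIP_PATTERNS):
            continue  # rule: drop boilerplate sentences
        words = tokenize(s)
        if not words:
            continue
        score = sum(freqs[w] for w in words) / len(words)
        if s.lower().startswith(CUE_PHRASES):
            score *= 1.5  # rule: favour contribution statements (assumed weight)
        if score > best_score:
            best, best_score = s, score
    return best
```

Because the output is always a sentence copied from the input, it cannot invent claims, which is the main appeal of this style over abstractive models.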

m1sta_ · on Nov 28, 2020

Yikes

_5p0g · on Nov 28, 2020

Not sure what I was expecting. It gave me back the first line of the abstract as response. (For anyone wondering, the paper I tried was: "A Heterarchy of values determined by the topology of nervous nets" by Warren S. McCulloch.)

petercooper · on Nov 28, 2020

Oh! I listened to Scott Hanselman interview one of the authors of this paper on his podcast the other day. It might interest some of you as she explains it all in a very accessible way: https://www.hanselminutes.com/763/tldr-extreme-summarization...

swyx · on Nov 28, 2020

(hey peter!) I did get very excited about it when I heard about it, but cooled significantly when I learned that it was trained only on cs papers and required the full abstract plus paper text. nice proof of concept but going to need significant work to generalize for, say, a newsletter business like yours

petercooper · on Nov 29, 2020

Ah yeah, I definitely wouldn't want to use this sort of thing on newsletters. I'm against automated curation generally. It's not what we're about :-)

anonymousDan · on Nov 28, 2020

This is potentially a godsend for me - I was faced with having to write a 1 paragraph summary of 70 student dissertations for an accreditation process next week. Abstracts are too long. Fingers crossed it works as advertised!

freerangebat · on Nov 28, 2020

“Why use lot word when few word do trick.”

etaioinshrdlu · on Nov 28, 2020

I played around with this on the demo page and found that while the generated "TLDRs" are pretty good, they tend to be sentences composed of fragments of existing sentences. Basically, it seems vaguely extractive in nature. Never did I see it summarize a concept in new words, or try to dumb down a complicated concept further than the original paper did. Given the results of GPT-3 I would think that it should be possible to do much better by now, at least with enough data and compute time.
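One rough way to put a number on "vaguely extractive" is to measure how many of a summary's word n-grams also appear verbatim in the source. The sketch below is an illustration under assumptions, not something from the paper or the demo: the function names and the default of n=3 are made up here.

```python
import re


def ngrams(text, n):
    # Lowercased alphanumeric word n-grams, in order of appearance.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def copied_fraction(summary, source, n=3):
    """Fraction of the summary's n-grams that occur verbatim in the source."""
    source_ngrams = set(ngrams(source, n))
    summary_ngrams = ngrams(summary, n)
    if not summary_ngrams:
        return 0.0  # summary shorter than n words
    hits = sum(1 for g in summary_ngrams if g in source_ngrams)
    return hits / len(summary_ngrams)
```

A value near 1.0 suggests the summary is stitched together from copied fragments; a value near 0.0 suggests it was written in new words.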

arolihas · on Nov 28, 2020

I think you’re underestimating how hard what you’re describing is. GPT-3 can mimic the language of reasoning but that doesn’t mean it’s capable of higher order reasoning.

FL33TW00D · on Nov 28, 2020

After reading through https://www.gwern.net/GPT-3 , I suspect that GPT-3 is capable of higher order reasoning, given the right motivation (prompt).

arolihas · on Nov 28, 2020

It’s impressive but doesn’t the need for a “good” prompt kind of show it doesn’t have the strong reasoning required to do the task you’re describing? Also there’s some interesting critique here https://www.lesswrong.com/posts/ZHrpjDc3CepSeeBuE/gpt-3-a-di...

msamwald · on Nov 28, 2020

Being very efficient at mostly extractive summarization and abstaining from abstractive summarization does seem a better bet though, because fewer things can go wrong and it is easier to check the summaries against the full text.

behnamoh · on Nov 27, 2020

While there are lots of TLDR websites out there, I want to know how this one is different from them. I get it; many scientific papers are to some extent bs, and many are just wrong. For PhDs, it's a hassle to go through all of that bs to find something that is actually true. I feel like PhDs basically have to spend hundreds of hours reading papers that don't really benefit them. Tools like this could probably help with that, but as long as scientific success is measured by how many papers you've published and/or how long your papers are, I don't see any hope of actually doing science in the coming years, when academia will be essentially "saturated" with papers.

tyingq · on Nov 28, 2020

It's Apache licensed and on github, which is substantially different from a lot of systems I see described in papers.

visarga · on Nov 28, 2020

Never saw a paper lauded for its long length. Maybe the authors would have written shorter papers but they didn't have enough time.

bjornsing · on Nov 28, 2020

No? Turing 1936 is well known as a lengthy paper.

(Just kidding.)

justarandomq · on Nov 28, 2020

You could link other TLDR tools and then we could read the paper to learn how they're different.

pixiemaster · on Nov 28, 2020

English only, as usual for NLG

qayxc · on Nov 28, 2020

In this particular case it's excusable as English is the Lingua Franca of science today. Used to be Latin, then French and German, now it's English. No big deal IMO.

In fact I kind of like the way this is going since it represents a fantastic opportunity for NL researchers to stand out simply by publishing research and corpora focused exclusively on low-resource languages and non-English/Mandarin in general.

It is also important to note that most of the ML research in the field is pretty much language agnostic and is concerned with general concepts such as efficient en-/decoding [1], training methods [2], and even stealing pre-trained weights from APIs (like GPT-2 or even 3) without paying for training [3] :)

It's just easier to get your hands on and verify English corpora, results and pre-trained models for reproducibility than say Mongolian or Gaelic so that's a factor, too.

[1] https://arxiv.org/pdf/1904.09751.pdf

[2] https://arxiv.org/pdf/2003.10555.pdf

[3] https://arxiv.org/pdf/1910.12366.pdf

keyle · on Nov 28, 2020

Not to be mistaken with tldr, the command line utility that makes man pages readable and fun ;)

DrNuke · on Nov 27, 2020

I am afraid we will still need humans actually going through papers to get nuances, original contributions, if any, and wider narratives? (plug: the content project in my profile)