Generative AI has inspired an explosion of interest. Machine learning was long a field that required advanced knowledge of mathematics and computer science, but tools such as GPT, Midjourney, and Stable Diffusion have opened AI to a broad audience by providing intuitive natural-language interfaces.
The power of these models conceals a danger. They can generate text and images that appear comparable in quality to the work of a human artist or scholar, but that generated work obscures the provenance of the data used to train the models. The models generate output from statistical patterns learned over huge amounts of data, with no attribution to the individual images and texts in the training sets.
The RB and LPX Foundation would like to make generative AI a tool for responsible scholars. Take a step with us and help train models that automatically select plausible citations for expository text. We will provide a dataset drawn from Wikipedia articles, together with candidate citations and the abstracts of the articles to be cited.
The challenge is to devise a model that measures the plausibility of a citation. Teams will be ranked by the accuracy of their models and the reproducibility of their research.
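To make the task concrete, here is a minimal sketch of a naive lexical baseline, not part of the official challenge materials: it scores plausibility as the bag-of-words cosine similarity between a passage and each candidate abstract. All function names here are hypothetical, and a serious entry would use a learned model rather than raw word overlap.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split text into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine_similarity(a, b):
    """Cosine similarity between the bag-of-words vectors of two strings."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in set(ca) & set(cb))
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_citations(passage, abstracts):
    """Return (score, index) pairs for candidate abstracts,
    highest plausibility score first."""
    scored = [(cosine_similarity(passage, abstract), i)
              for i, abstract in enumerate(abstracts)]
    return sorted(scored, reverse=True)

# Toy example: the abstract sharing vocabulary with the passage ranks first.
passage = "The model was trained on a large corpus of Wikipedia text."
abstracts = [
    "We train language models on Wikipedia text corpora.",
    "A study of bird migration patterns.",
]
ranking = rank_citations(passage, abstracts)
```

A baseline like this also illustrates the reproducibility criterion: it is deterministic, dependency-free, and trivially rerunnable, properties worth preserving even in far more sophisticated entries.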