Planet Code4Lib

Time: It doesn’t have to be this way / Meredith Farkas

Three pocket watches

“What we think time is, how we think it is shaped, affects how we are able to move through it.”

–Jenny Odell, Saving Time, p. 270

This is the first of a series of essays I’ve written on time. Here are the others (they will be linked as they become available on Information Wants to be Free):

What I love about reading Jenny Odell’s work is that I often end up with a list of about a dozen other authors I want to look into after I finish her book. She brings such diverse thinkers beautifully into conversation in her work, along with her own keen insights and observations. One mention that particularly interested me in Odell’s book Saving Time (2023) was What Can a Body Do? (2020) by Sara Hendren. Her book is about how the design of the world around us impacts us, particularly those of us who don’t fit into the narrow band of what is considered “normal,” and how we can build a better world that goes beyond accommodation. Her book begins with the question “Who is the built world built for?” and with a quote from Albert Camus: “But one day the ‘why’ arises, and everything begins in that weariness tinged with amazement” (1).

“Why” is such a simple word, but asking it can completely alter the way we see the world. There’s so much in our world that we simply take for granted or assume is the only way because some ideology (like neoliberalism) has so deeply limited the scope of our imagination. Most of what exists in our world is based on some sort of ideological bias, and when we ask “why,” we crack the world open and allow in other possibilities. Before I read the book Invisible Women (2021) by Caroline Criado Perez, I already knew that there was a bias towards men in research and data collection, as in most things, but I didn’t realize the extent to which the world was designed as if men were the only people who inhabited it, and how dangerous and harmful that makes the world for women. What Can a Body Do? similarly begins with an exploration of the construction of “normal” and how design based on that imagined normal person can exclude and harm people who aren’t considered normal, particularly those with disabilities. The book is a wonderful companion to Invisible Women in looking at why the world is designed the way it is and how it impacts those for whom it clearly was not built. I’ll explore that more in a later essay in this series.

One thing I took for granted for a very long time was time itself. I thought of time in terms of clocks and calendars, not the rhythms of my body or the seasons (unless you count the start and end of each academic term as a season). I believed that time was scarce, that we were meant to use it to do valuable things, and that anything less was a waste of our precious time. I would beat myself up when, over spring break, I didn’t get enough practical home or scholarship projects done, or if I didn’t knock everything off my to-do list at the end of a work week. I would feel angry and frustrated with myself when my bodily needs got in the way of getting things done (I’m writing this with ice on both knees due to a totally random flare of tendinitis when I’d planned to do a major house cleaning today, so I’m really glad I don’t fall into that “shooting myself with the second arrow” trap as much as I used to). I looked for ways to use my time more efficiently. I am embarrassed to admit that I owned a copy of David Allen’s Getting Things Done and tried a variety of different time management methods over the years that colleagues and friends recommended (though nothing ever stuck besides a boring, traditional running to-do list). I’d often let work bleed into home time so I could wrap up a project, because not finishing it would weigh on my mind. I was always dogged by the idea that I wasn’t getting enough done and that I could be doing things more efficiently. It felt like there was never enough time, all the time.

Black and white photo of a man hanging from a clock atop a building. From Harold Lloyd’s Safety Last (1923)

I didn’t start asking questions about time until I was 40, and the first one I asked was a big one: “What is the point of our lives?” Thinking about that opened a whole world of other questions about how we conceive of time, what kinds of time we value, to what end we are constantly trying to optimize ourselves, what is considered productive vs. unproductive time, why we often value work time over personal time (if not in word then in deed), why time often requires disembodiment, and so on. The questions tumbled out of me, one toppling into the next like dominoes. And with each question, I could see more and more that the possibility exists to have a different, better relationship with time. I feel Camus’ “weariness tinged with amazement.”

This is an introduction to a series of essays about time: how we conceive of it, how it drives our actions, perceptions, and feelings, and how we might approach time differently. I’ll be pulling ideas for alternative views of time from a few different areas, particularly queer theory, disability studies, and the slow movement. I’m not an expert in all these areas, but I’ll be sure to point you to people more knowledgeable than me if you want to explore these ideas in more depth.

How many of you feel overloaded with work? Like you’re not getting enough done? How many of you are experiencing time poverty: where your to-do list is longer than the time you have to do your work? How many of you feel constantly distracted and/or forced to frequently task-switch in order to be seen as a good employee? How many of you feel like you’re expected to do or be expert in more than ever in your role? How many of you feel like it’s your fault when you struggle to keep up? More of us are experiencing burnout than ever before and yet we keep going down this road of time acceleration, constant growth, and continuous availability that is causing us real harm. People on the whole are not working that many more hours than they used to, but we are experiencing time poverty and time compression like never before, and that feeling bleeds into every other area of our lives. If you want to read more about how this is impacting library workers, I’ll have a few article recommendations at the end of this essay.

My exploration is driven largely by this statement from sociologist Judy Wajcman’s (2014) excellent book Pressed for Time: “How we use our time is fundamentally affected by the temporal parameters of work. Yet there is nothing natural or inevitable about the way we work” (166). We have fallen into the trap of believing that the way we work now is the only way we can work. We have fallen into the trap of centering work temporality in our lives. And we help cement this as the only possible reality every time we choose to go along with temporal norms that are causing us harm. In my next essay, I’m going to explore how time became centered around work and how problematic it is that we never have a definition of what it would look like to be doing enough. From there, I’m going to look at alternative views of time that might open up possibilities for changing what time is centered around and for seeing our time as more embodied and more interdependent. My ideas are not the be-all and end-all, and I’m sure there are thinkers and theories I’ve not yet encountered that would open up the possibilities for new relationships with time even more. To that end, I’d love to get your thoughts on these topics, your reading recommendations, and your ideas for possible alternative futures in how we conceive of and use time.

Works on Time in Libraries

Bossaller, Jenny, Christopher Sean Burns, and Amy VanScoy. “Re-conceiving time in reference and information services work: a qualitative secondary analysis.” Journal of Documentation 73, no. 1 (2017): 2-17.

Brons, Adena, Chloe Riley, Ean Henninger, and Crystal Yin. “Precarity Doesn’t Care: Precarious Employment as a Dysfunctional Practice in Libraries.” (2022).

Drabinski, Emily. “A kairos of the critical: Teaching critically in a time of compliance.” Communications in Information Literacy 11, no. 1 (2017): 2.

Kendrick, Kaetrena Davis. “The public librarian low-morale experience: A qualitative study.” Partnership 15, no. 2 (2020): 1-32.

Kendrick, Kaetrena Davis and Ione T. Damasco. “Low morale in ethnic and racial minority academic librarians: An experiential study.” Library Trends 68, no. 2 (2019): 174-212.

Lennertz, Lora L. and Phillip J. Jones. “A question of time: Sociotemporality in academic libraries.” College & Research Libraries 81, no. 4 (2020): 701.

McKenzie, Pamela J., and Elisabeth Davies. “Documenting multiple temporalities.” Journal of Documentation 78, no. 1 (2022): 38-59.

Mitchell, Carmen, Lauren Magnuson, and Holly Hampton. “Please Scream Inside Your Heart: How a Global Pandemic Affected Burnout in an Academic Library.” Journal of Radical Librarianship 9 (2023): 159-179.

Nicholson, Karen P. ““Being in Time”: New Public Management, Academic Librarians, and the Temporal Labor of Pink-Collar Public Service Work.” Library Trends 68, no. 2 (2019): 130-152.

Nicholson, Karen. “On the space/time of information literacy, higher education, and the global knowledge economy.” Journal of Critical Library and Information Studies 2, no. 1 (2019).

Nicholson, Karen P. ““Taking back” information literacy: Time and the one-shot in the neoliberal university.” In Critical library pedagogy handbook (vol. 1), ed. Nicole Pagowsky and Kelly McElroy (Chicago: ACRL, 2016), 25-39.

Awesome Works on Time Cited Here

Hendren, Sara. What Can a Body Do?: How We Meet the Built World. Penguin, 2020.

Odell, Jenny. Saving Time: Discovering a Life Beyond Productivity Culture. Random House, 2023.

Wajcman, Judy. Pressed for Time: The Acceleration of Life in Digital Capitalism. University of Chicago Press, 2020.

Slow productivity is a team sport: A critique of Cal Newport’s Slow Productivity / Meredith Farkas

Impressionist painting of four people in flowing clothes resting on the bank of a river

Image credit: Dolce far Niente by John Singer Sargent 

This is the fourth in a series of essays I’ve written on time. You can view a list of all of them on the first essay.

This was going to be a somewhat different essay before I read Cal Newport’s Slow Productivity. I read the book the day it came out, interested in seeing how he incorporated the ideas from slow movements into the world of productivity, since in so many ways, productivity is the enemy of slowness. Given what I’d read of his work in the New Yorker, I was skeptical that he would really embrace slowness in his book and I discovered my skepticism was more than justified. I’m going to start by critiquing Newport’s book, but then get into my own vision for what it might take to achieve slow productivity.

In late 2021, Cal Newport began writing about “slow productivity,” largely in response to a tidal wave of published books that questioned our society’s focus on productivity (for productivity pundits, the answer is always productivity). He saw the goal of slow productivity as “keep[ing] an individual worker’s volume at a sustainable level” and argued that this would not have a negative impact on organizational productivity because less overloaded workers would be less focused on managing a glut of information. He envisioned systems that would track people’s work and assign new tasks based on when the people with the needed skills have time available. In a world full of unique individuals whose capacities vary day by day, and where most tasks are far from mechanistic, I question whether this is possible. Tack on the fact that we have people working at varying levels of precarity, plus reward systems that incentivize overwork, and we’re always going to have some people who feel the need to do significantly more to prove themselves. Creating systems that don’t change the underlying realities and inequities in the world of work will not adequately address the issue of overwork and overwhelm.

Strangely, though, his book has no suggestions for how slow productivity could be achieved at the systems level. It’s so individual-focused that he suggests taking on only projects that don’t require meetings with others (what he calls the “overhead tax” on projects). The idea that meetings with others could make us better at our jobs doesn’t seem to occur to him. His understanding of slow proves to be surface-level at best. The slow movement isn’t just about individuals choosing to step away from fast culture; it’s about changing the culture so that everyone can slow down. Otherwise it just becomes an elitist enterprise where only those with the most privilege can actually access the benefits of slow living.

Mountz et al. (2015) wrote about slow scholarship, arguing that it “is not just about time, but about structures of power and inequality. This means that slow scholarship cannot just be about making individual lives better, but must also be about re-making the university” (1238). Slow Food advocate Folco Portinari (the author of the Slow Food manifesto, though I rarely see him credited) wrote “there can be no slow-food without slow-life, meaning that we cannot influence food culture without changing our culture as a whole.” Slow Food isn’t just about buying local, and slow scholarship isn’t just about not buying into the productivity expectations of the academy. It’s about collectively working to change the systems themselves.

But, really, Cal Newport is not writing this book for most of us. He’s writing it for white, male (there are plenty of critiques of his previous work on the basis of sexism), affluent, lone geniuses who aren’t accountable to a boss. He waits until the end of the book to explicitly state that his advice is for academics and people who work for themselves, but when he offers advice like seeing a movie matinee on a weekday once a month, taking vacations of a month or more to gain perspective, cutting your salary, and only taking on projects that require no collaboration with others, we see how unrelatable this is to most knowledge workers.

I’ll bet he pulled himself up by his bootstraps!

All you need to know about Newport’s philosophy you can get from page 7 of the book:

Slow productivity [is] a philosophy for organizing knowledge work efforts in a sustainable and meaningful manner, based on the following three principles:

1. Do fewer things

2. Work at a natural pace

3. Obsess over quality

I agree that these are good goals, but his book won’t help you get there. The rest of the book is recycled productivity tips from his previous work (many of which won’t work unless you have total control over your work) punctuated by completely unrelatable stories of famous figures throughout history that don’t connect well to any sort of usable takeaway. I read his story of Jane Austen and how she was only able to really be productive in her writing when her brother inherited an estate, she went to live there, and the family decided not to participate in society anymore. So is the takeaway that I need no children, plenty of servants, and no social engagements to be productive? Cool cool cool.

I will never understand why we trust advice from people who have zero experience working the sorts of jobs we have. It would be one thing if his work were research-based, but it isn’t. Early in the book, he writes as if no one really understands why people are suddenly so exhausted and burned out by work, but there’s ample research in the sociology, anthropology, business, and psychology literature that addresses this. I know because I’ve read a lot of it! And if we’re trusting his experience, what does a person who went from Ivy League undergraduate work, to graduate work at MIT, to a post-doc, to a tenure-line position at Georgetown in computer science really know about what it’s like to work in a typical knowledge organization with a manager and peers who rely on them? I am in a massively privileged position where I have tenure and summers off, and even I found very little that I could apply to my own work. As an instruction librarian, I teach students to look into the author of something they are going to rely on and determine if and why they would trust that particular author’s expertise on that subject. Maybe we should do the same?

If you’re looking for really brilliant and well-researched work relevant to slow productivity, check out Melissa Gregg’s Counterproductive, both of Jenny Odell’s books, Oliver Burkeman’s Four Thousand Weeks, Carl Honoré’s book on the slow movement, and Wendy Parkins and Geoffrey Craig’s Slow Living. They will not offer you concrete tips for being more productive, but, really, there’s no magical list of tips that will work for everyone. They will open your mind to what’s wrong with how we’ve been working and what is possible if we came together to collectively fight for change.

In my next post, I’ll share my own vision of what slow productivity looks like (I decided to break this up into two posts because it was getting a bit long). My tips for slow productivity are quite different from Newport’s in that they’re much more focused on our collectivity. He was right in his piece on “The Rise and Fall of Getting Things Done” that productivity advice is broken because it is not changing things at the level of the system (though he then produced another book focused on individual productivity, go figure). In organizations, we are often dependent on one another to complete our work. We are also held to the collective norms of the organization around productivity and performing busyness. Therefore, slow productivity must be a team sport. 

See you again in a couple of weeks!!!

Burkeman, Oliver. Four Thousand Weeks: Time Management for Mortals. New York: Picador, 2023.

Gregg, Melissa. Counterproductive: Time Management in the Knowledge Economy. Durham, NC: Duke University Press, 2018.

Honoré, Carl. In Praise of Slow: How a Worldwide Movement Is Challenging the Cult of Speed. Vintage Canada, 2009.

Mountz, Alison, Anne Bonds, Becky Mansfield, Jenna Loyd, Jennifer Hyndman, Margaret Walton-Roberts, Ranu Basu et al. “For slow scholarship: A feminist politics of resistance through collective action in the neoliberal university.” ACME: An International Journal for Critical Geographies 14, no. 4 (2015): 1235-1259.

Newport, Cal. “The Rise and Fall of Getting Things Done.” The New Yorker, 17 Nov. 2020.

Newport, Cal. “It’s Time to Embrace Slow Productivity.” The New Yorker, 3 Jan. 2022.

Newport, Cal. Slow Productivity: The Lost Art of Accomplishment Without Burnout. New York: Portfolio/Penguin, 2024.

Odell, Jenny. How to Do Nothing: Resisting the Attention Economy. Brooklyn, NY: Melville House, 2019.

Odell, Jenny. Saving Time: Discovering a Life Beyond Productivity Culture. Random House, 2023.

Parkins, Wendy, and Geoffrey Craig. Slow Living. Oxford: Berg, 2006.

Petrini, Carlo. Slow Food: The Case for Taste. Columbia University Press, 2003.

Quilting together at OCLC / HangingTogether

If you’re attending the American Library Association annual conference in San Diego later this month, watch out for the colorful display of quilts. Each year the ALA Biblioquilters hosts a silent auction of quilts as a fundraiser for the Christopher Hoy/ERT Scholarship Fund, which awards a $5,000 scholarship each year to an MLIS student. 

There you will find a colorful color-wash-style quilt, composed of 480 blocks and more than 2,500 pieces, entitled “Quilting Together,” donated by the OCLC Quilters. The quilt was designed, pieced, assembled, and financially supported by a team of twelve current and retired OCLC employees from across the organization. Each quilter dug into their own fabric stash to make the 3” blocks, which were then assembled into this colorful and unique creation.

Four people holding up a colorful patchwork quilt. The “Quilting Together” quilt, displayed by four of the twelve OCLC quilters

This isn’t the first offering by the OCLC Quilters. Last year we created a scrappy cat-themed quilt called “World of cats,” obviously inspired by WorldCat. It raised $775 to support the scholarship.

Photo of quilt with cloth composed of numbers. The data-inspired quilt backing

This year’s quilt is also inspired by WorldCat. We’ve borrowed the title of this quilt, “Quilting Together,” from a title in WorldCat, Quilting together: how to organize, design, and make group quilts, which is held by more than 200 libraries worldwide. And, like the record for this book, and all of WorldCat, this quilt is backed by data. Take a look at the numerically themed backing fabric!

Making a group quilt requires special considerations. For example, to be inclusive, the block chosen should be simple enough to accommodate sewists with a broad range of skills. Furthermore, a scrappy style allows participants to use leftover scraps from their own fabric “stash” without having to purchase materials. An ample supply of scraps was donated by experienced quilters for anyone in need of supplies. Finally, as with other collaborations, it’s critical to recognize that not everyone has to contribute in the same way. Some employees were engaged at every stage of the quilt-making process, others contributed by making blocks, and still others donated money for the professional longarm quilting.

An image of the OCLC quilt that shows how the OCLC logo has been incorporated. An OCLC logo is incorporated into the quilt

A cataloging colleague pointed out to me that group quilting seems to have parallels to cataloging in WorldCat, as each contributor is part of a larger community that collectively enhances the object. And quilts, just like bibliographic records, are made up of many components with their own terms: blocks, backing, binding, and much more. I like that.

I hope you’ll not only stop by the quilt auction in San Diego, but that you’ll also get out your wallet to bid on it. You’ll get a one-of-a-kind item made by a group committed to the vision of collaboration in libraries. And quilting.


Not Business as Usual: Incorporating LIS Student Perspectives in the Apprenticeship Hiring Process / In the Library, With the Lead Pipe

In Brief

While a Master’s in Library and Information Science (MLIS) degree is typically necessary to become an academic librarian, practical experiences such as internships, practicums, and apprenticeships are essential in gaining employment post-graduation. Providing paid opportunities where LIS students participate in and contribute to meaningful mentorship, training, and work experience is critical to improving inclusion in academic libraries. This article reflects on the experiences of student employees in the University of Colorado (CU) Boulder University Libraries’ Ask a Librarian Apprenticeship, who collaborated with the apprenticeship supervisor to purposefully reassess the hiring process for incoming apprentices. This article demonstrates how including student employees as active participants in the hiring process not only offers a valuable experiential learning opportunity, but also shifts power dynamics from a sole hiring manager to a team that includes student employees, creating a better hiring process for student applicants.

By: Estefania Eiquihua, Karen Adjei, Janelle Lyons, and Megan E. Welsh


Although the Master of Library and Information Science (MLIS) degree is required (in many cases) in order to be a professional librarian, a degree alone is not sufficient for library and information science (LIS) graduates when they enter the job market. Hands-on experience through internships, practicums, and apprenticeships allows students to put coursework into practice and prepare for the post-graduation job search by gaining a sense of what librarianship looks like. As these work experiences have historically been unpaid, it is crucial that libraries begin and continue to offer paid opportunities so that LIS students are not forced to pay for credits toward their degree or contribute free labor to an organization in exchange for practical experience. The challenge of finding worthwhile professional experience, which may or may not be paid, is especially poignant for emerging library professionals who identify with a historically marginalized group that has traditionally been excluded from librarianship.

Providing paid opportunities for emerging library professionals is one way to promote inclusion. However, libraries can further facilitate an environment of inclusion by actively involving their current student employees in the hiring process of such paid opportunities. When student employees are actively and purposefully involved in the hiring process – through crafting the job ad, developing evaluation criteria, and interviewing candidates – it benefits the library, the current employee, and future applicants. By intentionally including student employee experiences in hiring practices, professional development opportunities aimed to support emerging library professionals become more accessible. At the University of Colorado (CU) Boulder University Libraries, we experienced the power of involving student employees in the hiring process firsthand by embedding current graduate student apprentices throughout all stages of the hiring process as we recruited a new apprentice. Current student employees were able to gain valuable experience in hiring, candidates experienced a more transparent application and interview process, and the hiring supervisor received valuable insights into how best to implement more inclusive student employee hiring practices to benefit future iterations of the apprenticeship program. 

This article demonstrates how including student employees as active participants in the hiring process not only offers a meaningful experiential learning opportunity for apprentices, but also shifts power dynamics from a sole hiring manager to a team that includes student employees. This article contextualizes these experiences by reviewing the literature on meaningful professional development opportunities for LIS students as well as literature about hiring processes in academic libraries. Our overall intention is to highlight how including current apprentices in iterations of the hiring process creates a better experience for applicants. The practices laid out in this article would be of particular interest to any library hiring supervisor interested in challenging the status quo, providing a rewarding professional development opportunity for student employees, and recruiting a more diverse population of student employees through thoughtful hiring practices.

Literature Review 

Much has been published on the value of providing LIS students with practical experiences through mentorship programs, internships, and practicums. Most literature in support of practical experiences for LIS students argues that an LIS curriculum alone does not provide students the on-the-job training that seems to be expected in the field. Lacy & Copeland (2013) note that while all LIS programs place value on practical experiences, in many cases students are not required to participate in internships or practicums in order to graduate (unless they are concentrating on school librarianship, for example). The authors emphasize the importance of mentorship programs that offer opportunities for LIS students to network, experience day-to-day work life and job expectations, and enhance job-seeking skills. A study by Goodsett & Koziura (2016) asks what can be done to improve LIS education for new librarians. They surveyed over 575 LIS graduates to gain insight into the perceived effectiveness of their LIS education. While respondents undoubtedly found value in their LIS education, most reported that their LIS curriculum emphasized theoretical knowledge. An overwhelming number of respondents reported that practical experiences such as work experience, internships, and practicums were essential in gaining employment post-graduation.

The need for LIS students to supplement their graduate curriculum sheds light on how important it is for libraries to provide meaningful practical experiences so that the next generation of information professionals is well prepared to intentionally maintain and improve the field of librarianship. Lewey & Moody-Goo (2018) suggest that the ideal internship is mindfully designed and should be “transformative and empowering” for the LIS student. The authors emphasize that internships which are mindfully designed can “benefit all parties involved—intern, institution, library, librarians, and the LIS field as a whole” (p. 238). The authors advocate that “meaningful internships should have four key features: supportive mentorship, purposeful planning and training, simulation of an authentic professional position, and reflection and assessment” (p. 238). Wang et al. (2022) agree that access to meaningful internships is essential for post-graduate success. However, the authors argue that internships should also strive to become more equitable. The authors cite various barriers that hinder LIS students from participating in practical experiences, such as availability of opportunities, location, lack of time, and finances. Another barrier mentioned is the expectation that students “volunteer” for experiences or complete credit-bearing practicums for which they have to pay tuition. The authors are critical of the “superficial professionalization” of librarianship and recommend that libraries work toward supporting LIS students and recent graduates by funding internships and practicums. They also recommend offering interns competitive pay and offering remote or hybrid work to help alleviate the financial or geographic burden of trying to gain practical experience.

Wildenhaus (2019) emphasizes the critical importance of denormalizing unpaid positions in LIS. She notes that the message presented to many LIS students and new librarians is that “the cost of entry to a career in libraries and archives is a willingness—and ability—to work for free” (p. 2). Wildenhaus states, “the prevalence of unpaid internships may negatively impact efforts for diversity and inclusion among information workers while contributing to greater precarity of labor throughout the workforce” (p. 1). Unpaid labor is an additional barrier to Black, Indigenous, and persons of color (BIPOC) seeking practical experience, as Galvan (2015) points out: “only students with access to money can afford to take an unpaid internship… insuring [sic] the pool of well-qualified academic librarians skews white and middle class” (para. 31). Holler (2020) furthers this notion by highlighting that “only certain sorts of people can afford to work for free: people who are wealthy; people with spouses or partners who can provide for them; people who have the luxury of living with families or guardians; people who are unburdened by care work and its economies; people without outstanding medical bills or student debt; and, overwhelmingly: people who are white” (para. 40). Holler (2020) rejects the notion that unpaid or underpaid labor should be normalized and advocates for an “equity budgeting model” in which the culture of paying dues is denounced and institutions commit to paying all workers, especially students who are trying to gain practical experience in community-based cultural work sectors. Holler (2020) explains that the equity budgeting model is rooted in the desire to “[repair] the damage of a fundamentally extractive nonprofit-industrial complex and cultural work sector, which has survived on the systemic underpayment (or non-payment) of community members of color and freelance cultural workers alike — resulting in a cultural work economy in which independently wealthy, white, or salaried practitioners hold unfair and unequal sway” (para. 3).

There is a significant gap in the literature detailing the perspectives of BIPOC LIS students and new librarians on their experiences with unpaid labor. The lack of literature on the topic may be due to the vulnerable position in which BIPOC LIS students and new librarians find themselves: trying to break into the profession while entrenched in a culture that insists on “paying your dues” in order to gain professional experience. Insight into their experiences would provide essential knowledge to challenge the status quo in hopes of denormalizing the prevalence of unpaid labor in LIS. 

Furthermore, while we were not able to find literature that specifically discussed the experiences of LIS students involved in the hiring process, a growing body of literature has emphasized the importance of inclusive hiring practices as a way to reduce barriers that hinder recruitment efforts (Cunningham et al., 2019; Galvan, 2015; Harper, 2020; Houk & Nielsen, 2023; Shah & Fife, 2023). Shah & Fife (2023) state, “the recruitment/hiring/retention life cycle for BIPOC job candidates for academic and research libraries is fraught with bureaucracy and layers of communication that deter the very DEAI concepts that they aim to practice” (para. 2). The authors emphasize that complex job descriptions and complicated application processes hinder recruitment efforts; instead, libraries should “focus on the humanity of the candidates” (para. 16) and work toward dismantling barriers by providing honest and concise job descriptions. 

Houk & Nielsen (2023) further this argument for person-centered hiring practices by advocating that every aspect of the recruitment process be critically examined. Specifically, the authors critically examine interviews and emphasize “the need for intentionality in creating environments where candidates, particularly candidates from marginalized communities, feel welcome and set up for success during their interviews” (Discussion section, para. 1). In their research, the authors found that the idea of “the interview as a test” was common. This manifested in explicit testing of skills through presentations or interview questions, in hidden testing through observations of a candidate’s behavior, or in perceived “fit.” Hiring based on “the interview as a test” and “fit” is problematic in the context of a profession that has been historically predominantly white: according to the American Library Association’s (ALA) 2012 Diversity Counts survey, nearly 88% of professional librarians identified as white. Cunningham et al. (2019) emphasize that “fit” is often “undefinable, intangible, and thus allows for libraries to stay within their comfort zones and replicate the status quo” (p. 17). 

Furthermore, while interviews are an integral part of determining whether a candidate is a good match for a position, Houk & Nielsen (2023) argue that libraries should reexamine how they evaluate candidates and ensure they are making intentional efforts to reduce bias in their hiring criteria. They suggest intentional actions such as providing candidates with interview questions in advance and offering accommodations so that candidates are comfortable and more confident during the interview process. Establishing well-defined hiring criteria and qualifications also helps reduce bias. The work to improve the hiring practices for CU Boulder Libraries’ Ask a Librarian Apprenticeship through the inclusion of student apprentices directly addresses these suggestions from the literature and furthers the conversation by contributing a successful model of reducing professional development barriers in the LIS field.

Apprenticeship Context

University of Colorado (CU) Boulder is a large, R1, public university enrolling over 30,000 students. Five libraries on campus comprise the University Libraries system and support undergraduate and graduate students, faculty, staff, and the broader Boulder, Colorado community. The largest library on campus currently has a distinct reference desk (the Ask a Librarian Desk), and the University Libraries maintain a virtual chat service which we call “Ask A Librarian.” On most evenings and weekends during the academic year, our virtual chat service is staffed exclusively by LIS student employees. Since 2018, we have hired ten graduate students in library and information science as Ask A Librarian Apprentices at CU Boulder. The apprenticeship is a paid, practical experience which aims to build library school students’ skills in reference work by staffing evening and weekend chat shifts, while also supporting their interests as they engage in professional development, networking, and special projects ranging from building research guides to collection development to publishing and presenting. Unlike internships and practica, the apprenticeship is an intentionally scaffolded experience which provides LIS students with a holistic view of academic librarian responsibilities. It also lasts longer than a typical semester-long internship or practicum, usually for the duration of the apprentice’s LIS education (due to campus funding parameters, LIS students are no longer eligible to be apprentices after they graduate). 

In 2020, as the COVID-19 pandemic shifted the apprenticeship to a remote work opportunity, CU Boulder Libraries also intentionally viewed the apprenticeship as an opportunity to recruit LIS students of color to academic librarianship. Contextualized by the Black Lives Matter movement, the murders of Breonna Taylor and George Floyd, growing awareness of the historical injustices and predominance of whiteness in academic library settings, and training dedicated to recruiting and retaining librarians of color (see the excellent Library Juice Academy course “Recruiting and Retaining Librarians from Underrepresented Minoritized Groups”), CU Boulder Libraries accepted a proposal in summer 2021 to continue the remote modality of the apprenticeship and to explicitly welcome BIPOC students to apply. The apprenticeship is a valuable opportunity for students to gain practical skills as they look toward graduation and enter the job market. It has evolved over the years, especially given the pandemic, when apprentices transitioned from staffing our physical reference desk in person to staffing our virtual chat service. Apprentice project work over the past four years has included increased participation in the hiring process for incoming apprentices. 

Initially, in 2018 and 2019, the hiring process involved Megan, as the apprenticeship supervisor and hiring manager, developing and posting a job ad, reviewing applications, scheduling interviews, and making the final hiring decision; sometimes her colleague who managed the reference desk joined the interviews. The hiring process has since evolved to be entirely virtual, matching the modality in which the apprenticeship is currently offered, and now includes current apprentices. The extent of apprentice participation in the hiring process has grown over the past four years. In 2020, apprentices began to sit in on interviews. We moved from incorporating a staff colleague as a companion interviewer to involving current apprentices, both because that staff colleague’s responsibilities had changed and the role experienced turnover, and also as a way for graduate student applicants to hear directly about the experiences of current apprentices. The chance for current apprentices to articulate their unique perspectives and to be transparent about what the job actually looks like is valuable for them and for applicants. Apprentices are able to describe everything from the questions they receive over chat, to the project work they engage in, to what it’s like to work with Megan as a mentor and supervisor. These are questions that Megan cannot answer in the same way, or in nearly as meaningful a way, as our current apprentices.

Currently, CU Boulder Ask a Librarian Apprentices participate in the hiring process by:

  • Reviewing and revising the job ad in collaboration with Megan. This helps to capture, in real-time, what apprentices have experienced throughout the entire hiring and employment process. They are able to bring their experiences into all stages of the hiring process to ensure that it benefits future apprentices. CU Boulder apprentice involvement in hiring creates continuity of feedback, revision, learning, and application of inclusive practices for everyone throughout the hiring process so that apprentices and the supervisor can learn from each other and improve approaches to hiring and onboarding,
  • Helping to recruit by advertising through listservs, library school forums, on social media (e.g., the We Here Facebook group, a space exclusively for BIPOC library school students and library professionals), and through word of mouth with peers at conferences and individually. These recruiting efforts highlight how apprentices create and leverage their networks within the LIS field to positively contribute to the hiring process. Advertising through these networks expands the reach of the job posting and knowledge of CU Boulder as a site that supports LIS student labor. It also represents the various social networks that current LIS students are a part of, especially ones which the hiring manager may not be aware of, have access to, or be welcome to participate in,
  • Reviewing, discussing, and suggesting revisions to hiring documentation. This documentation includes a rubric used to rank application materials, a list of interview questions, and a rubric used to rank interviewees,
  • Reviewing applications and ranking them to help prioritize who we should invite to the interview stage, 
  • Participating in the interview process by asking interview questions and answering candidates’ questions about their experience in the apprenticeship, and  
  • Ranking interviewees to help inform a final hiring decision.

Including apprentices in the interview stage of the hiring process can provide clarity for potential apprentices about the day-to-day work of the apprenticeship and tasks listed in the job ad, addressing questions and alleviating confusion that applicants may have. In this way, current apprentices help to reduce barriers for student applicants throughout the hiring process. Yet, beyond including current apprentices as key participants in such a visible aspect of the hiring process as the interview, much of the evolution of our hiring has involved apprentices helping to create and refine hiring documentation. This documentation helps to standardize the hiring process, enhance clarity of the job and applicant requirements, and decrease bias in the application and interview evaluation by ensuring that multiple perspectives are represented throughout. Increasing apprentice engagement in all elements of hiring helps Megan to evaluate applicants with perspectives other than her own, and it gives current apprentices the opportunity to learn about the hiring process more as a hiring authority rather than as the applicant they once were. 

Apprentice perspectives

Job ad development

Estefania and Karen were both excited to participate in reviewing the hiring materials and criteria for the incoming apprentice. They were eager to participate because they wanted to gain experience on the other side of the hiring process while also improving it for the next round of applicants. To revise the hiring materials, Megan and the apprentices reviewed the materials that were used when Karen and Estefania applied to the apprenticeship. Both apprentices relied on their memories and past experiences as interviewees to inform the changes they wanted to see made to the hiring materials. Each considered what incoming applicants might perceive as a barrier, with the intention of making the hiring process more inclusive for the next round of applicants. Both also reflected on their prior experiences applying for the apprenticeship and considered what specific wording from the job ad had appealed to them, what made the apprenticeship an attractive opportunity, and what revisions should be made to ensure the hiring materials were concise, transparent, and reduced bias. 

While reflecting on the job ad (see Appendix A), both apprentices had helpful suggestions for tweaking the original language to more accurately reflect the apprenticeship. For example, Karen suggested changing the language in which the apprenticeship was originally described as a “fast-paced” environment. Karen admits that she initially shied away from applying to the CU Boulder apprenticeship due to this description because she had previous work experience in a “fast-paced” environment and had mixed feelings about entering into a similar workplace. She encouraged changing the language because oftentimes “fast-paced” could be code for work environments that require a lot of responsibility with tight deadlines and no support. Karen also recalled that during Atla Annual 2022, she had attended a workshop co-hosted by Megan entitled, “Navigating in the Fog: Shining a Light on the Library Job Search Process” (Welsh & Knievel, 2022). From the hiring workshop, she was able to learn about common wording that deters women of color applicants in particular, which helped her to identify specifically why she did not apply to the apprenticeship position in the first place. The group decided to omit the “fast-paced” language and instead highlighted that the apprenticeship values practical professional experience alongside receiving mentorship from faculty librarians. We specifically changed the verbiage in the job ad to emphasize the exploratory nature of the apprenticeship experience in allowing emerging library professionals to contribute to and build their interests in the field of academic librarianship.

Estefania also reflected on the specific wording that made her excited about the opportunity and considered how the language in the job ad could enhance the transparency of the responsibilities and make the position more appealing, especially for BIPOC LIS students. For example, the original job ad stated, “A core goal of the apprenticeship program is to invite and encourage involvement of MLIS students from traditionally underrepresented groups in academic librarianship.” When Estefania originally applied for the position, she appreciated that this statement was included and recommended that for Fall 2023 the job ad include the addition of “BIPOC (Black, Indigenous, and People of Color) MLIS students are highly encouraged to apply.” While a small gesture, the additional language is important to advertise that this program is intentionally recruiting BIPOC and people from underrepresented groups. Estefania shared that when institutions add this verbiage, she feels more empowered to apply. 

Recruitment strategies

While Megan maintains a list of library schools to share the job ad with, and colleagues in CU Boulder Libraries’ HR share the posting to the Libraries’ website, a job board, and a platform called Handshake, apprentice involvement in promoting the apprenticeship was crucial during the recruitment phase. Karen intentionally shared the job ad with as many groups and networks as she was connected to in order to cast the net far and wide. This strategy ensured that LIS students would see the ad across many platforms and would have a better chance of encountering this opportunity. Leveraging and contributing to social networks is especially important in virtual modalities of professional and academic spaces, where in-person connection and the subsequent exchange of information need to be deliberate and intentional in order to be effective.

Places where we shared the job ad include the following: 

  • University of Maryland (UMD) MLIS Student listserv,
  • UMD MLIS Student discord channel, 
  • Association of Research Libraries (ARL) Diversity Programs Alumni,
  • Asian Pacific American Librarians Association (APALA),
  • National Association to Promote Library and Information Services to Latinos and the Spanish Speaking (REFORMA),
  • We Here Facebook Group,
  • Atla Listserv, a listserv for Theological and Religious Studies librarians,
  • Karen shared the job ad with a former supervisor and a mentor so they could forward this opportunity on to others who may be interested or who might have other networks they could spread the word through. Karen also shared with a peer whom she met at the California Library Association (CLA) conference, and
  • Estefania shared the job ad with iSchool Students of Color, a group which she was part of at the University of Illinois Urbana-Champaign.

In total, we received over sixty applications in the Fall 2023 hiring cycle, similar to previous hiring cycles since moving the apprenticeship to a remote opportunity in 2020.

Reviewing and ranking candidates’ application materials

The rubric for evaluating applications was another aspect of the hiring materials that we collaboratively decided to change (see Appendix B). We deconstructed the job ad and applied a numbered scale to help us determine which candidates addressed the qualifications highlighted in the ad. The scale ranged from 0, representing that the criterion was not addressed in the applicant’s CV/résumé or cover letter or indicating ineligibility for the apprenticeship, to 2, representing that the applicant fully addressed the criterion and met eligibility requirements for the apprenticeship. While this numbering system helped us keep track of who excelled in crafting their application materials, we decided to allow space for evaluator comments in order to balance the quantitative and the qualitative in holistically considering which applicants should progress to the final interview stage. We also decided to change the language of the following criterion: “[Applicant] discussed interest in pursuing a career in academic librarianship.” Instead of using the word pursuing, we decided to use exploring. Karen advocated for this subtle change because she emphasized that LIS students may still be unsure about committing to academic librarianship. Rather, they would benefit from the opportunity to explore what it is like to work in an academic library without the added pressure of being sure about the career path as a qualifier for being chosen to interview for the apprenticeship. 

With over sixty applications to sort through, using the updated application rubric aided in the standardization of reviewing and ranking candidates. The numbered rating system helped us to generate a finalist list to invite for interviews in an efficient manner so that we did not prolong the hiring process. This efficiency and our concern for “closing the communication loop” in a timely manner meant that we could respond to applicants and provide constructive feedback and resources if they were not progressing through the hiring process. We thought that this clear, thoughtful, and quick communication with all applicants, regardless of acceptance or rejection throughout each step, would be another way for us to respect their time, energy, and effort while also providing guidance and resources that would help to further their careers.
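To make the arithmetic behind this ranking step concrete, here is a minimal sketch in Python. The criterion names, scores, and the `shortlist` function are invented for illustration and are not drawn from our actual rubric; the point is only how per-criterion 0–2 scores can be totaled into a finalist list while the qualitative comments travel alongside the numbers for human review:

```python
from dataclasses import dataclass, field

@dataclass
class Application:
    """One applicant's rubric evaluation (names and criteria are illustrative)."""
    name: str
    scores: dict[str, int]  # criterion -> 0 (not addressed) .. 2 (fully addressed)
    comments: list[str] = field(default_factory=list)  # qualitative notes kept alongside

    @property
    def total(self) -> int:
        # Sum the per-criterion scores into a single quantitative rank.
        return sum(self.scores.values())

def shortlist(apps: list[Application], n: int) -> list[Application]:
    """Rank by total rubric score; comments are still reviewed by hand afterward."""
    return sorted(apps, key=lambda a: a.total, reverse=True)[:n]

apps = [
    Application("A", {"reference_interest": 2, "exploring_academic_librarianship": 1}),
    Application("B", {"reference_interest": 2, "exploring_academic_librarianship": 2}),
    Application("C", {"reference_interest": 0, "exploring_academic_librarianship": 1}),
]
finalists = shortlist(apps, n=2)
print([a.name for a in finalists])  # → ['B', 'A']
```

The comments field is deliberately not part of the sort key: the numbers produce a first pass efficiently, and the qualitative notes then inform the holistic discussion of who advances.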

Reviewing and updating interview materials

While we had used an application rubric in the past, the Fall 2023 hiring cycle was the first time we used a rubric to help evaluate interviews (see Appendix C). The interview rubric was structured similarly to the application rubric, with each interview question corresponding to an item on our scoring rubric. For each question that an interviewee answered, we ranked responses on a numbered scale from 1 to 5, where 1 meant that the interview question was not answered and 5 indicated that it was answered very well. Since the interview rubric focused on how well the interviewee answered the questions, we felt that this newly developed tool helped to mitigate any bias we may have had in this decision-making process. As with the application rubric, we kept space for interviewer comments and added a field to record a suggested ranking in order to balance quantitative and qualitative evaluation. This extra space, not tied to specific interview questions, afforded an opportunity to holistically consider the interviewees and helped us determine a finalist to invite to join the apprenticeship program.

We also updated the wording of the Fall 2023 hiring cycle interview questions (see Appendix D). For example, in a three-part question, we asked candidates to reflect on how they would describe themselves, how fellow students or classmates would describe them, and how a teacher, professor, or supervisor would describe them. We decided to remove the second part of this question (about how peers would describe the interviewee) because we wanted to minimize how overwhelmed the interviewee might feel, and we realized that it did not provide substantive information beyond the other parts of the question (see Appendix D, Question #4). Self-reflection and the perspective of an evaluative figure were more important to us than how a peer might view the interviewee. In addition, we felt that some students might not have had enough experience in their studies to have received feedback from their peers. As previous interviewees, we felt this part of the question might subject interviewees to unnecessary pressure to prove their worth in a superficially professionalized manner. 

Similarly, we changed the following question, “Please share with us what diversity, equity, and inclusion mean to you, and how these values relate to academic librarianship,” to “Please share how you engage with diversity, equity, and inclusion in your current work or studies, and how you hope to bring DEI into this position and academic librarianship” (see Appendix D, Question #5). First, we felt that the initial wording referred too vaguely to DEI and academic librarianship, which would limit the opportunity to have a productive conversation with the interviewee. Karen and Estefania felt that it was too impersonal for us to get a sense of who the candidate was and how they could uniquely contribute to and benefit from the apprenticeship. We also felt that it pushed the candidate to espouse broad and generalized statements about DEI, which would end up reinforcing the surface-level commitment to DEI that we have witnessed and experienced at other institutions and in the field. We knew that this was not Megan’s intention nor her goal in valuing DEI in the Ask A Librarian Apprenticeship, so we reworded the question in a way that would invite authentic reflections on DEI. 

Additionally, when reviewing the interview questions, Estefania reflected back to her experience of feeling very nervous going into the interview. She shared that she was filled with doubt and anxiety and tried to combat this by endlessly researching CU Boulder Libraries and potential interview questions. While she agreed to some extent that this research was necessary and strengthened her responses to the interview questions overall, when given the opportunity to participate in revising the interview questions, she advocated for sharing the interview questions before the interview. We agreed that this would give candidates an opportunity to ease their interview anxiety and help them prepare their responses in a more constructive way. Megan shared interview questions with each interviewee the day before their interview in the Fall 2023 hiring cycle. 

Engaging in the interview process

Everyone agreed that it would be important to specify that having cameras on during the video interview was optional. We believed this option would minimize barriers people may face in applying to positions, such as nervousness or the inability to find an appropriate space due to other commitments. However, we kept in mind that if hired, full use of technology would be critical in order to engage fully with the apprenticeship. The option to have cameras on or off was communicated in an email to applicants confirming their interview time.

Including a current apprentice in the interview process as an interviewer and as someone crafting documentation was incredibly beneficial. The apprentice reflected on their own experience and improved the interview questions, clarifying and adjusting them when needed to help interviewees further express themselves and showcase their candidacy. This robust and organic apprentice involvement in the interview process allowed us to gain a deeper sense of the person interviewing for the apprenticeship, rather than reducing them to numbers and rankings. In particular, for the updated question “Please share how you engage with diversity, equity, and inclusion in your current work or studies, and how you hope to bring DEI into this position and academic librarianship,” Karen added the phrase “You are welcome to share any lived experiences” for the first person we interviewed. Megan appreciated how that question was phrased and encouraged Karen to continue using this modified version. Pivoting for the rest of the interviews seemed to have a positive effect, as applicants were keen to share their lived experiences of DEI as well, especially if they did not have much experience working with DEI in the workplace. This change also reinforced our strategy of modifying the initial interview question in order to elicit more authentic reflections on DEI within the apprenticeship. Both Megan and Karen hoped that this change set the stage for a better interview experience overall for this round of recruitment.

Having a current student apprentice as part of the interview process further provided a mentorship opportunity on how to reduce bias in the interview process. For one candidate in particular, Karen had asked about their potential fit in the organization and apprenticeship. Megan was gracious enough to take the time to respond by giving Karen an institutional resource on the importance of interrogating what someone means by “fit” and to actually have criteria for this in order to mitigate personal bias in the hiring process as much as possible. Megan reinforced the importance of a holistic and equity-informed application rubric that both apprentices worked to improve so that fit bias would not be an issue. She also shared with Karen a resource from CU Boulder’s website on the different types of biases that may appear in the hiring process (e.g., beauty bias, institutional bias, etc.) and how to develop a plan to recognize these (Department of Environmental Studies, n.d.). This example highlights the mentorship opportunities afforded by including apprentices in the hiring process, along with the potential to ultimately create a more supportive and equitable academic librarianship landscape. 

Reviewing and ranking interviewees to choose a finalist

After the interviews, a few candidates were highly ranked by both Karen and Megan, necessitating further discussion and prioritization of who we would extend an offer to. Reviewing both of our numbered rankings and qualitative observations helped us check our assumptions and reevaluate our assessment of each whole application. Even with the standardization of the application and interview process to efficiently and fairly narrow the list of candidates down to one finalist, we had to review the qualitative measures within our ranking system to make sure we were taking the whole person into full consideration after all interviews had taken place. Specifically, the comments sections of both the application and interview rubrics helped us to appropriately and fairly incorporate the human aspects of this decision-making process in choosing the finalist. 

Given that the apprenticeship seeks to fill gaps in LIS students’ experiences and education, Megan was initially unsure if a particular applicant would truly benefit from the apprenticeship because they already had some experience in an academic library setting. However, Karen noted that, although this candidate had academic library experience, they did not specifically have reference experience and would benefit from filling that gap through the apprenticeship. In making this decision, Karen thought about the pressures students face as they get ready to apply to jobs, and was keenly aware that specific experience with a skill or role plays a key part in being considered for and obtaining future employment. Pointing this out influenced Megan’s perspective about the value of the apprenticeship for the candidate, and this candidate was ultimately hired. Including the perspective of a student throughout the hiring process highlights how one fellow student in a position of power can advocate for another and helps to deconstruct assumptions about student needs, goals, and readiness for a position. Ultimately, by taking current student experiences into account and embracing a whole-person approach, we created a more positive hiring process for all while making an informed decision on the final candidate. 

Reflections from the other side of the hiring process

From the early stages of the application process, Janelle felt optimistic that the values and climate of CU Boulder Libraries would align with what she hoped for in an employer. Janelle heard about the apprenticeship from Estefania, whom she knew through a student group at their institution for aspiring librarians of color. Based on Estefania’s comments about her experience, Janelle sensed that the apprenticeship would be a great work environment and an ideal opportunity to learn more about academic librarianship. 

When Janelle went to apply in Summer 2023, she was struck by how approachable the job posting was. Unlike a number of position descriptions she had encountered, when reading the Ask a Librarian posting, she thought to herself, “Wow, I definitely meet all of those requirements! I feel very confident about applying.” Particularly for internships and apprenticeships where training and learning are an integral part of a student’s experience, it is helpful when postings are transparent about the skills and mindset required for a position, while framing these requirements in a way that encourages students to apply.

Janelle also remembers the interview process as a positive experience. In her professional career, she recalls only one other interview where she received the questions in advance. In both cases, receiving the questions beforehand allowed her to enter the interview feeling more at ease, having ideas of what she could discuss for each question. She appreciated how welcoming Megan and Karen were, which helped create a supportive environment during the interview. Although she had initial nerves (as with most interviews), as the interview progressed she became more comfortable due to how Megan and Karen facilitated the interview. She was unable to ask all of her questions during the 30-minute interview, and so at Megan and Karen’s encouragement, she emailed her questions to them afterward. She appreciated the depth of their responses, and found it very helpful to be able to ask Karen directly about her experience with the apprenticeship.

During the interview process, it was clear that Karen was an active participant, and not just an observer. Beyond simply asking questions, Karen was very engaged and present in the interview process, which was a role Janelle had not seen a student occupy before. To Janelle’s knowledge, students are not typically embedded in the hiring process to this extent, although as mentioned above, thoughtfully involving students in the hiring process brings benefits to everyone involved. Seeing Karen’s significant involvement in the hiring process indicated to Janelle that her input and perspectives were valued, and showed her the potential that CU Boulder Libraries apprentices have to be active and respected participants in projects and tasks as important as hiring a new student employee.

As an apprentice who was hired through a process that included active involvement from a current apprentice, Janelle experienced firsthand the benefits of this approach to hiring. From learning about the apprenticeship from Estefania, to asking Megan and Karen questions about the apprenticeship, to actually working in the position, the apprenticeship experience has met her original expectations. Throughout the hiring process, Janelle gained a good sense of the culture at CU Boulder Libraries, which made her feel confident and excited when starting the position. As a current Ask A Librarian apprentice, her opinions and experiences are valued, and she has had opportunities to challenge herself while receiving guidance and support. This speaks to the apprenticeship’s strength in empowering emerging librarians so that they have increased confidence when starting out in full-time positions.


For professional librarian positions, we often hear that “interviewing is a two-way street”: the institution is interviewing the applicant and the applicant is interviewing the institution. By interrogating our hiring processes for graduate student positions, we can help foster that “two-way street” mentality at the student employment level as well. Recognizing that academic institutions differ, we anticipate that libraries can tailor how they incorporate LIS students into the hiring process based on their needs. Through the course of writing this article, we have also recognized how our hiring process can improve in future hiring cycles. We offer some recommendations as we consider how we may continue to iterate upon the hiring processes we outlined above:

  • Introduce students to what you have to offer. Host a drop-in information session for potential applicants to learn about the apprenticeship before applying. At CU Boulder, we envision Megan sharing some information about the apprenticeship during the first part of an information session, then leaving so that applicants may openly ask past and current apprentices about their experiences, Megan’s supervisory style and level of support, and how any institutional issues have impacted them. Apprentices can also share what projects they have worked on, specific accomplishments they achieved, and what they learned through the apprenticeship. Such a session is also a great time to introduce potential applicants to the values of the institution and share how the apprenticeship aligns with and supports the mission, vision, and values of the library. The goal of this session is transparency, and we encourage readers to consider ways that their hiring processes may be more transparent.
  • Offer alternative opportunities as a source of continued support. Include links to similar apprenticeship opportunities or other professional development opportunities in emails to candidates who are not chosen for the position. As artificial intelligence is already changing the ways in which candidates draft their documentation and apply for jobs, this is an important time for the field of library science to consider how such tools may be used effectively by LIS student applicants. An applicant rejection email may include links to AI tools which could support crafting stronger application documentation for future job opportunities. Offer to connect applicants with colleagues whose geographic region or work aligns with the LIS student’s career goals. Leveraging your networks and making connections to others in the LIS field can be a helpful source of support for LIS student applicants as they pursue other experiences in the field.
  • Be open, invite critique, make changes, and repeat. We regularly reflect on our hiring processes, and we suggest that, immediately after hire, the incoming apprentice be invited to consider the hiring process they just experienced, provide feedback on it, and suggest changes. Student applicant perspectives are invaluable and need to be honored in order to improve processes for future applicants.
  • Build community among apprentices and highlight their value to the institution. Host debrief sessions where apprentices can share updates on project work, collectively explore successes and challenges, and socialize. An aspirational improvement to CU Boulder’s apprenticeship is inviting the cohort of apprentices for a site visit to explore the physical library and campus spaces that they will answer questions about through chat reference, and to build community. Administrative support and funding for such a site visit or for other professional development opportunities (e.g., attending conferences, funding book purchases to build a student’s professional library) signal that the library values LIS student labor and sees the apprenticeship as an important component of the professional journey to invest in. While requesting funds to support these opportunities may seem intimidating, we encourage you to ask, even if you think the answer will be “no” or “not yet.” We view such funding requests as acts of advocacy and we believe that advocating for ourselves is inherently advocating for others.
  • Think critically and reflect often about the ways the traditional power structures inherent to hiring practices may be disrupted. We appreciate the suggestions of Eamon Tewell, our external reviewer, in considering the possibility of apprentices exclusively leading the hiring process and offering a final hiring recommendation to HR, rather than providing input to Megan, who currently makes the final hiring decision as the apprenticeship coordinator. The hiring process could also afford an opportunity for LIS student applicants to interview the highest levels of the library hierarchy before they are even hired. We currently invite apprentices to “pick a Dean” to meet with within their first few months of the CU Boulder apprenticeship, as a way to challenge feelings of intimidation before the high-stakes meeting with library leadership during their first post-graduate job interview. We could instead intentionally place a meeting with library leadership before the apprenticeship hire, so that applicants can learn about leadership’s priorities as they consider whether to accept an offer to join the institution.
  • Foster student support networks. Many LIS students were encouraged to apply to the CU Boulder apprenticeship based on the encouragement of peers. Such informal, word of mouth networks are crucial supports for students as they navigate library school and the job search process. Building upon these informal networks while also acknowledging the competing priorities faced by many students, we would ideally like to see student-run listservs, job boards, a dedicated group (similar to the “We Here” Facebook group) for students, and a library Green Book for LIS students which provides information on the quality of mentorship, culture, and institutional support at libraries that employ LIS students.  
  • Expand networks and community among LIS mentors. We would also like to see the development of a community of practice which focuses on LIS student mentorship. Some support may be found for mentors affiliated with specific programs (e.g., the ARL Kaleidoscope Program), in informal networks, and at related gatherings such as the relatively new Conference on Academic Library Management, which is hosting its fourth conference in 2024. However, there is currently no distinct source of community and support for mentors of LIS students more broadly.


We hope that the documented hiring practices of CU Boulder’s Ask a Librarian Apprenticeship can serve as a testament to how to improve practical learning experiences for LIS students. We encourage academic libraries to advocate for and invest in paid employment opportunities such as apprenticeships, and when possible, to invite students to participate in the hiring process to provide a realistic work experience that will be valuable when students enter the job market. The benefits of including apprentices in the hiring process are apparent and abundant. Their input can foster inclusion by prompting reflection on and reassessment of job ads, recruiting, and the interview process. In turn, current apprentices help to reduce barriers for student applicants throughout the hiring process. Also, when applicants see apprentices deeply embedded in the hiring process, it can reflect positively on the institution’s culture and help applicants feel at ease, knowing they can speak directly with a fellow student about the position and whether it would benefit their professional goals.

One of the most meaningful aspects of CU Boulder’s apprenticeship program is its iterative nature. The evolution of our hiring practices embodies this iterative approach and highlights the value of LIS student perspectives and experiences in academic library settings. We hope the curiosity and growth embodied in our own apprenticeship will be mirrored across the profession as more institutions and librarians think deeply about the opportunities they can provide to LIS students.


The authors would like to thank the colleagues who helped to make this article into the piece you are reading today, especially our ITLWTLP editor Jess Schomberg, ITLWTLP peer-reviewer Jaena Rae Cabrera, and our external reviewer, Eamon Tewell, whose invaluable feedback challenged us to interrogate our practices more deeply. This work is the culmination of various rounds of hiring and input from past Ask a Librarian Apprentices; we would like to honor their contributions to improving our hiring practices over the years. In an article so strongly focused on the power of mentorship and succeeding in the academic library job search, we also want to thank all of the mentors who have helped to shape our library journeys: Dawn Harris, Lisa Hopkins, Jamie Lin, Victoria Adjei, Nicole Finzer, Laura Alagna, Kana Jenkins, Motoko Lezec, Kirsten Gaffke, Kimberly Go, Ann Ku, Elise Wu, Noriko Asato, Renee Hill, Carisse Berryhill, Craig Chapin, Arianna Alcaraz, Ray Pun, Tsione Wolde-Michael, Steve Adams, Katrina Fenlon, Alison Oswald, Irene Lewis, Noriko Sanefuji, Steve Hoke, Bill and Nancy Stragand, Farah Nageer-Kanthor, Sharon Friedman, Meredith Bowers, Patrice Folke, Sheila and George Madison, Rose Tabbs, Twanna Hodge, Xiaoli Ma, Gama Viesca, Jennifer Knievel, and Karen Sobel.


American Library Association. (2012). Diversity Counts.

Cunningham, S., Guss, S., & Stout, J. (2019). Challenging the ‘good fit’ narrative: Creating inclusive recruitment practices in academic libraries. Recasting the Narrative: The Proceedings of the ACRL 2019 Conference, April 10–13, 2019, Cleveland, Ohio, 12–21.

Department of Environmental Studies. (n.d.). Develop a plan to recognize and mitigate bias.

Galvan, A. (2015). Soliciting performance, hiding bias: Whiteness and librarianship. In the Library with the Lead Pipe.

Goodsett, M., & Koziura, A. (2016). Are library science programs preparing new librarians? Creating a sustainable and vibrant librarian community. Journal of Library Administration, 56(6), 697–721.

Harper, L. M. (2020). Recruitment and retention strategies of LIS students and professionals from underrepresented groups in the United States. Library Management, 41(2/3), 67–77.

Holler, J. L. R. (2020). Equity budgeting: A manifesto. Marion Voices Folklife + Oral History.

Houk, K., & Nielsen, J. (2023). Inclusive hiring in academic libraries: A qualitative analysis of attitudes and reflections of search committee members. College & Research Libraries, 84(4).

Lacy, M., & Copeland, A. J. (2013). The role of mentorship programs in LIS education and in professional development. Journal of Education for Library & Information Science, 54(1), 135–146.

Lewey, T. D., & Moody-Goo, H. (2018). Designing a meaningful reference and instruction internship: The MLIS student perspective. Reference & User Services Quarterly, 57(4), 238–241.

Shah, M., & Fife, D. (2023). Obstacles and barriers in hiring: Rethinking the process to open doors. College & Research Libraries News, 84(2).

Wang, K., Kratcha, K. B., Yin, W., & Tewell, E. (2022). Redesigning an academic library internship program with equity in mind: Reflections and takeaways. College & Research Libraries News, 83(9).

Welsh, M. E., & Knievel, J. (2022). Navigating in the fog: Shining a light on the library job search process. Atla Summary of Proceedings, 9–14. 

Wildenhaus, K. (2019). Wages for intern work: Denormalizing unpaid positions in archives and libraries. Journal of Critical Library and Information Studies, 2(1), Article 1.

Appendix A: 

Job Ad Used in the Fall 2023 Hiring Cycle

Apprenticeship Announcement

University of Colorado (CU) Boulder Libraries

Ask A Librarian Apprenticeship (Virtual)

Approximately 12 hrs/week throughout Fall 2023 – Spring 2024 academic year, $19-$20/hr 


Gain practical professional experience in a robust academic library. The CU Boulder University Libraries is looking to hire an Ask A Librarian Apprentice who will receive training in research competencies, staff the Ask A Librarian virtual chat service two evenings from 5-8pm MT (Mondays & Wednesdays) and one weekend day from 1-5pm MT each week (Sundays), participate in special projects based on professional interests under the mentorship of a faculty librarian, and explore issues relevant to new academic librarians through professional development opportunities. The successful candidate will provide virtual research assistance in a major academic library that serves a world-class research university. This position is a great opportunity to supplement your graduate studies with experiential learning and explore the field of academic librarianship. 


  • Provide virtual research assistance through our Ask Us! chat service
  • Attend trainings, workshops, and meetings on a virtual meeting platform  
  • Participate in special projects based on professional interests and availability, under the mentorship of the Ask A Librarian Apprenticeship supervisor
  • Explore other internal and external opportunities for professional development, including research, writing, publishing, and presentations, based on interest and availability


  • Currently enrolled as a library & information science graduate student for the duration of the apprenticeship. 
  • Candidates must be eligible to work in the United States at time of hire.
  • Maintain a strong customer service orientation and a desire to provide high quality research assistance.
  • Demonstrate interest in the principles of diversity, equity, inclusion, accessibility, and social justice, and how these relate to the mission and values of CU Boulder’s University Libraries.
  • Interest in exploring a career in academic librarianship.

Additional Information

A core goal of the apprenticeship program is to invite and encourage involvement of MLIS students from traditionally underrepresented groups in academic librarianship. BIPOC (Black, Indigenous, and People of Color) MLIS students are highly encouraged to apply. 

This program begins with trainings which can occur around your schedule in August 2023 and staffing virtual reference shifts with experienced colleagues from mid-August through mid-September 2023. The Ask a Librarian Apprentice is expected to complete approximately 12 hours of work per week, including virtual reference shifts on nights and weekends (schedule to be finalized at point of hire), project work, and professional development. The Apprentice will be paid $19-$20/hr and will work through the Fall 2023 – Spring 2024 academic year. For full consideration, please apply by Monday, June 26, 2023. A course schedule providing proof of enrollment in a library science graduate program is required at the time of hire.

To apply, please submit the following documents:

  1. Cover Letter
  2. Resume or CV

Send application materials with “Ask A Librarian Apprenticeship Application” in the subject line to

Appendix B: 

Rubric Used to Evaluate Application Materials in the Fall 2023 Hiring Cycle

Apprentice Application Rubric
This rubric will help to quantify the credentials we seek and determine which applicants we should invite to interview.
Evaluator: _____________________________________
Candidate name: _______________________________
The applicant is currently enrolled as a Masters of Library & Information Science graduate student.
0 = not currently enrolled
1 = enrolled for part of the apprenticeship
2 = enrolled for the duration of the apprenticeship
Has the applicant completed at least one semester/quarter?
No, they are entering their first semester/quarter
They have completed one semester/quarter
Yes, they have completed two semesters/quarters or more
Does the applicant have reference or customer service experience, or did they discuss customer service mindset/philosophy?
0 = No customer service/reference experience; didn’t discuss
2 = Discussed reference/customer service experience/philosophy
Does the applicant already have a position similar to the apprenticeship?
0 = Yes, either formerly or currently employed in a role similar to CU’s apprenticeship
2 = No, the applicant has not had nor is currently employed in a role similar to CU’s apprenticeship
Did the applicant demonstrate or discuss interest in the principles of diversity, equity, inclusion, accessibility, and social justice, and how these relate to the mission and values of CU Boulder’s University Libraries?
0 = No discussion or evidence of DEIA principles
2 = Demonstrated interest around DEIA principles
Discussed interest in exploring a career in academic librarianship.
0 = No, didn’t discuss
1 = Yes, did discuss
Does the apprenticeship fill gaps in the applicant’s training and experience (e.g., would the apprenticeship provide reference experience that they desire but currently don’t have?)?
No, the applicant has a wealth of academic library experience already
Yes, the apprenticeship would fill an important gap
Comments: _______________________________________________________

Appendix C: 

Rubric Used to Evaluate Interviews During the Fall 2023 Hiring Cycle

Apprentice Interview Rubric
This rubric will help us think about candidate responses to questions and rank them to determine a finalist.

Evaluator: _____________________________________

Candidate name: _______________________________

On a scale of 1 to 5, how well did interviewees address each question? 

What motivates you to explore the field of academic librarianship? 
1 = Did not answer 5 = Answer far exceeded expectations!

Tell us about yourself and any past experiences, such as course work or work experience, that would help you in this position. 
1 = Did not answer 5 = Answer far exceeded expectations!

What is your approach to reference/research assistance services? 
1 = Did not answer 5 = Answer far exceeded expectations!

Think of a time where you facilitated a particularly positive customer service interaction. What about that situation went well? What qualities contributed to a positive interaction?
1 = Did not answer 5 = Answer far exceeded expectations!

Please share how you engage with diversity, equity, and inclusion in your current work or studies, and how you hope to bring DEI into this position and academic librarianship.
1 = Did not answer 5 = Answer far exceeded expectations!

If you were to describe yourself in three adjectives or short descriptive phrases, what would they be? If a past teacher/professor/supervisor were to describe you in three adjectives or short descriptive phrases, what would they be?
1 = Did not answer 5 = Answer far exceeded expectations!

What makes the apprenticeship appealing to you?
1 = Did not answer 5 = Answer far exceeded expectations!

Through December 2023, this position will require 2 evening shifts from 5-8pm MT, including Monday and Wednesday evenings, and one weekend shift on Sundays, each week. Based on your schedule, does this work for you? 
Yes No Other: ___________________

Did the interviewee ask questions? (use “other” to describe if there was no time left to ask questions)
Yes No Other: ___________________

Overall reactions/Comments: _______________________________________________________

Suggested ranking: ______

Appendix D: 

Interview Questions Used in Fall 2023 Hiring Cycle

  1. What motivates you to explore the field of academic librarianship? 
  2. Tell us about yourself and any past experiences, such as course work or work experience, that would help you in this position. 
  3. What is your approach to reference/research assistance services? (If they don’t have reference experience: Could you describe any other experience you have providing customer service in a virtual environment, or how your in-person customer service experience might transfer to a virtual environment?) 
  4. This question has two parts: Think of a time where you facilitated a particularly positive customer service interaction (If they are struggling: maybe you were the customer, maybe you were the one providing the service).
    • What about that situation went well?
    • What qualities contributed to a positive interaction?
  5. Please share how you engage with diversity, equity, and inclusion in your current work or studies, and how you hope to bring DEI into this position and academic librarianship. 
  6. This question has two parts:
    • If you were to describe yourself in three adjectives or short descriptive phrases, what would they be?
    • If a past teacher/professor/supervisor were to describe you in three adjectives or short descriptive phrases, what would they be?
  7. What makes the apprenticeship appealing to you?
  8. Through December 2023, this position will require 2 evening shifts from 5-8pm MT, including Monday and Wednesday evenings, and one weekend shift on Sundays, each week. Based on your schedule, does this work for you?  
  9. What questions do you have for us?

Comments Policy:

In the Library with the Lead Pipe welcomes substantive discussion about the content of published articles. This includes critical feedback. However, comments that are personal attacks or harassment will not be posted. All comments are moderated before posting to ensure that they comply with the Code of Conduct. The editorial board reviews comments on an infrequent schedule (and sometimes WordPress eats comments), so if you have submitted a comment that abides by the Code of Conduct and it hasn’t been posted within a week, please email us at itlwtlp at gmail dot com!

Reflection: The first half of my sixth year at GitLab: helping other (Support Engineers) and leaving Support / Cynthia Ng

It’s a bit mind boggling to me that I’m talking about my sixth year at GitLab. It’s not quite been half (5 months), but as I’m internally transferring out of Support, I thought this was a good place to break up “the year”. First time readers may want to check out my previous reflection posts. … Continue reading "Reflection: The first half of my sixth year at GitLab: helping other (Support Engineers) and leaving Support"

Video Game Preservation / David Rosenthal

I have written fairly often about the problems of preserving video games, most recently last year in Video Game History. It was based upon Phil Salvador's Survey of the Video Game Reissue Market in the United States. Salvador's main focus was on classic console games but he noted a specific problem with more recent releases:
The largest major platform shutdown in recent memory is the closure of the digital stores for the Nintendo 3DS and Wii U platforms. Nintendo shut down the 3DS and Wii U eShops on March 27, 2023, resulting in the removal of 2,413 digital titles. Although many of these are likely available on other platforms, Video Games Chronicle estimates that over 1,000 of those games were exclusive to those platforms’ digital stores and are no longer available in any form, including first-party Nintendo titles like Dr. Luigi, Dillon’s Rolling Western, Mario & Donkey Kong: Minis on the Move, and Pokémon Rumble U. The closures also affected around 500 historical games reissued by Nintendo through their Virtual Console storefronts, over 300 of which are believed not available on any other platform or service.
Below the fold I discuss recent developments in this area.

Salvador writes:
Games released during the digital game distribution era may have content or features tied to online services, which may be (and regularly are) deactivated. According to researcher James Newman, this is sometimes employed by game publishers as a deliberate strategy to devalue used games, shorten their distribution window, and encourage sales of new titles, which has ominous implications for preservation.
This post started with Timothy Geigner's One YouTuber’s Quest For Political Action To Preserve Old Video Games. He lays out the problem:
it’s probably long past time that there be some sort of political action to address the real or potential disappearance of cultural output that is occurring. The way this works far too often is that a publisher releases a game that is either an entirely online game, or an offline game that requires backend server calls or connections to make it work. People buy those games. Then, some time down the road, the publisher decides supporting the game is no longer profitable and shuts the servers down on its end, disappearing the purchased game either completely, or else limiting what was previously available. Those that bought or subscribed to the game are left with no options.
The trigger for renewed attention to this problem was Ubisoft's April 1st(!) delisting of The Crew:
With The Crew, millions of copies of the game were played around the world. When Ubisoft delisted the game late last year, the game became unplayable. On top of that, because of copyright law, it would be illegal for fans to keep the game alive themselves by running their own servers, even assuming they had the source code necessary to do so. So fans of the game who still want to play it are stuck.
Kenneth Shephard reported in A Ubisoft Game Is At The Center Of A Fight To Stop Online Game Shutdowns that this triggered an effort to respond:
Ross Scott, who runs Accursed Farms, posted a 31-minute video on the channel, which outlines the problem and how he believes drawing attention to The Crew’s April 1 shutdown could cause governments to enact greater consumer protections for people who purchase online games. As laid out in the video, consumer rights for these situations vary in different countries. France, however, has some pretty robust consumer laws, and Ubisoft is based there.
Scott surveyed countries around the world looking for the prospects of two possible actions:
  • Lobbying the country's consumer protection agency to take action on the grounds that the company purported to sell the game but failed to make it clear that the game could be rendered useless at any time without warning or compensation.
  • Petitioning the government to take legal or legislative action against this practice.
These actions would be aimed at ensuring that after a vendor's support for a game it sold ended:
  • Games sold must be left in a functional state.
  • Games sold must require no further connection to the publisher or affiliated parties to function.
  • The above also applies to games that have sold microtransactions to customers.
  • The above cannot be superseded by end user license agreements.
These appear to match Scott's goal of a minimal set of consumer rights that is hard to argue against. If the game publishers can't live with them they can always explicitly rent the game instead of selling it. Renting makes it clear that the purchaser's rights are time-limited.

More details of the campaign can be found at Scott's Stop Killing Games website. Scott has so far posted three update videos to his Accursed Farms YouTube channel.
The idea that purchasers are entitled to be told at least the minimum lifetime of the game is interesting. Ubisoft's excuse for delisting The Crew was that their licenses to music and other content featured in the game had expired. But that implies that, at launch, they knew they would delist the game on the date when their licenses were due to expire. To make an informed purchase decision, customers needed to know that date. Not providing it on the box was clearly deceptive.

Publishers would hate a requirement to put a "playable until" date on the box; it would clearly reduce purchases. They might find Scott's requirements less onerous.

Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 10 June 2024 / HangingTogether

The following post is one in a regular series on issues of Inclusion, Diversity, Equity, and Accessibility, compiled by a team of OCLC contributors.

Knowledge equity, and the role of ontologies 

A purple petunia growing between the cracks of a sidewalk.Photo by Ted Balmer on Unsplash

Wikimedia Deutschland (WMDE) shared key findings from their Knowledge Equity in Linked Open Data project. They discovered that while Wikidata has immense potential for sharing knowledge, it still carries over structural and historical inequities from Wikipedia. The project involved community members working with marginalized knowledge, who faced challenges fitting their knowledge into Wikidata’s Western, academic perspectives. As a result, these communities have started building their own knowledge graphs, finding a sense of freedom and safety in expressing knowledge that reflects their needs. However, the report highlights high barriers to developing the necessary expertise due to scattered documentation and limited technical support. Additionally, the lack of mobile-friendly interfaces further hinders access for marginalized communities who heavily rely on mobile internet. 

Last month we wrote about OCLC Research’s engagement with the community around the WorldCat ontology. Findings from the WMDE report align well with what we learned, which is that library-based ontologies can exclude other worldviews. In Wikidata, this has led communities to create focused ontologies that represent marginalized knowledge in ways that reflect community epistemologies. It would be useful for those of us working to reimagine descriptive workflows to consider the barriers identified in this report. Contributed by Richard J. Urban. 

Houston’s LGBTQ history in radio archives 

A piece from NPR’s Morning Edition, Saving Houston’s LGBTQ history through thousands of hours of radio archives, highlights the important role of audio-visual collections in documenting culture and history. As is so often the case, a group of dedicated community members kept and safeguarded the fragile cassette recordings for over thirty years before they were painstakingly digitized by University of Houston archivists Emily Vinson and Bethany Scott.  

As we kick off Pride month in the US, this story helps to illuminate the important role of radio, which is not only a vital medium for communities but also plays a vital role in reflecting history and experiences. The piece also gives some insight into how difficult it is to migrate delicate audiovisual formats to digital before it is too late. (As a sidenote, Vinson also presented on her work with A/V backlogs in a 2020 Works in Progress Webinar: Approaches to Processing Audiovisual Archives for Improved Access and Preservation Planning.) Contributed by Merrilee Proffitt. 

Patterns in library censorship 

In recent weeks, former librarian Kelly Jensen of Book Riot, someone who always tracks the pulse of censorship in the United States, has written a series of pieces doing exactly that.  Each one deserves to be read and absorbed.  In “Are Librarians Criminals?  These Bills Would Make Them So: Book Censorship News, May 3, 2024,” Jensen looks at some of the anti-library — and anti-librarian — legislation under consideration or enacted in eighteen states. “Here’s Where Library Workers are Prohibited From Their Own Professional Organization: Book Censorship News, May 24, 2024,” highlights the bills that seek to keep library workers from becoming part of the American Library Association.  Because ALA is the organization that accredits library and information studies programs across the U.S., Canada, and Puerto Rico, among many other things, these anti-ALA efforts threaten to deprofessionalize library work.  Lest you fear that it’s all bad news, Jensen also tells “How Alabama Library Supporters Took Action and You Can, Too: Book Censorship News, June 7, 2024,” and the story that “Colorado Passes Anti-Book Ban Bill for Public Libraries.” 

Kelly Jensen’s insightful and vital work has been noted previously in “Advancing IDEAs,” including on 16 April 2024, “Book censorship in academic, public, and school libraries,” and on 7 March 2023, “During comic book challenges.”  Beyond her invaluable “Book Censorship News” series, she also shares happier themes, especially regarding Young Adult literature, gifts for the bookish, leisure reading suggestions, and other fare for those who love books and libraries.  Contributed by Jay Weitz.

FAIR + CARE survey: establishing current data practices  

The FAIR + CARE Cultural Heritage Network is a new project that aims to develop, disseminate, and promote ethical good-practice guidance and digital data governance models integrating the FAIR (Findable, Accessible, Interoperable, and Reusable) Data Principles with the CARE (Collective benefit, Authority to control, Responsibility, and Ethics) Principles for Indigenous Data Governance. The network aims to reconcile the two sets of principles for future incorporation into data governance models that are both technically compliant and socially compassionate, with a focus on data related to Indigenous and other descendant communities. In 2021, Hanging Together reported on an event hosted by the OCLC RLP and the National and State Libraries Australia (NSLA) on the CARE Principles.   

A project survey is open until 30 June 2024 and invites respondents to share how they collect, manage, preserve, curate, share, and store cultural information, in order to gather a landscape view of what the field is currently doing.  

The FAIR+CARE principles are valuable for cultural heritage and cultural resources managers. They are equally valuable for libraries, archives, and museums that hold cultural heritage objects and information about them, as we navigate a world in which expectations around data reuse, both social and cultural, are growing and sometimes conflicting. Better guidance is sorely needed so that all cultural information professionals can be respectful and transparent in their practice now and into the future. Contributed by Lesley A. Langa 

The post Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 10 June 2024 appeared first on Hanging Together.

#ODDStories 2024 @ Goma, DRCongo 🇨🇩 / Open Knowledge Foundation

On 5 March 2024, in the Lake Tanganyika coastal city of Baraka in the eastern Democratic Republic of the Congo, Disaster Risk Management in Africa (DRM Africa), an initiative that has strengthened community resilience to natural and anthropogenic hazards in the African Great Lakes region since 2012, held an Open Data Day event entitled “Open Data for Risk-informed Societies,” with financial support from the Open Knowledge Foundation, in response to the ongoing rise of Lake Tanganyika’s waters.

Since 2017, the adverse effects of climate change have dramatically increased in Lake Tanganyika’s coastal areas, disrupting the social and economic fabric of all riparian countries in the African Great Lakes region (Tanzania, Burundi, Zambia, and the Democratic Republic of the Congo).

While the disastrous rapid rise of Lake Tanganyika in 2021 affected over 50,000 basic infrastructures in the Congolese coastal city of Baraka and left up to 50,000 people homeless along its shorelines, the current water rise has already surpassed last year’s level, which itself affected thousands of additional basic infrastructures across coastal cities, according to reports from the Congolese local disaster management agency.

The event’s overall goal was to strengthen the community’s resilience to the adverse impact of Lake Tanganyika’s rapid rise by harnessing the power of open data to increase the community’s level of preparedness and ensure a sustainable and resilient future for coastal communities. By leveraging collected climate data and available public data, the project aimed to raise awareness, facilitate informed decision-making, and implement practical solutions to protect vulnerable coastal communities in the riparian countries, especially the Congolese coastal city of Baraka and its neighborhoods.


In order to reach the proposed project’s goal, the following activities were carried out:

  • Identification of already flooded and flood-prone areas in the city of Baraka: the activity aimed at collecting data in the field to identify and map areas threatened by coastal floods and those already flooded. We collected data such as basic infrastructures already flooded in flood-prone areas in counties of Mwemezi, AEBAZ, Matata and Moma in the city of Baraka.
  • Flood Risk Management capacity building: the activity aimed at using data collected in the field to build the capacity of vulnerable communities in flood risk management. Up to 50 participants including local authorities, and representatives of flood-prone quarters, have participated in the risk management capacity-building workshop. The activity was animated and moderated by myself, Kashindi Pierre, Founding CEO of DRM Africa.


After the project’s completion, the following outcomes were achieved:

  • Increased knowledge in flood risk management for up to 50 participants, including vulnerable communities exposed to river and coastal floods from the city of Baraka and neighborhoods.
  • Up to 5,000 basic infrastructures were identified and mapped in the city of Baraka and neighborhoods.
  • Active involvement of local communities in addressing the adverse effects of Lake Tanganyika’s rapid rise.

About Open Data Day

Open Data Day (ODD) is an annual celebration of open data all over the world. Groups from many countries create local events on the day where they will use open data in their communities.

As a way to increase the representation of different cultures, since 2023 we offer the opportunity for organisations to host an Open Data Day event on the best date within a one-week period. In 2024, a total of 287 events happened all over the world between March 2nd-8th, in 60+ countries using 15 different languages.

All outputs are open for everyone to use and re-use.

In 2024, Open Data Day was also a part of the HOT OpenSummit ’23-24 initiative, a creative programme of global event collaborations that leverages experience, passion and connection to drive strong networks and collective action across the humanitarian open mapping movement.

For more information, you can reach out to the Open Knowledge Foundation team by emailing You can also join the Open Data Day Google Group to ask for advice or share tips and get connected with others.

Curves on a Coordinate Axis / Ed Summers

The narrator, in Tokarczuk’s Flights, explains their difficulty in studying psychology, which I think is also a good commentary on the difficulty of layering quantitative methods over qualitative ones, and the tyranny of categories more generally:

How was I supposed to analyze others when it was hard enough for me to get through all those tests? Personality diagnostics, surveys, multiple columns on multiple-choice questions all struck me as too hard. I noticed this handicap of mine right away, which is why at university, whenever we were analyzing each other for practice, I would give all of my answers at random, whatever happened to occur to me. I’d wind up with the strangest personality profiles–curves on a coordinate axis. “Do you believe that the best decision is also the decision that is easiest to change?” Do I believe? What kind of decision? Change? When? Easiest how? “When you walk into a room, do you tend to head for the middle or the edges?” What room? And when? Is the room empty, or are there plush red couches in it? What about the windows? What kind of view do they have? The book question: Would I rather read one than go to a party, or does it also depend on what kind of book it is and what kind of party?

What a methodology! It is tacitly assumed that people don’t know themselves, but that if you furnish them with questions that are smart enough, they’ll be able to figure themselves out. They pose themselves a question, and they give themselves an answer. And they’ll inadvertently reveal to themselves that secret they knew nothing of till now.

And there is that other assumption, which is terribly dangerous–that we are constant, and that our reactions can be predicted. (Tokarczuk, 2019, pp. 14–15)

It reminds me of A Word on Statistics, a poem by another Polish Nobel Prize winner, Wisława Szymborska. We can unlock new understanding with words, but we need to enter into them first.


Tokarczuk, O. (2019). Flights. (J. Croft, Trans.) (First Riverhead trade paperback edition). New York: Riverhead Books.

Graduate hourly-paid job: chemistry expert for a computer information system design project (summer 2024) / Jodi Schneider

Prof. Jodi Schneider’s Information Quality Lab <> seeks a paid graduate hourly researcher ($25/hour) to be a chemistry expert for a computer information system design project. Your work will help us understand a computational chemistry protocol by Willoughby, Jansma, and Hoye (2014 Nature Protocols), and the papers citing this protocol. A code glitch impacted part of the Python script for the protocol; our computer information system aims to determine which citing papers might have been impacted by the code glitch, based on reading the papers.

The project can start as soon as possible and needs to be completed in July or early August 2024. We expect your work to take 15 to 20 hours, paid at $25/hour for University of Illinois Urbana-Champaign graduate students. 


  • Read and understand a computational chemistry protocol (Willoughby et al. 2014)
  • Read Bhandari Neupane et al. (2019) to understand the nature of the code glitch
  • Make decisions about whether the main findings are at risk for citing publications. You’ll read sentences around citations to ~80 citing publications.
  • Work with an information scientist to design a decision tree to capture the decision-making process.
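As a purely hypothetical illustration of what such a decision tree might look like once encoded: the criteria below are invented for this sketch and are not the lab's actual decision rules.

```python
# Hypothetical sketch of a decision tree for classifying whether a
# citing paper's findings could be affected by the script glitch.
# All criteria and field names are invented for illustration.

def at_risk(citing_paper):
    """Return a risk label for one citing publication."""
    if not citing_paper["used_scripts"]:
        return "not at risk"   # cites the protocol without running its scripts
    if citing_paper["independently_verified"]:
        return "not at risk"   # results were confirmed by another method
    if citing_paper["results_depend_on_shifts"]:
        return "at risk"       # computed NMR shifts feed the conclusions
    return "unclear"           # needs closer expert reading

# Example: a paper that ran the scripts, was not independently
# verified, and whose conclusions rest on the computed shifts.
paper = {"used_scripts": True,
         "independently_verified": False,
         "results_depend_on_shifts": True}
```

In practice the expert's reading of the citation context would supply these judgments; the decision tree only makes the reasoning explicit and repeatable.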

Required Qualifications

  • Enrolled in a graduate program (Master’s or PhD) in chemistry at University of Illinois Urbana-Champaign and/or background in chemistry sufficient to understand Willoughby et al. (2014) and Bhandari Neupane et al. (2019)
  • Good verbal and written communication skills
  • Interest and/or experience in collaboration

Preferred Qualifications

  • Experience in computational chemistry (quantum chemistry or molecular dynamics) preferred
  • Interest in informatics or computer systems preferred

How to apply

Please email your CV and a few sentences about your interest in the project to Prof. Jodi Schneider ( Application review will start June 10, 2024 and continue until the position is filled.

Sample citation sentence for Willoughby et al. 2014

“Perhaps one of the most well-known and almost mandatory “to-read” papers for those initial practitioners of the discipline is a 2014 Nature Protocols report by Willoughby, Jansma, and Hoye (WJH).10 In this magnificent piece of work, a detailed 26-step protocol was described, showing how to make the overall NMR calculation procedure up to the final decision on the structure elucidation.”

from: Marcarino, M. O., Zanardi, M. M., & Sarotti, A. M. (2020). The risks of automation: A study on DFT energy miscalculations and its consequences in NMR-based structural elucidation. Organic Letters, 22(9), 3561–3565.


Bhandari Neupane, J., Neupane, R. P., Luo, Y., Yoshida, W. Y., Sun, R., & Williams, P. G. (2019). Characterization of Leptazolines A–D, polar oxazolines from the cyanobacterium Leptolyngbya sp., reveals a glitch with the “Willoughby–Hoye” scripts for calculating NMR chemical shifts. Organic Letters, 21(20), 8449–8453.

Willoughby, P. H., Jansma, M. J., & Hoye, T. R. (2014). A guide to small-molecule structure assignment through computation of (1H and 13C) NMR chemical shifts. Nature Protocols, 9(3), Article 3.

Distant Reader Catalog / Distant Reader Blog


About a year ago I implemented a traditional library catalog against the content of the Distant Reader. I used Koha to do this work, and the process was almost trivial. Moreover, the implementation suits all of my needs. Kudos to the Koha community!


About a year ago I got an automated email message from OCLC, and to paraphrase, it said, "Your collection has been successfully updated and added to WorldCat." I asked myself, "What collection?" After a bit of digging around, I discovered a few OAI-PMH data repositories I had submitted to OCLC many years ago, and these repositories contained the content being updated. Through the process of this discovery, I learned I have an OCLC symbol (ININI), and after wrestling with the authentication procedures, I was able to edit my profile. Fun!

I then got to thinking, "I am able to programmatically create and edit MARC records. I am able to bring up an online catalog. Koha supports OAI-PMH. I could create MARC records describing the content of the Distant Reader, import them into Koha, and ultimately have them become a part of WorldCat. Hmmm..." So I did.


My first step was to create a virtual computer running Ubuntu, because Ubuntu is the preferred flavor of Linux supported by Koha. I spun up a virtual computer at Digital Ocean. It has 2 cores, 4 GB of RAM, and 60 GB of disk space. Tiny, by my standards. This generates an ongoing personal expense of something like $25 per month.

The next step was to install Koha. This took practice; I had to destroy my virtual machine a few times, and I had to re-install Koha a few times, but all in all the process worked as advertised. Again, it was not difficult; it just took practice. I was able to get Koha installed in less than a few days. I could probably do it now in less than eight hours.

The third step was to add records to the catalog. This required me to first use Koha's administrative interface to create authorized terms for both local collections and data types. I then wrote a set of scripts to create MARC records from my cache of content. These scripts were written against curated databases describing: 1) etexts from Project Gutenberg, 2) PDF files from DOAJ journals, 3) articles on the topic of COVID from a data set called CORD-19, and 4) TEI files from a project called Early Print. In each case, I looped through the given database, read the desired metadata, and output MARC records amenable to Koha. At the end of this process, I had created about 0.3 million records. The small sample of linked records exemplifies the data I created. Simple. Rudimentary. Functional.
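The loop just described can be sketched roughly as follows. The field mapping, the to_marc() helper, and the two-item sample database are illustrative stand-ins, not the actual scripts or metadata used for the Distant Reader.

```python
# Sketch of the record-building loop: walk a curated metadata
# database and emit one MARC-like record per item. Each field is
# modeled as (tag, indicators, subfields); a real script would
# serialize these to MARC 21 for import into Koha.

def to_marc(item):
    """Map one metadata dict to a minimal list of MARC-like fields."""
    return [
        ("100", "1 ", [("a", item["author"])]),   # main entry
        ("245", "00", [("a", item["title"])]),    # title statement
        ("856", "40", [("u", item["url"])]),      # link to the item
    ]

def build_records(database):
    return [to_marc(item) for item in database]

# A two-item stand-in for one of the curated databases
database = [
    {"title": "Walden", "author": "Thoreau, Henry David",
     "url": "https://example.org/walden"},
    {"title": "Leaves of Grass", "author": "Whitman, Walt",
     "url": "https://example.org/leaves"},
]

records = build_records(database)
```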

To actually load the records I wrote two tiny shell scripts, both front-ends to a Koha routine. The first simply deletes records. Given a set of MARC records, the second imports them. This importing process is very efficient: read a record, parse it, and add the parsed data to the database; after a configured number of records have been added, add them to the index; repeat for all records. Somebody really knew what they were doing when they wrote it.


Now that the records have been loaded and indexed, the catalog can be queried. For the most part, I use the advanced search interface, because I'm usually interested in searching within my set of collections. Search results are easily limited by facets, and detailed results point to the original/canonical items as well as the local cache. See the screenshots below:

advanced search interface

results page

details page

What's even better is Koha's support for OAI-PMH. Just use Koha's administrative interface to turn OAI-PMH on, and the entire collection becomes available through the catalog's OAI-PMH data root. Returning to OCLC, I updated my collection of repositories to include a pointer to the catalog's OAI-PMH root URL, and by the time you read this I believe I will have added my 0.3 million records to WorldCat.
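Harvesting such an endpoint amounts to fetching a ListRecords response and walking its records. A minimal sketch, using a hand-made stand-in response rather than a live HTTP request; the element names follow the standard OAI-PMH and Dublin Core namespaces.

```python
# Parse an OAI-PMH ListRecords response like the one Koha exposes.
# The XML below is a tiny hand-made stand-in; a real harvester
# would fetch it from the repository's OAI-PMH root URL.
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

sample = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:catalog:1</identifier></header>
      <metadata>
        <dc xmlns="http://purl.org/dc/elements/1.1/">
          <title>Walden</title>
        </dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

def parse_list_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    out = []
    for rec in root.iter(OAI + "record"):
        ident = rec.find(OAI + "header/" + OAI + "identifier").text
        title = rec.find(".//" + DC + "title")
        out.append((ident, title.text if title is not None else None))
    return out
```

A full harvester would also follow resumptionToken elements to page through all 0.3 million records, but the per-record handling is the same.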


The process of creating a traditional library catalog of Distant Reader content was easy: 1) spin up a virtual machine, 2) install Koha, 3) create/edit MARC records, 4) add them to Koha, 5) go to step #3. The process is never done. Finally, the catalog is available for anyone to use; it is not fast, but it is very functional. Again, "Kudos to the Koha community!"

Student Note: ChatGPT Ate My Homework. Can LLMs Generate Compelling Case Briefs? / Harvard Library Innovation Lab

The Library Innovation Lab welcomes a range of research assistants and fellows to our team to conduct independently-driven research that intersects in some way with our core work.
The following is a reflection written by Chris Shen, a Research Assistant who collaborated with members of LIL in the spring semester of 2024. Chris is a sophomore at Harvard College studying Statistics and Government.

From poetry to Python, LLMs have the potential to drastically influence human productivity. Could AI also revolutionize legal education and streamline case understanding?

I think so.

A New Frontier

The advent of Large Language Models (LLMs), spearheaded by the release of OpenAI’s ChatGPT in late 2022, has prompted universities to adapt in order to responsibly harness their potential. Harvard instituted guidelines requiring professors to include a “GPT policy” in their syllabi.

As students, we read a ton. A quick look at the course catalog published by Harvard Law School (HLS) reveals that many classes require readings of up to 200 pages per week. This sometimes prompts students to turn to summarization tools as a way to help quickly digest content and expedite that process.

LLMs show promising summarization capabilities, and are increasingly used in that context.

Yet, while these models have shown general flexibility in handling various inputs, “hallucination” issues continue to arise, in which outputs present or reference information that doesn’t exist. Researchers also debate the accuracy of LLMs as context windows continue to grow, highlighting potential mishaps in identifying and retaining important information in increasingly long prompts.

When it comes to legal writing, which is often extensive and detail-oriented, how do we go about understanding a legal case? How do we avoid hallucination and accuracy issues? What are the most important aspects to consider?

Most importantly, how can LLMs play a role in simplifying the process for students?

Initial Inspirations

In high school, I had the opportunity to intern at the law firm Hilton Parker LLC, where I drafted declarations, briefs, demand letters, and more. Cases ranged from personal injury, discrimination, wills and affairs, medical complications, and more. I sat in on depositions, met with clients, and saw the law first-hand, something few high schoolers experience.

Yet, no matter the case I got, one thing remained the same: the need to write well in a style I had never been exposed to before. But before one can write, one must first read and understand.

Back when I was an intern, there was no ChatGPT, and I skimmed hundreds of cases by hand.

Therefore, when I found out that the Harvard Library Innovation Lab (LIL) was conducting research into harnessing LLMs to understand and summarize fundamental legal cases, I was deeply intrigued.

During my time at LIL, I have been researching a method to simplify that task, allowing students to streamline their understanding in a new and efficient way. Let’s dive in.

Optimal Outputs

I chose case briefs as the final product over other forms of summarization, like headnotes or legal blurbs, due to the standardized nature of case briefs. Writing case briefs is not explicitly taught to many, if not most, law students, yet professors implicitly expect them as a way to keep up with the pace of courses during 1L.

While these briefs typically are not turned in, they are heavily relied upon in class to answer questions, engage in discussion, and offer analytical reflections. Even so, many students no longer write their own briefs, relying instead on cookie-cutter resources behind paywalled services like Quimbee, LexisNexis, and Westlaw, or even student-run repositories such as TooDope.

This experiment dives into creating succinct original case briefs that contain the most important details of each case, and go beyond the scope of so-called “canned briefs”. But what does it take to write one in the first place?

A standard case brief, as offered by LexisNexis, typically has 7 dimensions:

  • Facts (name of the case and its parties, what happened factually and procedurally, and the judgment)
  • Procedural History (what events within the court system led to the present case)
  • Issues (what is in dispute)
  • Holding (the applied rule of law)
  • Rationale (reasons for the holding)
  • Disposition (the current status or final outcome of the case)
  • Analysis (influence)

I used Open AI’s GPT-4 Turbo model preview (gpt-4-0125-preview) to experiment with a two-pronged approach to generate case briefs matching the above criteria. The first prompt was designed both as a vehicle for the full transcript of the court opinion to summarize and as a way of giving the model precise instructions on generating a case brief that reflects the 7 dimensions. The second prompt serves as an evaluation prompt, asking the model to evaluate its work and apply corrections as needed. These instructions were based on guidelines from Rutgers Law School and other sources.

When considering legal LLM summarization, another critical element is reproducibility: I don’t want a slight change in prompt vocabulary to completely alter the resulting output. I observed that, before applying the evaluative prompt, case briefs would be disorganized, and the elements the LLM produced often seemed random. For example, information related to specific concurring or dissenting judges would be missed, analyses would be shortened, and formatting would be inconsistent. Sometimes even the most generic “Summarize this case” prompts would produce slightly better briefs!

However, an additional evaluative prompt now standardizes outputs and ensures critical details are captured. Below is a brief illustration of this process along with the prompts used.

Diagram: Two-prompt system for generating case briefs using an LLM.

See: Initial and Evaluative prompts

Finally, after testing various temperature and max_tokens values, I settled on 0.1 and 1500, respectively. I found that lower temperatures best suit the professional nature of legal writing, and a 1500-token output window allowed the LLM to produce all necessary elements of a case brief without including additional “fluff”.
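A rough sketch of the two-prompt flow, assuming the openai Python client: the prompt texts below are abbreviated placeholders, not the actual prompts used in the experiment, while the model name and parameters mirror those reported above.

```python
# Two-prompt case-brief generation: an initial prompt carries the
# opinion text plus instructions covering the 7 brief dimensions,
# and an evaluative prompt asks the model to check and correct its
# own draft. Running generate_brief() requires the `openai` package
# and an API key; the message builders are plain Python.

DIMENSIONS = ["Facts", "Procedural History", "Issues", "Holding",
              "Rationale", "Disposition", "Analysis"]

def initial_messages(opinion_text):
    instructions = ("Write a case brief with these sections: "
                    + ", ".join(DIMENSIONS) + ".")
    return [{"role": "system", "content": instructions},
            {"role": "user", "content": opinion_text}]

def evaluative_messages(draft_brief):
    return [{"role": "system",
             "content": "Check that this brief covers all seven "
                        "sections; correct it and return it."},
            {"role": "user", "content": draft_brief}]

def generate_brief(client, opinion_text):
    """Run the two-prompt pipeline against a chat-completions client."""
    draft = client.chat.completions.create(
        model="gpt-4-0125-preview", temperature=0.1, max_tokens=1500,
        messages=initial_messages(opinion_text),
    ).choices[0].message.content
    final = client.chat.completions.create(
        model="gpt-4-0125-preview", temperature=0.1, max_tokens=1500,
        messages=evaluative_messages(draft),
    ).choices[0].message.content
    return final
```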

Old vs. New

To test this apparatus, I picked five fundamental constitutional law cases from the SCOTUS that most 1L students are expected to analyze and understand. These include Marbury v. Madison (1803), Dred Scott v. Sandford (1857), Plessy v. Ferguson (1896), Brown v. Board of Education (1954), and Miranda v. Arizona (1966).

Results of each case brief are below.

Of course, I also tested the model on cases no LLM had ever seen before. This would ensure that our approach could still produce quality briefs past the knowledge cut-off for our model, which was December 2023 in this case. These include Trump v. Anderson (2024) and Lindke v. Freed (2024).

Results of each case brief are below, generated with the same parameters: temperature = 0.1, max_tokens = 1500.

Applying a critical eye to the case briefs, I see successful adherence to structure, and the model outputs case details consistently. There is also a clearly succinct tone that allows students to grasp core rulings and their significance without getting overrun with excessive detail. This is particularly useful for discussion review and exam preparation. Further, I find the contextual knowledge presented, such as in Dred Scott v. Sandford, allows students to understand cases beyond mere fact and holding, including their broader implications.

However, I also see limitations in the outputs. For starters, there is a lack of in-depth analysis, particularly of the concurring or dissenting opinions. Information on the precedents used is skimmed over, and there is a scarcity of substantive arguments. In the example of Marbury v. Madison, jurisdictional insights are also left out, which are vital for understanding the procedural and strategic decisions made in the case. Particularly for cases unknown to the model, there is evidence of speculative language, which can occur due to incomplete information, prompt ambiguity, or other biases.

So, what’s next?

Moving forward, I’m excited to submit sample case briefs to law students and professors to receive comments and recommendations. Further, I plan to compare our briefs against “canned” ones from resources like Quimbee and gather external feedback on what makes them better or worse, where our advantage lies, and ultimately equip law students in effective and equitable ways.

Based on initial observations, I also see potential for users to interact with the model in more depth. Thought-provoking questions such as “How has this precedent changed over time?”, “What other cases are relevant to this one?”, “Would the resulting decision change in today’s climate?”, and more, will hopefully allow students to dive deeper into cases instead of just scratching the surface.

While I may still be early in the process, I firmly believe a future version of this tool could become a streamlined method of understanding cases, old and new.

I’d like to extend a special thank you for the contributions of Matteo Cargnelutti, Sankalp Bhatnagar, George He, and the rest of the Harvard LIL for their support and continued feedback throughout this journey.

#ODDStories 2024 @ Bouaké, Côte d’Ivoire 🇨🇮 / Open Knowledge Foundation

As part of Open Data Day 2024, the YouthMappersUAO chapter (Côte d’Ivoire), in collaboration with the Open Knowledge Foundation (OKFN), organized a water point mapping activity in the city of Bouaké called “Water Point Mapping Day”. The event aimed to train and build the capacity of participants to collect data using open source tools, and to map water points in the city of Bouaké. The activity took place over two (2) days, Friday 8 and Saturday 9 March 2024, in Bouaké. On the first day, we were in the American Corner room of the University Alassane Ouattara de Bouaké (UAO), and on the second day, we were in the “cité forestière” in the town of Bouaké. The twenty (20) participants were reminded of the context of the day: Open Data Day (ODD), the Open Knowledge Foundation, the prerogatives of the YouthMappers community in general, and the vision of the YouthMappersUAO chapter. They were struck by the ideology of open data, especially for our developing countries, and particularly for students enrolled in operational research frameworks in various university courses.

Presentation of the YouthMappersUAO Chapter and Open Data Day by Victorien Kouadio, Interim President of the YouthMappersUAO Chapter

After the introductions, the day’s data collection tools were presented. Mardoché Azongnibo presented the two (2) applications to be used to carry out the activity: #Osmtracker and #KoboCollect. A practical session was organized for both applications, during which we reviewed the applications’ settings, including GPS accuracy for good-quality coordinates. After this session, we went out into the field to collect data in four (4) teams, one of them with five members.

Final instructions from Dr. Mardoché Azongnibo and departure for collection in the various zones

The day continued with the essential part of the activity: data collection. In practice, the town of Bouaké, with a surface area of 177,000 hectares, was divided into four (4) sub-areas: Bouaké South-East, Bouaké South-West, Bouaké North-East, and Bouaké North-West. Given the size of the city, we collected data in two (2) of them: Bouaké South-East and Bouaké South-West.

The applications examined were put to good use when collecting data in the field. Participants were able to collect different types of water points throughout the day in the respective areas. From wells to streams, all were collected to update the map of water points in the city of Bouaké. A form edited in Kobotoolbox was used to categorize the different types of water points during the collection with KoboCollect.

A total of 163 water points were collected in Bouaké south. Each participant played an essential role in this collaborative activity.

Breakdown of water points collected and updated during the WaterPointMapping Day organized as part of Open Data Day

Most of the 163 water points collected were built by the community. 88 water points were built by the community, 27 by religious leaders, and 19 by religious leaders. The rest were built by the government and natural water points. Thanks to the commitment and willingness of all concerned, a significant amount of data has been collected, enabling a more accurate and comprehensive map of the water points in the southern part of the city to be created.

The balanced involvement of men and women demonstrates the importance of diversity and inclusion in such community initiatives.

Spatial distribution of water points by category

In conclusion, the event was a great success and provided valuable information in the field of water points. This accurate information will be used to create a participatory map of farmers’ access to water during periods of drought in Bouaké, to improve both the quantity and quality of the OpenStreetMap database. The event was greatly appreciated by the participants, who shared their experiences and knowledge, enabling the public to get involved and learn.


The library beyond the library / HangingTogether

This post was co-authored with Rebecca Bryant and Richard Urban.

Image: “Center lanes merge” from Wikimedia Commons

Research libraries have changed radically over the past thirty years. The library of the past was primarily focused on managing an “outside-in” collection of externally purchased materials made available to local users. This was a well-understood role for the research library, and one that was recognized and valued by the library’s stakeholders, including university administrators, other campus units, and faculty and students. In carrying out this collections-focused mission, the library functioned more or less autonomously on campus as the primary provider of collections-related services. Of course, research libraries did act in collaboration with other libraries in supporting certain aspects of collection management, particularly resource sharing and cooperative cataloging. 

Today, libraries still manage important local collections for use chiefly by local users, but with less insularity and more connection to the network: think of shared print programs, collective collections, and the “inside-out” collection (e.g., digitized special collections, electronic theses and dissertations (ETDs), and research datasets). At the same time, the library has become increasingly engaged in the university research enterprise through an expanding array of research support services, assuming new responsibilities in areas such as institutional repositories, research data management (RDM), institutional reputation management through researcher profiles and research information management, and bibliometrics and research impact services. Activities in these areas are often closely aligned with and directly advance institutional needs and priorities, such as decision support.  

OCLC Research has documented these shifts through its research on collective collections, the evolving scholarly record, research support services, and more. This work has led us to two observations: 

  1. Libraries are increasingly engaged in partnerships with other units across campus in order to address new responsibilities in emerging areas of research support. 
  2. For many of these new responsibilities carried out in the context of cross-campus partnerships, the library role, contribution, and value proposition are not clearly defined or recognized by other campus stakeholders. 

In many instances, the partnerships libraries are forming with other units on campus are new, ad hoc, and sometimes experimental, and the roles, responsibilities, administrative organization, and even the partners involved are often in flux and vary from institution to institution. But we also observe examples of more formalized arrangements emerging (more on this below). Looking ahead, we expect that library engagement in these cross-campus partnerships will need to be accompanied by: 

  1. New operational structures that formalize and facilitate library engagement with other campus units to support the university research enterprise. 
  2. Clear articulations of the library value proposition as it is manifested within the context of these new operational structures. 

The emergence of these new operational structures and value propositions is the foundation of what we call the Library Beyond the Library. Research libraries are engaging in new operational structures that extend beyond the confines of library hierarchies. Through these new structures, libraries are projecting their skills, expertise, services, and roles beyond the library into the broader campus environment, in partnership with other parts of the institution. As libraries support institutional priorities through these new channels, they will need to find ways to communicate an increasingly complex value proposition to campus stakeholders who may be unfamiliar with the library’s new roles and responsibilities. 

The Library Beyond the Library conceptual model is closely aligned with our previous OCLC research on social interoperability. We define social interoperability as the creation and maintenance of working relationships across individuals and organizational units that promote collaboration, communication, and mutual understanding. In many ways, social interoperability is about strengthening the “people skills” needed to support robust cross-unit partnerships that increasingly involve the library. Our work on this topic highlighted the need for improved social interoperability between the library and other campus units in the context of deploying and sustaining research support services.  

But ad hoc cross-campus partnerships are maturing into new operational structures. In this sense, the Library Beyond the Library is an amplification of social interoperability, moving beyond personal relationships to more formal connections that can outlast the tenure of specific individuals, and moving beyond partnerships built on temporary, project-focused goals to more permanent arrangements that become part of the institution’s operational structure. 

The Library Beyond the Library is not about changes in internal library organizational structures. These have been evolving, too (see, for example, Ithaka S+R’s report on library organizational structures, which provides strong evidence of the expansion of library capacities and positions in research support). But there seems to be less recognition and documentation of evolving operational structures where library services and expertise extend beyond the library and across the campus enterprise, in collaboration with non-library units. Many research libraries will find these structures increasingly germane to carrying out their mission in a landscape of new roles, responsibilities, and institutional priorities.  

Navigating these changes effectively is an important strategic and risk management consideration for libraries: failure to do so may result in diminished resources, impact, and influence, with a value proposition that becomes increasingly opaque to the rest of the institution. In light of this, extending the library beyond the library is something that we not only observe, but also advise as a strategy for ensuring ongoing library visibility and impact.  

Although it is still early days, there are some examples where new operational structures involving the library and other campus units have emerged: 

  • The University of Waterloo Library has invested in a Bibliometrics and Research Impact (BRI) Librarian who not only monitors institutional performance and provides analysis for institutional leaders, but also serves as the leader of a campus-wide community of practice around research analytics. Through this leadership role, the BRI librarian provides consultation and expert guidance to other campus units using research analytics tools, leveraging a new operational structure that engages other parts of the institution and extends library expertise and influence. 
  • Saskia Scheltjens, head of the Research Services department at the Rijksmuseum and chief librarian of the Rijksmuseum Research Library, joined that institution in 2016 to establish a new research services unit and combine several existing departments. The resulting research services unit is built around the research library, where digitized collections, digital scholarship, digital knowledge production and sharing, as well as digital learning and communications, act in unison with a world-famous physical collection. Saskia has described how “the library needed to be more than a library,” and it now sits at the center of a new “fundamental hybrid reality,” where the library extends its services and expertise beyond the traditional library collection.  
  • At the University of Manchester, the library is extending its role and leadership for research support with the establishment of a new Office of Open Research (OOR). This new unit supports institutional strategic goals to create a more open and responsible research environment, and the OOR website provides a single point of contact for researchers to connect with services provided not only by the library but by other units on campus. The library is positioned at the center—and as a leader—of campus open research activities. While Manchester seems to be the first UK institution with this type of Open Research unit, other institutions are moving in a similar direction: for example, Sheffield University has also recently been recruiting for a director to lead a new Office of Open Research and Scholarship. 
  • At Montana State University, a new Research Alliance, composed of both library and non-library research support units, is collocated in the library. This partnership includes non-library units in research development, research cyberinfrastructure, and undergraduate research, in addition to library scholarly communications and data management offerings. Each unit retains its place in the existing campus hierarchy, but the library is operationally positioned as the hub of research support for the institution. 
  • At the University of Illinois Urbana-Champaign, the library manages a research information management system (RIMS) that is financially supported by the Office of the Vice Chancellor for Research. By managing a registry of the institutional scholarly record, the library extends its expertise with bibliographic metadata to manage not just library collections, but also faculty profiles, patents, honors, research facilities and equipment, and more, combining data maintained by other campus stakeholders to create a knowledge graph that can inform enterprise-level strategic directions, make expertise discoverable, and support institutional reputation management.

These examples reflect the two key characteristics of the Library Beyond the Library conceptual model:  

  • The partnerships in which the library engages to provide research support services have been formalized into new operational structures that combine the capacities of library and non-library units. A novel operational configuration was created that transcends traditional administrative boundaries, and reflects the array of units around campus contributing toward provision of the services – including the library.  
  • The new units closely connect library value propositions with institutional priorities. For example, Manchester’s Office of Open Research emphasizes that “[t]he University supports the principles of Open Research and researchers are encouraged to apply these throughout the research lifecycle. While engagement with the principles is voluntary, the University expects researchers to act in accordance with funder mandates.” Similarly, Montana State’s Research Alliance makes clear that it brings together units around campus for the purpose of “working together to support and increase the excellence of the university’s research enterprise.” The library’s contribution to these units is surfaced in light of key institutional priorities. 

The Library Beyond the Library is the focus of a new research project at OCLC. Our goal is to describe and illustrate these key changes in library operational structures and value proposition through models and examples. We will also provide an assessment of future directions for libraries regarding these changes, and where possible, suggest gaps and opportunities for data, tools, and other types of operational infrastructure. 

This work builds upon past research at OCLC related to research support (especially research data management) where we have observed the trends underpinning the Library Beyond the Library. But we believe that the main ideas – cross-campus partnerships formalized into new operational structures, along with new articulations of the library’s value proposition – can be extended to other areas of strategic interest to libraries as well.  

To inform our work, we are convening an invitational discussion as part of the OCLC Research Library Partnership (RLP) Leadership Roundtable on Research Support during the week of 17 June, where RLP affiliates will discuss how their libraries are collaborating with other campus stakeholders to provide research support services. Participants have been asked to consider:   

  • How are the library’s research support services evolving in response to university priorities? 
  • How is your library partnering with other campus stakeholders to achieve institutional and library goals in the research support space? 
  • Have cross-campus partnerships in research support led to, or will they lead to, new operational structures? 

RLP Leadership Roundtables provide an opportunity for partner institutions to share information and benchmark services and goals while providing OCLC Research with information to synthesize and share with the broader library community. Participants must be nominated by the RLP institutional partner representative. The RLP Leadership Roundtable on Research Support first convened in March, to discuss current practices and challenges in the provision of bibliometric and research impact services. This gathering was attended by 51 individuals from 33 RLP member institutions in four countries, and highlights from the discussion were synthesized in a recent post.  

We encourage participation from all RLP partner institutions in this upcoming discussion, which will help us refine and expand the ideas in this post as we continue to explore what they mean for libraries and their futures. As with the previous roundtable, we will synthesize the conversation in a blog post for the broader library community. If you have questions about nominations or participation, please contact Rebecca Bryant. We hope to see you there!  

The post The library beyond the library appeared first on Hanging Together.

Running Song of the Day / Eric Hellman

(I'm blogging my journey to the 2024 New York Marathon. You can help me get there.)

Steve Jobs gave me back my music. Thanks Steve!

I got my first iPod a bit more than 20 years ago. It was a 3rd generation iPod, the first version with an all-touch control. I loved that I could play my Bruce, my Courtney, my Heads and my Alanis at an appropriate volume without bothering any of my classical-music-only family. Looking back on it, there was a period of about five years when I didn't regularly listen to music. I had stopped commuting to work by car, and though commuting was no fun, it had kept me in touch with my music. No wonder those 5 years were such a difficult period of my life!

Today, my running and my music are entwined. My latest (and last 😢) iPod already has some retro cred. It's a 6th generation iPod Nano. I listen to my music on 90% of my runs and 90% of my listening is on my runs. I use shuffle mode so that over the course of a year of running, I'll listen to 2/3 of my ~2500 song library. In 2023, I listened to 1,723 songs. That's a lot of running!

Yes, I keep track. I have a system to maintain a 150 song playlist for running. I periodically replace all the songs I've heard in the most recent 2 months (unless I've listened to the song fewer than 5 times - you need at least that many plays to become acquainted with a song!). This is one of the ways I channel certain of my quirkier programmerish tendencies so that I project as a relatively normal person. Or at least I try.
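That rotation rule reads like a small program, so here is a rough Python sketch of it. All the names and data shapes here are hypothetical illustrations, not the author's actual system:

```python
from datetime import datetime, timedelta

def rotate_playlist(playlist, library, play_counts, last_played, now=None):
    """Maintain a 150-song running playlist: drop songs heard in the
    last two months (unless they have fewer than 5 lifetime plays),
    then refill from library songs not already on the playlist."""
    now = now or datetime.now()
    cutoff = now - timedelta(days=60)
    # Keep songs not played recently, plus "unacquainted" songs (< 5 plays).
    kept = [s for s in playlist
            if last_played.get(s, datetime.min) < cutoff
            or play_counts.get(s, 0) < 5]
    # Refill from the rest of the library, capping at 150 songs.
    fresh = [s for s in library if s not in playlist]
    return (kept + fresh)[:150]
```

A real version would also randomize which fresh songs get pulled in, but the filter above captures the stated rule.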

Last November, I decided to do something new (for me). I made a running playlist! Carefully selected to have the right cadence and to inspire the run! It was ordered so that particular songs would play at appropriate points of the Ashenfelter 8K on Thanksgiving morning. It started with "Born to Run" and ended with either "Save it for Later", "Breathless" or "It's The End Of The World As We Know It", depending on my finishing time. It worked OK. I finished with Exene. I had never run with a playlist before.

  1. "Born to Run".
  2. "American Land". The first part of the race is uphill, so an immigrant song seemed appropriate.
  3. "Wake Up" - Arcade Fire. Can't get complacent.
  4. "Twist & Crawl" - The Beat. The up-tempo pushed me to the fastest part of the race.
  5. "Night". Up and over the hill. "you run sad and free until all you can see is the night".
  6. "Rock Lobster" - B-52s. The perfect beats per minute.
  7. "Shake It Off" - Taylor Swift. A bit of focused anger helps my energy level.
  8. "Roulette". Recommended by the Nuts, and yes it was good. Shouting a short lyric helps me run faster.
  9. "Workin' on the Highway". The 4th mile of 5 is the hardest, so "all day long I don't stop".
  10. "Your Sister Can't Twist" - Elton John. A short nasty hill.
  11. "Save it for Later" - The Beat. I could run all day to this, but "sooner or later your legs give way, you hit the ground."
  12. "Breathless" - X. If I had hit my goal of 45 minutes, I would have crossed the finish as this started, but I was very happy with 46:12 and a 9:14 pace.
  13. "It's The End Of The World As We Know It" - R.E.M. 48 minutes would not have been the end of the world, but I'd feel fine.

Last year, I started to extract a line from the music I had listened to during my run to use as the Strava title for the run. Through September 3, I would choose a line from a Springsteen song (he had to take a health timeout after that). For my New Year's resolution, I promised to credit the song and the artist in my run descriptions as well.

I find now that with many songs, they remind me of the place where I was running when I listened to them. And running in certain places now reminds me of particular songs. I'm training the neural network in my head. I prefer to think of it as creating a web of connections, invisible strings, you might say, that enrich my experience of life. In other words, I'm creating art. And if you follow my Strava, the connections you make to my runs and my songs become part of this little collective art project. Thanks!

Reminder: I'm earning my way into the NYC Marathon by raising money for Amref. 

This series of posts:

The Great MEV Heist / David Rosenthal

The Department of Justice indicted two brothers for exploiting mechanisms supporting Ethereum's "Maximal Extractable Value" (MEV). Ashley Belanger's MIT students stole $25M in seconds by exploiting ETH blockchain bug, DOJ says explains:
Anton, 24, and James Peraire-Bueno, 28, were arrested Tuesday, charged with conspiracy to commit wire fraud, wire fraud, and conspiracy to commit money laundering. Each brother faces "a maximum penalty of 20 years in prison for each count," the DOJ said.

The alleged scheme was launched in December 2022 by the brothers, who studied at MIT, after months of planning, the indictment said. The pair seemingly relied on their "specialized skills" and expertise in crypto trading to fraudulently gain access to "pending private transactions" on the blockchain, then "used that access to alter certain transactions and obtain their victims’ cryptocurrency," the DOJ said.
Below the fold I look into the details of the exploit as alleged in the indictment, and what it suggests about the evolution of Ethereum.


Let's start with some history. The key issue with MEV is that the architecture of decentralized cryptocurrencies enables a form of front-running, which Wikipedia defines thus:
Front running, also known as tailgating, is the prohibited practice of entering into an equity (stock) trade, option, futures contract, derivative, or security-based swap to capitalize on advance, nonpublic knowledge of a large ("block") pending transaction that will influence the price of the underlying security. ... A front running firm either buys for its own account before filling customer buy orders that drive up the price, or sells for its own account before filling customer sell orders that drive down the price. Front running is prohibited since the front-runner profits from nonpublic information, at the expense of its own customers, the block trade, or the public market.
Note that the reason it is illegal in these markets is that, at the time the front-runner enters their order, the customer's order is known only to them. It is thus "material non-public information". Arguably, high-frequency traders front-run by placing their computers so close to the market's computers that the information about orders on which they trade has not in practice had time to "become public".

I wrote about front-running in cryptocurrencies, describing how it was different, in 2020's The Order Flow:
In order to be truly decentralized, each miner must choose for itself which transactions to include in the next block. So there has to be a pool of pending transactions visible to all miners, and thus to the public. It is called the mempool. How do miners choose transactions to include? Each transaction in the pool contains a fee, payable to the miner who includes it. Miners are coin-operated, they choose the transactions with the highest fees. The mempool concept is essential to the goal of a decentralized, trustless cryptocurrency.
The pool of pending transactions is public, thus front-running is arguably legal and anyone can do it by offering a larger fee. Ethereum's block time is 12 seconds, plenty of time for bots to find suitable transactions in the mempool. It normally contains a lot of pending transactions. Ethereum is currently processing about 1.12M transactions/day (46.7K/hr) and there are around 166K pending transactions, or about 3.6 hours worth. Bitcoin is processing about 700K transactions/day and there are normally around 100K transactions in the mempool, or 3.5 hours worth.
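The fee-priority selection described above can be modeled in a few lines. This is a toy sketch, not actual miner or validator client code:

```python
import heapq

def build_block(mempool, max_txs):
    """Toy model of fee-priority block building: take the pending
    transactions offering the highest fees. `mempool` is a list of
    (tx_id, fee) pairs."""
    # heapq is a min-heap, so negate fees to get a max-heap by fee.
    heap = [(-fee, tx_id) for tx_id, fee in mempool]
    heapq.heapify(heap)
    block = []
    while heap and len(block) < max_txs:
        neg_fee, tx_id = heapq.heappop(heap)
        block.append((tx_id, -neg_fee))
    return block
```

A front-runner exploits exactly this rule: a bot that copies a pending transaction and attaches a slightly higher fee gets its copy ordered ahead of the original in the next block.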

Arguably, this is analogous to high-frequency trading, not front-running by brokers. In The Order Flow I recount how the prevalence of high-frequency trading led institutions to set up dark pools:
When conventional “lit” markets became overrun with HFT bots, investment banks offered large investors “dark pools” where they could trade with each other without the risk of being front-run by algos. But Barclays allowed HFT bots into its dark pool, where they happily front-run unsuspecting investors who thought they were safe. Eventually Barclays was caught and forced to drain its dark pool. In 2016, it was fined $70 million for fraud. It was not the only large bank that accepted money from large investors to protect them from HFT bots and money from HFT traders to allow them access to the investors it was supposed to be protecting.
The Order Flow was in large part sparked by two accounts of attempts to avoid being front-run:
  • Ethereum is a Dark Forest by Dan Robinson and Georgios Konstantopoulos:
    In the Ethereum mempool, these apex predators take the form of “arbitrage bots.” Arbitrage bots monitor pending transactions and attempt to exploit profitable opportunities created by them. No white hat knows more about these bots than Phil Daian, the smart contract researcher who, along with his colleagues, wrote the Flash Boys 2.0 paper and coined the term “miner extractable value” (MEV).

    Phil once told me about a cosmic horror that he called a “generalized frontrunner.” Arbitrage bots typically look for specific types of transactions in the mempool (such a DEX trade or an oracle update) and try to frontrun them according to a predetermined algorithm. Generalized frontrunners look for any transaction that they could profitably frontrun by copying it and replacing addresses with their own.
    Their attempt to rescue about $12K failed because they didn't know a miner and thus couldn't avoid the dark forest in the mempool, where a front-runner bot found it.
  • And Escaping the Dark Forest, Samczsun's account of how:
    On September 15, 2020, a small group of people worked through the night to rescue over 9.6MM USD from a vulnerable smart contract.
    The key point of Samczsun's story is that, after the group spotted the vulnerability and built a transaction to rescue the funds, they could not put the rescue transaction in the mempool because it would have been front-run by a bot. They had to find a miner who would put the transaction in a block without it appearing in the mempool. In other words, their transaction needed a dark pool. And they had to trust the cooperative miner not to front-run it.

    This attempt succeeded because they did know a miner.
Reading both is essential to understand how adversarial the Ethereum environment is.

The 2019 paper that published the MEV concept was Flash Boys 2.0: Frontrunning, Transaction Reordering, and Consensus Instability in Decentralized Exchanges by Philip Daian et 7 al:
In this work, we explain that DEX [decentralized exchanges] design flaws threaten underlying blockchain security. We study a community of arbitrage bots that has arisen to exploit DEX flaws. We show that these bots exhibit many similar market-exploiting behaviors—frontrunning, aggressive latency optimization, etc.—common on Wall Street, as revealed in the popular Michael Lewis exposé Flash Boys. We explore the DEX design flaws that spawned arbitrage bots, measure and model these bots’ behavior, and illuminate systemic smart-contract ecosystem risks implied by our observations.
Daian and co-authors describe five pathologies: Pure revenue opportunities, Priority gas auctions (PGAs), Miner-extractable value (MEV), Fee-based forking attacks, and Time-bandit attacks. Their results find two surprises:
First, they identify a concrete difference between the consensus-layer security model required for blockchain protocols securing simple payments and those securing smart contracts. In a payment system such as Bitcoin, all independent transactions in a block can be seen as executing atomically, making ordering generally unprofitable to manipulate. Our work shows that analyses of Bitcoin miner economics fail to extend to smart contract systems like Ethereum, and may even require modification once second-layer smart contract systems that depend on Bitcoin miners go live.

Second, our analysis of PGA games underscores that protocol details (such as miner selection criteria, P2P network composition, and more) can directly impact application-layer security and the fairness properties that smart contracts offer users. Smart contract security is often studied purely at the application layer, abstracting away low-level details like miner selection and P2P relayers’ behavior in order to make analysis tractable ... Our work shows that serious blind spots result. Low-level protocol behaviors pose fundamental challenges to developing robust smart contracts that protect users against exploitation by profit-maximizing miners and P2P relayers that may game contracts to subsidize attacks
Because it promised profits, MEV became the topic of a lot of research. By 2022, in Miners' Extractable Value I was able to review 10 papers about it.

Then came Ethereum's transition to Proof-of-Stake. As usual, Matt Levine provides a lucid explanation of the basics:
How does the blockchain decide which transactions to record, and in what order? In Ethereum, the answer is: with money. People who want to do transactions on the Ethereum network pay fees to execute the transactions; there is a flat base fee, but people can also bid more — a “priority fee” or “tip” — to get their transactions executed quickly. Every 12 seconds, some computer on the Ethereum network is selected to record the transactions in a block. This computer used to be called a “miner,” but in current proof-of-stake Ethereum blocks are recorded by computers called “validators.” Each block is compiled by one validator, selected more or less at random, called a “proposer”; the other validators vote to accept the block. The validators share the transaction fees, with the block proposer getting more than the other validators.

The block proposer will naturally prioritize the transactions that pay more fees, because then it will get more money. And, again, the validators are all computers; they will be programmed to select the transactions that pay them the most money. And in fact there is a division of labor in modern Ethereum, where a computer called a “block builder” puts together a list of transactions that will pay the most money to the validators, and then the block proposer proposes a block with that list so it can get paid.
Levine then gets into the details:
I am giving a simplistic and somewhat old-fashioned description of MEV, and modern Ethereum has a whole, like, institutional structure around it. There are private mempools, where you can hide transactions from bots. There is Flashbots, “a research and development organization formed to mitigate the negative externalities posed by Maximal Extractable Value (MEV) to stateful blockchains, starting with Ethereum,” which has things like MEV-Boost, which creates “a competitive block-building market” where validators can “maximize their staking reward by selling their blockspace to an open market,” and MEV-Share, “an open-source protocol for users, wallets, and applications to internalize the MEV that their transactions create,” letting them “selectively share data about their transactions with searchers who bid to include the transactions in bundles” and get paid.
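The fee mechanics Levine describes (a flat base fee plus a priority "tip") can be sketched with a simplified EIP-1559-style accounting. This is an illustration under simplifying assumptions; real transactions carry separate max-fee and max-priority-fee fields:

```python
def proposer_revenue(txs, base_fee):
    """Simplified EIP-1559-style split: the base-fee portion of each
    transaction's fee is burned; only the priority fee ("tip") above
    the base fee goes to the block proposer. `txs` is a list of
    (gas_used, fee_per_gas) pairs."""
    burned = sum(gas * base_fee for gas, fee in txs)
    tips = sum(gas * max(fee - base_fee, 0) for gas, fee in txs)
    return burned, tips
```

Under this split, proposers (and the builders competing to supply them blocks) are paid only from tips, which is why block builders order transactions to maximize tip revenue.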

What Is Alleged?

We have two explanations of what the brothers are alleged to have done, one from the DoJ's indictment and one from Flashbots, whose MEV-Boost software was exploited.

Dept. of Justice

The DoJ's indictment explains MEV-Boost:
  1. “MEV-Boost” is an open-source software designed to optimize the block-building process for Ethereum validators by establishing protocols for how transactions are organized into blocks. Approximately 90% of Ethereum validators use MEV-Boost.
  2. Using MEV-Boost, Ethereum validators outsource the block-building process to a network of “searchers,” “builders,” and “relays.” These participants operate pursuant to privacy and commitment protocols designed to ensure that each network participant—the searcher, the builder, and the validator—interacts in an ordered manner that maximizes value and network efficiency.
  3. A searcher is effectively a trader who scans the public mempool for profitable arbitrage opportunities using automated bots (“MEV Bots”). After identifying a profitable opportunity (that would, for example, increase the price of a given cryptocurrency), the searcher sends the builder a proposed “bundle” of transactions. The bundle typically consists of the following transactions in a precise order: (a) the searcher’s “frontrun” transaction, in which the searcher purchases some amount of cryptocurrency whose value the searcher expects to increase; (b) the pending transaction in the mempool that the MEV Bot identified would increase the price of that cryptocurrency; and (c) the searcher’s sell transaction, in which the searcher sells the cryptocurrency at a higher price than what the searcher initially paid in order to extract a trading profit. A builder receives bundles from various searchers and compiles them into a proposed block that maximizes MEV for the validator. The builder then sends the proposed block to a “relay.” A relay receives the proposed block from the builder and initially only submits the “blockheader” to the validator, which contains information about, among other things, the payment the validator will receive for validating the proposed block as structured by the builder. It is only after the validator makes this commitment through a digital signature that the relay releases the full content of the proposed block (i.e., the complete ordered transaction list) to the validator.
  4. In this process, a relay acts in a manner similar to an escrow account, which temporarily maintains the otherwise private transaction data of the proposed block until the validator commits to publishing the block to the blockchain exactly as ordered. The relay will not release the transactions within the proposed block to the validator until the validator has confirmed through a digital signature that it will publish the proposed block as structured by the builder to the blockchain. Until the transactions within the proposed block are released to the validator, they remain private and are not publicly visible.
Note the importance of the relay maintaining the privacy of the transactions in the proposed block.
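The three-transaction "sandwich" bundle the indictment describes can be illustrated schematically. The types and names below are hypothetical, purely for showing the ordering:

```python
from dataclasses import dataclass

@dataclass
class Tx:
    sender: str
    action: str   # "buy" or "sell"
    amount: float

def sandwich_bundle(victim_tx: Tx, searcher: str = "searcher") -> list:
    """Shape of the bundle described in the indictment: a frontrun buy,
    the victim's pending transaction, then a backrun sell. The ordering
    is the whole point - the searcher profits from the price move the
    victim's own trade causes."""
    frontrun = Tx(searcher, "buy", victim_tx.amount)
    backrun = Tx(searcher, "sell", victim_tx.amount)
    return [frontrun, victim_tx, backrun]
```

The alleged exploit worked by getting the relay to reveal such bundles early, then re-ordering them so the searchers' carefully placed trades executed against the brothers instead.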

The indictment summarizes how the brothers are alleged to have stolen $25M:
  1. ANTON PERAIRE-BUENO and JAMES PERAIRE-BUENO took the following steps, among others, to plan and execute the Exploit: (a) establishing a series of Ethereum validators in a manner that concealed their identities through the use of shell companies, intermediary cryptocurrency addresses, foreign exchanges, and a privacy layer network; (b) deploying a series of test transactions or “bait transactions” designed to identify particular variables most likely to attract MEV Bots that would become the victims of the Exploit (collectively the “Victim Traders”); (c) identifying and exploiting a vulnerability in the MEV-Boost relay code that caused the relay to prematurely release the full content of a proposed block; (d) re-ordering the proposed block to the defendants’ advantage; and (e) publishing the re-ordered block to the Ethereum blockchain, which resulted in the theft of approximately $25 million in cryptocurrency from the Victim Traders.
The indictment adds:
  1. Tampering with these established MEV-Boost protocols, which are relied upon by the vast majority of Ethereum users, threatens the stability and integrity of the Ethereum blockchain for all network participants.
This statement has attracted attention. Why should the DoJ care about "the stability and integrity of the Ethereum blockchain"? Note that the brothers are not charged with this, the indictment has three counts:
  1. Wire fraud, Title 18, United States Code, Section 1349.
  2. Wire fraud, Title 18, United States Code, Sections 1343 and 2.
  3. Conspiracy to Commit Money Laundering, Title 18, United States Code, Section 1956(a)(1)(B)(i).
The steps in paragraphs 11-14 are charged as wire fraud. The indictment then details the steps the brothers are alleged to have taken to launder the loot, leading to the money laundering charge.


Flashbots' explanation starts by explaining the role of a relay:
mev-boost works through a commit and reveal scheme where proposers commit to blocks created by builders without seeing their contents, by signing block headers. Only after a block header is signed are the block body and corresponding transactions revealed. A trusted third party called a relay facilitates this process. mev-boost is designed to allow block builders to send blocks that contain valuable MEV to validators without having to trust them. Removing the need for builders to trust validators ensures that every validator has equal access to MEV regardless of their size and is critical for ensuring the validator set of Ethereum remains decentralized.
Notice the traditional cryptocurrency gaslighting about "trustlessness" and "decentralization" in that paragraph:
  • It is true that by introducing a relay they have eliminated the need to trust the validators, but they have done so by introducing "a trusted third party called a relay". The exploit worked because the third party violated its trust. They would likely argue that, unlike the validators, the relay lacks financial incentives to cheat. But a malign relay could presumably also play the role of the malign proposer in the exploit.
  • Saying "the validator set of Ethereum remains decentralized" implies that it is decentralized. It is certainly good that the switch to Proof-of-Stake has increased Ethereum's Nakamoto coefficient from 2-3 to 5-6, as I pointed out last month in "Sufficiently Decentralized":
    A year ago the top 5 staking pools controlled 58.4%, now they control 44.7% of the stakes. But it is still true that block production is heavily centralized, with one producer claiming 57.9% of the rewards.
    But a Nakamoto coefficient of 6 isn't very decentralized. Further, this misses the point revealed by the brothers' exploit. With about 55% of execution clients running Geth and around 90% of validators trusting MEV-Boost's relaying, just to take two examples, the software stack is extremely vulnerable to bugs and supply chain attacks.
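The Nakamoto coefficient is the smallest number of entities whose combined share of the system is enough to compromise it. A minimal sketch of the computation, using made-up stake shares (not real Ethereum data) and the one-third threshold usual for Proof-of-Stake systems:

```python
# Smallest number of entities whose combined share exceeds the
# compromise threshold; returns None if no coalition can reach it.
def nakamoto_coefficient(shares, threshold=1/3):
    total = 0.0
    for count, share in enumerate(sorted(shares, reverse=True), start=1):
        total += share
        if total > threshold:
            return count
    return None

# Hypothetical staking-pool shares, for illustration only.
pools = [0.18, 0.10, 0.07, 0.05, 0.04]
print(nakamoto_coefficient(pools))  # the top 3 pools exceed 1/3, so 3
```

The more stake the largest pools control, the smaller the coefficient, which is why the top-5 share quoted above matters.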
Flashbots then explain the bug the brothers exploited:
The attack on April 3rd, 2023 was possible because the exploited relay revealed block bodies to the proposer so long as the proposer correctly signed a block header. However, the relay did not check if the block header that was signed was valid. In the case that the block header was signed but invalid, the relay would attempt to publish the block to the beacon chain, where beacon nodes would reject it. Crucially, regardless of whether the block was rejected by beacon nodes or not, the relay would still reveal the body to the proposer.

Having access to the block body allowed the malicious proposer to extract transactions from the stolen block and use them in their own block where it could exploit those transactions. In particular, the malicious proposer constructed their own block that broke the sandwich bots’ sandwiches up and effectively stole their money.
Then they explain the mitigation:
Usually, proposers publishing a modified block would not only equivocate but their new block would have to race the relay block - which has a head start - to acquire attestations for the fork choice rule. However, in this case, the relay was not able to publish a block because the proposer returned an invalid block header. Therefore, the malicious proposer’s new block was uncontested and they won the race automatically. This has been addressed by requiring the relay to successfully publish a block, thereby not sharing invalid blocks with proposers. The mitigations section covers this and future looking details at more length.
By "equivocate" they mean proposing more than one block in a time slot. Validators' responsibilities are:
The validator is expected to maintain sufficient hardware and connectivity to participate in block validation and proposal. In return, the validator is paid in ETH (their staked balance increases). On the other hand, participating as a validator also opens new avenues for users to attack the network for personal gain or sabotage. To prevent this, validators miss out on ETH rewards if they fail to participate when called upon, and their existing stake can be destroyed if they behave dishonestly. Two primary behaviors can be considered dishonest: proposing multiple blocks in a single slot (equivocating) and submitting contradictory attestations.
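Reduced to its essentials, the relay bug Flashbots describe and its mitigation can be sketched as two boolean checks. These functions are illustrative pseudologic, not actual mev-boost-relay code:

```python
def body_revealed_vulnerable(signature_ok, header_valid):
    # The exploited relay checked only that the header was signed.
    # It then tried to publish; beacon nodes rejected an invalid block,
    # but the body had already been revealed to the proposer.
    return signature_ok

def body_revealed_mitigated(signature_ok, header_valid):
    # Post-fix: the relay must successfully publish the block before
    # revealing the body, so a signed-but-invalid header unlocks nothing.
    return signature_ok and header_valid

# The brothers' exploit path: a correctly signed but invalid header.
print(body_revealed_vulnerable(True, False))  # True: body leaked
print(body_revealed_mitigated(True, False))   # False: body stays private
```

The signed-but-invalid header also meant the relay could not publish a competing block, which is why the malicious proposer's block won the race uncontested.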


Matt Levine covered this case in Crypto Brothers Front-Ran the Front-Runners by focusing on front-running:
There is a sort of cool purity to this. In stock markets, some people are faster than others, and can make money by trading ahead of a big order, and people get mad about this and think it is unfair and propose solutions. And when money changes hands for speed advantages — “payment for order flow,” “colocation” — people complain about corruption. In crypto it’s like “let’s create an efficient market in trading ahead of big orders.” I once wrote: “Rather than solve this concern about traditional markets, crypto made it explicit.” That feels almost like a general philosophy of crypto: Take the problems of traditional finance and make them worse, sure, but more transparent and visible and explicit and subject to unbridled free markets.
And then casting the brothers' actions as front-running:
Ethereum and its decentralized exchanges have a market structure that is like “bots can look at your transactions and front-run them if that’s profitable.” And these guys, allegedly, front-ran the front-runners; they turned the market structure around so that they could get an early look at the front-running bots’ front-running transactions and front-run them instead. By hacking, sure, sure, it’s bad. But it leaves the Justice Department in the odd position of saying that the integrity of crypto front-running is important and must be defended.
I think Levine is wrong here. Just as with high-frequency trading, "crypto front-running" is legal because it uses public information. The brothers were not indicted for front-running. What is illegal, and what the DoJ is alleging, is trading on "material non-public information", which they obtained by wire fraud (a fraudulent signature). The indictment says:
this False Signature was designed to, and did, trick the Relay to prematurely release the full content of the proposed block to the defendants, including the private transaction information.
The DoJ is not defending the "integrity of crypto front-running", it is prosecuting activity that is illegal in all markets.

The next day Levine made the first of two clarifications:
First, though I described the exploit as “front-running the front-runners,” I do want to be clear that it was not just that. This is not a pure case of (1) submitting spoofy orders to bait front-running bots, (2) having them take the bait and (3) finding some trade to make them lose money. (There are prior examples of that, using oddly structured tokens to make the front-runners lose money.) Here, though, the brothers are accused of something closer to hacking, exploiting a weakness in software code to be able to see (and reorder) a series of transactions that was supposed to be kept confidential from them. That is worse; it’s sort of like the difference between (1) putting in spoof orders on the stock exchange to try to trick a high-frequency trading firm and (2) hacking into the stock exchange’s computer system to reverse the HFT firm’s trades. Even if you think that the front-running bots are bad, you might — as the Justice Department does — object to this approach to punishing them.
Exactly. Levine's second clarification was:
Second, I said that “they exploited a bug in Ethereum” to do this, but that’s not quite right. They exploited a bug in Flashbots’ MEV-Boost, open-source block-building software that “approximately 90% of Ethereum validators use” but that is not part of the core Ethereum system itself. (Here is Flashbots’ explanation.) They exploited a bug in how blocks are normally built and proposed on Ethereum. From the names “Flashbots” and “MEV-Boost,” though, you might get some sense of why the case is controversial. The way that blocks are normally built and proposed on Ethereum involves “maximal extractable value” (MEV), where arbitrage traders bid to pay validators for priority to make the most profitable trades. These brothers hacked that system, but not everyone likes that system, because it involves predatory traders front-running more naive traders.

This is also important because, as one reader commented: “A crucial distinguishing factor here is that James and Anton did not re-order committed transactions; they instead picked an ordering of pending transactions that was favorable to them. Under this lens, the integrity of the blockchain is not compromised; the network explicitly ‘allows’ validators to pick whatever arbitrary ordering of transactions they like; it's just that generally it’s economically favorable for validators to prioritize transactions which pay them the most first.”
Part of Satoshi Nakamoto's genius in designing Bitcoin was that he observed KISS, the software mantra "Keep It Simple, Stupid". The Bitcoin blockchain does only one thing: maintain a ledger of transactions. So the Bitcoin ecosystem has evolved very slowly, and has been remarkably free of vulnerabilities over the last decade and a half. Ethereum, on the other hand, is a Turing-complete environment that does whatever its users want. So in less than a decade the Ethereum ecosystem has evolved much faster, accreting complexity and thus vulnerabilities.

Look at Molly White's Web3 is Going Just Great. It is full of exploits of "smart contracts" such as "decentralized exchanges" and "bridges". Try searching for "bitcoin"; you only find it in the context of the amounts raided. It is precisely the fecundity of Ethereum's programmability that leads to an ecosystem full of buggy code vulnerable to exploits such as the MEV-Boost one.

Daniel Kuhn's What the DOJ’s First MEV Lawsuit Means for Ethereum also discusses the details of the case:
“They used a flaw in MEV boost to push invalid signatures to preview bundles. That gives an unfair advantage via an exploit,” former employee of the Ethereum Foundation and Flashbots Hudson Jameson told CoinDesk in an interview. Jameson added that the Peraire-Bueno brothers were also running their own validator while extracting MEV, which violates something of a gentleman’s agreement in MEV circles.

“No one else in the MEV ecosystem was doing both of those things at once that we know of,” he added. “They did more than just play by both the codified and pinky promise rules of MEV extraction.”
The "gentleman's agreement" is important, because what the brothers were doing creates a conflict of interest, the kind that the SEC frowns upon.

Kuhn quotes Consensys General Counsel Bill Hughes:
“All of the defendants' preparation for the attack and their completely ham-fisted attempts to cover their tracks afterwards, including extensive incriminating google searches, just helps the government prove they intended to steal. All that evidence will look very bad to a jury. I suspect they plead guilty at some point,”
He also discusses a different reaction in the cryptosphere:
MEV, which itself is controversial, can be a highly lucrative game dominated by automated bots that often comes at blockchain users’ expense, which is partially why so many in the crypto community have rushed to denounce the DOJ’s complaint.
Still, others remain convinced that exploiting MEV bots designed to reorder transactions is fair game. “It's a little hard to sympathize with MEV bots and block builders getting f*cked over by block proposers, in the exact same way they are f*cking over end users,” the anonymous researcher said.
Kuhn quotes Hudson Jameson:
Jameson, for his part, said the MEV is something the Ethereum community should work to minimize on Ethereum, but that it’s a difficult problem to solve. For now, the process is “inevitable.”

“Until it can be eliminated, let's study it. Let's illuminate it. Let's minimize it. And since it does exist, let's make it as open as possible for anyone to participate with the same rules,” he said.
Jameson is wrong in suggesting that MEV could be eliminated. It is a consequence of the goal of decentralizing the system. Even the mechanism in place for "anyone to participate with the same rules" requires a trusted third party.

Using a Proposed Library Guide Assessment Standards Rubric and a Peer Review Process to Pedagogically Improve Library Guides: A Case Study / In the Library, With the Lead Pipe

In Brief

Library guides can help librarians provide information to their patrons regarding library resources, services, and tools. Despite their perceived usefulness, there is little discussion of designing library guides pedagogically by following a set of assessment standards for a quality-checked review. Instructional designers regularly use vetted assessment standards and a peer review process for building high-quality courses, yet librarians typically do not when designing library guides. This article explores using a set of standards remixed from SUNY’s Online Course Quality Review Rubric (OSCQR) and a peer review process. The authors used a case study approach to test the effectiveness of building library guides with the proposed standards by tasking college students to assess two Fake News guides (one revised to meet the proposed standards). Results indicated most students preferred the revised library guide to the original guide for personal use. The majority valued the revised guide for integrating into a learning management system and perceived it to be more beneficial for professors to teach from. Future studies should replicate this study and include additional perspectives from faculty on how they perceive the pedagogical value of a library guide designed following the proposed rubric.

A smiling librarian assists a student who is sitting at a computer located within the library.

Image: “Helpful”. Digital image created with Midjourney AI. By Trina McCowan CC BY-NC-SA 4.0


Library guides or LibGuides are a proprietary publishing tool for libraries and museums created by the company Springshare; librarians can use LibGuides to publish on a variety of topics centered around research (Dotson, 2021; Springshare, n.d.). For consistency, the authors will use the term library guides moving forward. Librarians can use Springshare’s tool to publish web pages to educate users on library subjects, topics, procedures, or processes (Coombs, 2015). Additionally, librarians can work with teaching faculty to create course guides that compile resources for specific classes (Berić-Stojšić & Dubicki, 2016; Clever, 2020). According to Springshare (n.d.), library guides are widely used by academic, museum, school, and public libraries; approximately 130,000 libraries worldwide use this tool. The library guides’ popularity and continued use may stem from their ease of use, as they eliminate the need to know a coding language to develop online content (Bergstrom-Lynch, 2019).

Baker (2014) described library guides as the “evolutionary descendants of library pathfinders” (p. 108). The first pathfinders were paper brochures that provided suggested resources for advanced research. Often, librarians created these tools for the advanced practitioner, as the patrons granted access to the library were researchers and seasoned scholars. Because the end users were already experts, there was little need for librarians to provide instruction in using the resources (Emanuel, 2013). Later, programs such as MIT’s 1970s Project Intrex developed pathfinders that presented students with library resources in their fields of interest (Conrad & Stevens, 2019). As technology advanced, librarians created and curated pathfinders for online access (Emanuel, 2013).

Today, due to the modernization of pathfinders as library guides and their ease of discoverability, students and unaffiliated online users often find these guides without the assistance of a librarian (Emanuel, 2013). Search engines such as Google can extend a library guide’s reach far beyond a single institutional website, drawing the attention of information experts and novice internet users alike (Brewer et al., 2017; Emanuel, 2013; Lauseng et al., 2021). This expanded access means a librarian will not always be present to help interpret and explain the library guide’s learning objectives. Stone et al. (2018) state that library guides should be built using pedagogical principles “where the guide walks the student through the research process” (p. 280). Bergstrom-Lynch (2019) argues that there has been an abundant focus on user-centered library design studies but little focus on learning-centered design. Bergstrom-Lynch (2019) advocates for more attention directed to learning-centered design principles as library guides are integrated into Learning Management Systems (LMS) such as Canvas and Blackboard (Berić-Stojšić & Dubicki, 2016; Bielat et al., 2013; Lauseng et al., 2021) and can be presented as a learning module for the library (Emanuel, 2013; Mann et al., 2013). The use of library guides as online learning and teaching tools is not novel; however, their creation and evaluation using instructional design principles are a recent development (Bergstrom-Lynch, 2019). 

A core component of an instructional designer’s job is to ensure that online course development meets the institution’s standards for quality assurance (Halupa, 2019). Instructional designers can aid in writing appropriate course and learning objectives and in selecting learning activities and assessments that align with the module’s objectives. They can also provide feedback on designing a course that is student-friendly, being mindful of cognitive overload, course layout, font options, and color selection. Additionally, instructional designers are trained in designing learning content that meets accessibility standards (Halupa, 2019).

Instructional design teams and teaching faculty can choose from a variety of quality assurance rubrics to ensure that key elements for online learning are present in the online course environment. Examples of quality assurance tools include the Quality Matters (QM) Higher Education Rubric and SUNY’s Online Course Quality Review Rubric (OSCQR), a professional development course-refresh process with a rubric (Kathuria & Becker, 2021; OSCQR-SUNY, n.d.). QM is a not-for-profit subscription service that provides education on assessing online courses through the organization’s assessment rubric of general and specific standards (Unal & Unal, 2016). The assessment process is a “collegial, faculty-driven, research-based peer review process…” (Unal & Unal, 2016, p. 464). For a national review, QM suggests three QM-certified and trained reviewers conduct the quality review, including a content specialist and one external reviewer from outside the university (Pickens & Witte, 2015). Some universities, such as the University of North Florida, submit online courses for a QM certificate with High-Quality recognition or conduct an in-house review based on the standards, earning a High-Quality designation. For an in-house review at UNF, a subject matter expert, instructional designer, and trained faculty reviewer assess the course to provide feedback based on the standards (CIRT, “Online Course Design Quality Review”, n.d.; Hulen, 2022). Instructional designers at some institutions may use other pedagogical rubrics that are freely available and not proprietary. OSCQR is an openly licensed online course review rubric that allows use and/or adaptation, and it can be used as a professional development exercise when building and/or refreshing online courses (OSCQR-SUNY, n.d.).

Typically, library guides do not receive the kind of vetted, rigorous pedagogical peer review that online courses do. Because library guides are more accessible and are used as teaching tools, they should be crafted for a diverse audience and easy for first-time library guide users to understand and navigate (Bergstrom-Lynch, 2019; Smith et al., 2023). However, Conrad & Stevens (2019) state: “Inexperienced content creators can inadvertently develop guides that are difficult to use, lacking consistent templates and containing overwhelming amounts of information” (p. 49). Lee et al. (2021) reviewed library guides about the systematic review process. Although this topic is complex, Lee et al. (2021) noted a lack of instruction about the systematic review process in the guides. If instructional opportunities are missing from even the most complex topics, one may need to review all types of library guides with fresh eyes.

Moukhliss aims to describe a set of quality review standards, the Library Guide Assessment Standards (LGAS) rubric with annotations that she created based on the nature of library guides, and by remixing the SUNY-OSCQR rubric. Two trained reviewers are recommended to work with their peer review coordinator to individually engage in the review process and then convene to discuss the results. A standard will be marked Met when both of the reviewers mark it as Met, noting the evidence to support the Met designation. In order for a standard to be marked as Met, the library guide author should show evidence of 85% accuracy or higher per standard. To pass the quality-checked review to receive a quality-checked badge, the peer review team should note that 85% of the standards are marked as “Met.” If the review fails, the library guide author may continue to edit the guide or publish the guide without the quality-checked badge. Details regarding the peer review process are shared in the Library Guide Assessment Standards for Quality-Checked Review library guide. Select the Peer Review Training Materials tab for the training workbook and tutorial.
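As a purely arithmetical illustration of the pass rule just described: the quality-checked badge requires at least 85% of standards to be marked “Met.” The standard names and counts below are hypothetical:

```python
# A guide passes the quality-checked review when the fraction of
# standards marked "Met" reaches the threshold (85% in the LGAS process).
def passes_quality_review(marks, threshold=0.85):
    met = sum(1 for mark in marks.values() if mark == "Met")
    return met / len(marks) >= threshold

review = {f"Standard {n}": "Met" for n in range(1, 20)}  # 19 standards Met
review["Standard 20"] = "Not Met"                        # 1 standard Not Met
print(passes_quality_review(review))  # 19/20 = 95%, so True
```

A guide with, say, 16 of 20 standards Met (80%) would fall short of the badge and could be revised and resubmitted, or published without it.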

Situational Context

The University of North Florida (UNF) Thomas G. Carpenter Library serves an R2 university of approximately 16,500 students. The UNF Center for Instruction and Research Technology (CIRT) supports two online learning librarians. The online librarians’ roles are to provide online instruction services to UNF faculty. CIRT staff advocate for online teaching faculty to submit their online courses for a rigorous quality review process. Faculty can obtain a High-Quality designation for course design by working with an instructional designer and an appointed peer reviewer from UNF, or they may opt to aim for a High-Quality review after three years of course implementation by submitting for national review with Quality Matters (Hulen, 2022). Currently, Moukhliss serves as a peer reviewer for online High-Quality course reviews.

After several High-Quality course reviews, Moukhliss questioned why there are no vetted review standards for the various types of library guides, reviewed and completed by trained librarians, as there are for online courses. She thus borrowed from the SUNY Online Course Quality Review Rubric (OSCQR) to remix it as the Library Guide Assessment Standards Rubric with annotations.

Literature Review

The amount of peer-reviewed literature on library guide design is shockingly small considering how many library guides have been created. The current research focus has been on usability and user experience studies, although some researchers have begun to focus on instructional design principles. As Bergstrom-Lynch (2019) states, peer-reviewed literature addressing library guide design through the lens of instructional design principles is in its infancy. Researchers have primarily focused on collecting data on usage and usability (Conrad & Stevens, 2019; Oullette, 2011; Quintel, 2016). German (2017), an instructional design librarian, argues that when the library guide is created and maintained through a learner-centered point of view, librarians will see library guides as “e-learning tools” (p. 163). Lee et al. (2021) noted the value of integrating active learning activities into library guides. Stone et al. (2018) conducted a comparison study between two library guides, one as-is and the other re-designed with pedagogical insight. Stone et al. (2018) concluded that “a pedagogical guide design, organizing resources around the information literacy research process and explaining the ‘why’ and ‘how’ of the process, leads to better student learning than the pathfinder design” (p. 290). A library guide representative of a pathfinder design lists resources rather than explaining them. Lee and Lowe (2018) conducted a similar study and noted more user interaction with the pedagogically designed guide than with the guide not designed with pedagogical principles. Hence Stone et al. (2018) and Lee and Lowe (2018) reached similar findings.

Authors like German (2017) and Lee et al. (2021) have touched upon instructional design topics. Similarly, Adebonojo (2010) described aligning the content of a subject library guide to library sources shared in course syllabi, but does not expand to discuss any other instructional design principles. Bergstrom-Lynch (2019) thinks more comprehensively, advocating for the use of the ADDIE instructional design model (an acronym for Analysis, Design, Development, Implementation, and Evaluation) when building library guides. The analysis phase encourages the designer to note problems with current instruction. The design phase entails how the designer will rectify the learning gap identified in the analysis phase. The development phase entails adding instructional materials, activities, and assessments. The implementation phase involves introducing the materials to learners. The evaluation phase enables the designer to collect feedback and improve content based on suggestions. ADDIE is cyclical and iterative (Bergstrom-Lynch, 2019). Allen (2017) introduces librarians to instructional design theories in the context of building an online asynchronous information literacy course but does not tie these theories to building library guides.

While Bergstrom-Lynch (2019) focused on best practices for library guide design based on ADDIE, German et al. (2017) used service design thinking constructs to build effective instruction guides. The five core principles of service design thinking are “user-centered, co-creative, sequencing, evidencing, and holistic” (German et al., 2017, p. 163). Focusing on the user encourages the designer to think like a student and ask: What do I need to know to successfully master this content? The co-creative stage invites other stakeholders to add their perspectives and/or expertise to the guide. The sequencing component invites the librarian to think through the role of the librarian and library services before, during, and after instruction. German et al. (2017) advocate for information from each stage to be communicated in the library guide. Evidencing involves the librarian reviewing the library guide to ensure that the content aligns with the learning objective (German et al., 2017). Both authors advocate for instructional design methods but fall short of suggesting an assessment rubric for designing and peer-reviewing guides.

Smith et al. (2023) developed a library guide rubric for their library guide redesign project at the Kelvin Smith Library at Case Western Reserve University. This rubric focused heavily on accessibility standards using the Web Accessibility Evaluation Tool or WAVE. Although Smith et al. (2023) discuss a rubric, the rubric was crafted as an evaluation tool for the author of the guide rather than for a peer review process. 

Although Bergstrom-Lynch (2019), German et al. (2017), and Smith et al. (2023) are pioneering best practices for library guides, they take different approaches. For example, Bergstrom-Lynch (2019) presents best practices for cyclical re-evaluation of the guide based on instructional design principles, derived from their usability studies. The Smith et al. (2023) rubric emphasizes accessibility standards for ADA compliance—essential for course designers but only one component of a more comprehensive rubric. German et al. (2017) emphasize a user-centered design through the design thinking method. Moukhliss intends to add to the literature by suggesting a remix of a vetted tool that course developers use as a professional development exercise with faculty. This OSCQR-SUNY tool encompasses the varying perspectives of Bergstrom-Lynch (2019), Smith et al. (2023), and German et al. (2017).

Strengths & Weaknesses of the Library Guide

As with any tool, library guides have their strengths and weaknesses. On the positive side, there are indications that library guides can play a role in improving students’ grades, retention, and overall research skills (Brewer et al., 2017; May & Leighton, 2013; Wakeham et al., 2012). Additionally, library guides are easy to build and update (Baker, 2014; Conrad & Stevens, 2019). They can accommodate RSS feeds, videos, blogs, and chat (Baker, 2014), are accessible to the world, and cover a vast range of library research topics. According to Lauseng et al. (2021), library guides are discoverable through Google and can be integrated into online Learning Management Systems (LMS). These factors support the view that library guides hold educational value and should be reconsidered for use as Open Educational Resources (Lauseng et al., 2021).

However, there are no perfect educational tools. Library guide weaknesses include their underutilization, largely due to students not knowing what they are or how to find them (Bagshaw & Yorke-Barber, 2018; Conrad & Stevens, 2019; Ouellette, 2011). Additionally, library guides can be difficult for students to navigate, contain unnecessary content, and overuse library jargon (Sonsteby & DeJonghe, 2013). Conrad & Stevens (2019) described a usability study where the students were disoriented when using library guides and reported that they did not understand their purpose, function, or how to return to the library homepage. Lee et al. (2021) and Baker (2014) suggest that librarians tend to employ the “kitchen sink” (Baker, 2014, p. 110) approach to build library guides, thus overloading the guide with inapplicable content.

Critical Pedagogy and Library Guides

In his publication titled “Pedagogy of the Oppressed,” Paulo Freire introduced the theory of critical pedagogy and asserted that most educational models have the effect of reinforcing systems of societal injustice through the assumption that students are empty vessels who need to be filled with knowledge and skills curated by the intellectual elite (Kincheloe, 2012; Downey, 2016). Early in the 21st century, information professionals built upon the belief that “Critical pedagogy is, in essence, a project that positions education as a catalyst for social justice” (Tewell, 2015, p. 26) by developing “critical information literacy” to address what some saw as the Association of College and Research Libraries’ technically sound, but socially unaware “Information Literacy Competency Standards for Higher Education” (Cuevas-Cerveró et al., 2023). In subsequent years, numerous librarians and educators have written about the role of information literacy in dismantling systems of oppression, citing the need to promote “critical engagement with information sources” while recognizing that knowledge creation is a collaborative process in which everyone engages (Downey, 2016, p. 41).

The majority of scholarly output on library guides focuses on user-centered design rather than specifically advocating for critical pedagogical methods. Yet a few scholars, such as Lechtenberg & Gold (2022), emphasize how the lack of pedagogical training within LIS programs often results in information-centric library guides rather than learner-centric ones. Their presentation at LOEX 2022 reiterates the importance of user-centered design at every step of guide creation, including deciding whether a library guide is needed at all.

Additionally, the literature demonstrates that library guides are useful tools for delivering critical information literacy instruction and interventions. For instance, Hare and Evanson used a library guide to list open-access sources as part of their Information Privilege Outreach programming for undergraduate students approaching graduation (Hare & Evanson, 2018). Likewise, Buck and Valentino required students in their "OER and Social Justice" course to create a library guide designed to educate faculty about the benefits of open educational resources, partly due to students' familiarity with the design and functionality of similar research guides (Buck & Valentino, 2018). Because library guides have been used to communicate the principles of critical pedagogy, the evaluation of institutional library guides should consider how effectively critical pedagogy is incorporated into their design.

The Library Guide Assessment Standards (LGAS) Rubric 

For the remixed rubric, Moukhliss changed the term "course" in OSCQR's original verbiage to "library guide" and dropped some original standards based on the differences between the expectations for an online course (i.e., rubrics, syllabus, etc.) and a library guide. Likewise, several standards were added in response to the pros and cons of library guides found in the literature, and Moukhliss wrote annotations to add clarity to each standard for the peer review process. For example, Standard 2 in the remixed LGAS rubric prompts the reviewer to check whether the author defines the term library guide, since research has indicated that students do not know what library guides are or how to find them (Bagshaw & Yorke-Barber, 2018; Conrad & Stevens, 2019; Ouellette, 2011). Standard 7 suggests that the librarian provide links to the profiles of other librarian liaisons who may serve the audience using the library guide. Standard 9 prompts the reviewer to check that the library guide links to the university library's homepage, addressing Conrad & Stevens's (2019) finding that students confuse the library guide with the library homepage. These standards were added to ensure that users are provided with adequate information about the nature of library guides, who publishes them, and how to locate additional guides, addressing the confusion that Conrad & Stevens (2019) noted in their library guide usability study. These added standards may also be helpful for those who discover library guides through a Google search.

Moukhliss intends the additional standards to provide context about the library guide to novice users, thus addressing the issue of information privilege, or the assumption that everyone starts with the same background knowledge. Standard 22 was added to discourage adding unnecessary resources to the guide, which Baker (2014) and Conrad & Stevens (2019) cited as a common problem. Standard 27 encourages the use of Creative Commons attribution, as suggested by Lauseng et al. (2021), who found that their Evidence Based Medicine library guide was used not only by faculty, staff, and students at the University of Illinois Chicago but also by a wider audience. Recognizing its strong visibility and significant external usage, they considered it a potential candidate for an Open Educational Resource (OER). As library guides are often found without the help of a librarian, Standard 28 suggests that reviewers check that library guide authors provide steps for accessing the research tools and databases suggested in the library guide outside the context of the guide. Providing such information may help mitigate the disorientation and the difficulty navigating to the library homepage that Conrad & Stevens (2019) observed among students using a library guide.

Standard 30 was added so that students have a dedicated Get Help tab explaining the variety of ways the user can contact their library and/or librarians for additional assistance. Standard 31 was rewritten so that the user could check their understanding in a way appropriate for the guide (Lee et al., 2021), such as a low-stakes quiz or poll. Finally, Standard 32 encourages the user to provide feedback regarding the guide's usefulness, content, design, etc., with the understanding that learning objectives follow an iterative cycle and are not static. Student feedback can help the authoring librarian update and maintain the guide's relevancy to users and will give students the opportunity to become co-creators of the knowledge they consume.

UNF's LGAS Rubric for Quality-Checked Review library guide includes a Quality-Checked badge and a suggested maintenance checklist for monthly, twice-yearly, and yearly reviews, both available on the Maintenance Checklist/Test Your Knowledge tab. Moukhliss borrowed and remixed the checklist from the Vanderbilt University Libraries (current as of 8/21/2023). The Peer Review Training Materials tab includes a training workbook and a training video on the LGAS rubric, the annotations, and the peer review process. Moukhliss provides a Creative Commons license on the LGAS's Start Here page to encourage other institutions to reuse and/or remix the guide.

Methodology, Theoretical Model, and Framework

Moukhliss and McCowan used the qualitative case study methodology. Gephart (2004) stated, “Qualitative research is multimethod research that uses an interpretive, naturalistic approach to its subject matter. . . . Qualitative research emphasizes qualities of entities —the processes and meanings that occur naturally” (pp. 454-455). Moukhliss and McCowan selected the exploratory multi-case study so that they could assess multiple student user/learning perspectives when accessing, navigating, and digesting the two library guides. 

The theoretical model used for this study is the Plan-Do-Check-Act (PDCA) cycle, a quality improvement model that evolved with input from Walter Shewhart and W. Edwards Deming (Koehler & Pankowski, 1996). The cycle walks a team through four phases: Plan, Do, Check, and Act. The Plan phase allows time to think through problems, such as the lack of design standards for library guides. During the "Do" phase, Moukhliss selected and remixed the quality review tool SUNY OSCQR. Additionally, she selected a "kitchen sink" (Baker, 2014, p. 110) library guide and redesigned it with the proposed rubric. Moukhliss's aim was only to remove dead links and/or outdated information when restructuring the guide. The only items deemed outdated were the CRAAP test learning object and selected books from the curated e-book list. The CRAAP test was removed, and no substitution of similar materials was made; the list of selected books was updated in the revised guide to reflect current publications. As Moukhliss restructured the guide, she decided to use tabbed boxes to chunk and sequence information to satisfy Standards 11, 12, 13, and 15. You may view this restructuring by comparing the original Fake News guide and the revised Fake News guide. The "Do" phase also included recruiting participants to evaluate the two library guides: the original Fake News guide and Fake News Guide 2, revised to follow the suggested standards and peer review process. Moukhliss and McCowan submitted the library guide study proposal to the Institutional Review Board in November 2023, and the study was marked Exempt. In December 2023, Moukhliss recruited participants by emailing faculty, distributing flyers in the library, posting flyers on display boards, and adding a digital flyer to each library floor's lightboard display.
The librarians offered an incentive of 10-dollar Starbucks gift cards to the first 15 students who signed consent forms and successfully completed the 30-minute recorded Zoom session (or until saturation was reached).

Moukhliss interviewed one test pilot (P1) and ten students (P2-P11) for this study; she noted saturation after her seventh interview but continued to ten participants to increase certainty. Although some may view this as a small sample, it aligns with the peer-reviewed literature. Hennink & Kaiser (2019) discuss saturation in in-depth interviews and point to Guest et al.'s (2006) study, which, after reviewing data from 60 in-depth interviews, determined that saturation presented itself between Interviews 7-12, "at which point 88% of all themes were developed and 97% of important (high frequency) themes were identified" (Hennink & Kaiser, 2019, para. 5). The questionnaire framework for this study is centered on Bloom's Taxonomy, which provides action verbs that align with the hierarchical levels of learning: remember, understand, apply, analyze, evaluate, and create. McCowan incorporated various levels of Bloom's Taxonomy as she built the UX script used for this study. Moukhliss interchanged Fake News and Fake News 2 as Guide A and Guide B throughout the interview sessions. After each recorded Zoom session, Moukhliss reviewed the session, recorded the task completion times on the script, recorded the data to the scorecard, and updated the data in the qualitative software NVivo. Both script and scorecard are available on the Library Guide Study page. Moukhliss created a codebook with participant information, assigned code names (Participant 1, Participant 2, Participant 3, etc.), and stored the codebook in a password-protected file on her work computer to keep identifiable information secure. She removed all personal identifiers as she prepared to upload the study's data to a qualitative software system.
For coding, the authors chose the NVivo platform, a qualitative assessment tool that can organize data by type (correspondence, observation, and interviews), enable the researcher(s) to easily insert notes in each file, and develop codes to discover themes. Moukhliss coded the interviews based on the LGAS (i.e., Standard 1, 2, 3, etc.). Additional codes were added regarding navigation and content. Moukhliss & McCowan reviewed the codes for themes and preferences regarding library guide design.

The “Check” phase guided Moukhliss and McCowan in considering the implementation of the LGAS rubric and peer review process for library guides at UNF. During this phase, they reviewed participants’ qualitative responses to the Fake News library guide and the Fake News 2 library guide. Data from the “Check” phase will drive Moukhliss & McCowan to make recommendations in the “Act” phase (Koehler & Pankowski, 1996), which will be discussed in the Conclusion.


Moukhliss worked with one test pilot and interviewed ten students for this study. The ten students' majors included Nursing, Computer Science, Communications, Public Administration, Electrical Engineering, Information Technology, Health Sciences, Philosophy, and Criminal Justice. Participants included two first-year students, two sophomores, three juniors, two seniors, and one graduate student. Eight participants used their desktops, whereas two completed the study on their phones. Regarding familiarity with library guides, one participant noted they had never used a library guide before, two stated they rarely used them, and two stated that they occasionally used them. The remaining five students did not know whether they had ever used one.

Findings & Discussion

Overall, students navigated the revised Fake News 2 guide faster than the original guide, except when listing the 5 Types of Fake News. This may be because the 5 Types of Fake News were listed on the original guide's first page. The mean time for successful task completion was 39 seconds on the original guide versus 22.2 seconds on the revised guide. Moukhliss noted that failed tasks were often linked to poorly labeled sections of both the original and revised guides.

Although the content was identical in both guides except for the removal of outdated information and dead website links from the original guide and the updated list of e-books in the revised guide, the students' overall mean confidence level was 4.2 for the original guide's information versus 4.4 for the revised guide. The mean recommendation likelihood was 6.4 for the original guide and increased to 7.9 for the revised guide.

Regarding personal preferences for which guide to use for a course reading, one student indicated they would want to work from the original guide, and nine indicated wanting to work from the revised guide for the following reasons:

  • The organization and layout are more effective.
  • Information is presented more clearly.
  • There is a dedicated tab for UNF resources.
  • It is easier to navigate.
  • It is less jumbled.
  • It is easier to follow when working with peers.

Regarding perceptions of which guide a professor may choose to teach with, three chose the original guide, whereas the other seven indicated the revised guide. One student stated that the old guide was more straightforward and that the instructor could explain the guide if they were using it during direct instruction. Preferences for the revised guide include:

  • More interactivity ("interactive-ness") and quizzes
  • Presence of summaries
  • Better presentation of content
  • Easier location of information
  • The guide doesn't feel like "a massive run-on sentence"
  • Ease of "flipping through the topics"
  • Presence of library consult and contact information

Although it was not part of the interview questions, Moukhliss documented that eight participants were not aware that a library guide could be embedded in a Canvas course, while one participant was aware; the remaining participant's experience with embedded library guides is unknown. Regarding preferences for embedding the library guide in Canvas, one student voted for the original guide whereas nine preferred the revised guide. Remarks favoring the revised guide cited the inclusion of necessary Get Help information for struggling students and the guide's ease of navigation.

Although not every standard from the LGAS rubric came up in conversation throughout the student interviews, the standards that students viewed positively and appreciated in a guide's design include Standards 4, 7, 11, 12, 15, 21, 22, 28, 30, and 31. Moukhliss observed that two students navigated the revised guide by the hyperlinked learning objectives rather than by side navigation (Standard 5), indicating that Standard 5 may hold value for those who maneuver a guide through its stated objectives. She also noted a limitation of hyperlinking an objective to a specific library guide page: when that page includes a tabbed box, the library guide author cannot link directly to a specific sub-tab, and the link defaults to the first tab of the box. Students maneuvering the guide therefore expected to find the listed objective on the first tab of the tabbed box and did not innately click through the sub-tabs to discover it.

Through observation, Moukhliss noted that six students initially struggled to navigate the library guides using the side navigation, but after time with the guide and/or instruction from Moukhliss on side navigation, they were successful. Students who were comfortable navigating a guide, or who had been shown how, preferred the sub-tabbed boxes of the revised guide to the organization of the original guide. The students found neither library guide perfect, but Moukhliss & McCowan noted an overall theme: students perceived the organization, sequencing, and chunking of information as important. Three students commented on appreciating clarification for each part of the guide, which lends support to proposed Standard 28.

Additionally, two students appreciated the library guide author profile picture and contact information on each page, and three students positively remarked on the presence of a Get Help tab on the revised guide. One participant stated that professors want students to have a point of contact with their library liaisons and do not like "anonymous pages" (referring to the original guide lacking an author profile). Another participant wanted to see more consult widgets listed throughout the library guide. Regarding the Fake News 2 guide, two students preferred that the first page hold more content and less getting-started information. Furthermore, images and design mattered: one student remarked that they did not like the Fake News 2 banner, and several others disliked the lack of imagery on the first page of the Fake News 2 guide. For both guides, students consistently remarked on liking the Fake News infographics.

Among those supporting the original guide or parts of it, three students liked the CRAAP Test worksheet and wanted to see it in the revised guide, not knowing that the worksheet had been deemed dated by members of the instruction team and removed by Moukhliss for that reason. One student wanted to see the CRAAP Test worksheet repurposed as a flowchart about fake news. Moukhliss noted that most students perceived the objects listed on the original and revised guides as current, relevant, and vetted. Eight participants did not question their usefulness or relevancy, or whether the library guide author was maintaining the guide. Only one student pointed out that the original guide had an outdated list of e-books that had been refreshed for the new guide. Thus, Moukhliss's observations may reinforce to library guide authors that library guides should be reviewed and refreshed regularly, as proposed by Standard 22, since most students in this study appeared to take at face value that what is presented on a guide is not only vetted but continuously maintained.

Initial data from this study indicate that using the LGAS rubric with annotations and a peer review process may improve the learning experience for students, especially in relation to being mindful of what information to include in a library guide, as well as the sequencing and chunking of the information. Early data indicates students appreciate a Get Help section and want to see Contact Us and library liaison/author information throughout the guide’s pages. 

Because six students initially struggled to maneuver through a guide, Moukhliss & McCowan suggest including navigation instructions in the library guide banner, in a brief introductory video on the Start Here page, or in both locations. Here is a screenshot of sample banner instructions:

A sample Fake News library guide banner being used to show students how to maneuver the guide. The banner states: "Navigate this guide by selecting the tabs" and "Some pages of this guide include subtabs to click into."

As stated, Moukhliss noted that most students were not aware of the presence of library guides in their Canvas courses. This may indicate that librarians should, during one-shots, not only explain what library guides are and how to maneuver them, but also directly model how to access an embedded guide within Canvas.


Conclusion

Library guides have considerable pedagogical potential. However, there are no widely used rubrics for evaluating whether a particular library guide has design features that support its intended learning outcomes. Based on this study, librarians who adopt or adapt the LGAS rubric may be more likely to design library guides that support students' ability to complete relevant tasks. At UNF, Moukhliss and McCowan plan to recommend that library administration employ the LGAS rubric and annotations with a peer review process and consider templatizing the institution's (UNF's) library guides to honor the proposed standards that student participants deemed most impactful. This includes recommending a Get Started tab for guide templates, with placeholders for introductory text, a library guide navigation video tutorial, visual navigational instructions embedded in the guide's banner, and the guide's learning objectives. Furthermore, they propose an institutionally vetted Get Help tab that can be mapped to each guide. Other proposals include templatizing each page to include a page synopsis; applicable explanations for accessing library-specific resources and tools from the library's homepage; placeholders for general contact information, a link to the library liaison directory, and the author's bio picture; feedback and assessment; and a research consultation link or widget, as well as instructions for accessing the library's homepage.

Following the creation of a standardized template, Moukhliss plans to propose recruiting a team of volunteer peer reviewers (library staff, librarians, library administration) and providing training on the LGAS rubric, the annotations, and the peer review process. She will recommend that all library guide authors train on the proposed LGAS rubric and the new library guide template for future library guide authorship projects and for updating and improving existing guides based on the standards. The training will cover the rubric, the annotations, and the maintenance calendar checklists for monthly, twice-yearly, and yearly reviews. All proposed training materials are available on the LGAS's Start Here page.

Moukhliss and McCowan encourage other college and university librarians to consider using or remixing the proposed LGAS rubric for a quality-checked review and to conduct studies on students' perceptions of the rubric to add data to this research. The authors suggest that future studies survey both students and faculty on their perspectives on using a quality assurance rubric and peer review process to increase the pedagogical value of a library guide. Moukhliss & McCowan encourage authors of future studies to report on their successes and struggles in forming and training library colleagues on using a quality-checked rubric for library guide design and the peer review process.


Acknowledgements

We would like to express our gratitude to Kelly Lindberg and Ryan Randall, our peer reviewers, as well as the staff at In The Library with the Lead Pipe, including our publishing editor, Jaena Rae Cabrera.


References

Adebonojo, L. G. (2010). LibGuides: customizing subject guides for individual courses. College & Undergraduate Libraries, 17(4), 398–412.  

Allen, M. (2017). Designing online asynchronous information literacy instruction using the ADDIE model. In T. Maddison & M. Kumaran (Eds.), Distributed learning pedagogy and technology in online information literacy instruction (pp.69-90). Chandos Publishing.

Bagshaw, A. & Yorke-Barber, P. (2018). Guiding librarians: Rethinking library guides as a staff development tool. Journal of the Australian Library and Information Association, 67(1), 31–41.

Baker, R. L. (2014). Designing LibGuides as instructional tools for critical thinking and effective online learning. Journal of Library and Information Services in Distance Learning, 8(3–4), 107–117. 

Bergstrom-Lynch. (2019). LibGuides by design: Using instructional design principles and user-centered studies to develop best practices. Public Services Quarterly, 15(3), 205–223.

Berić-Stojšić, & Dubicki, E. (2016). Guiding students’ learning with LibGuides as an interactive teaching tool in health promotion. Pedagogy in Health Promotion, 2(2), 144–148.

Bielat, V., Befus, R., & Arnold, J. (2013). Integrating LibGuides into the teaching-learning process. In A. Dobbs, R. L. Sittler, & D. Cook (Eds.). Using LibGuides to enhance library services: A LITA guide (pp. 121-142). ALA TechSource.

Brewer, L., Rick, H., & Grondin, K. A. (2017). Improving digital libraries and support with online research guides. Online Learning Journal, 21(3), 135-150.

Buck, S., & Valentino, M. L. (2018). OER and social justice: A colloquium at Oregon State University. Journal of Librarianship and Scholarly Communication, 6(2).

CIRT. (n.d.). Online Course Design Quality Review.

Clever, K. A. (2020). Connecting with faculty and students through course-related LibGuides. Pennsylvania Libraries, 8(1), 49–57.

Conrad, S. & Stevens, C. (2019). "Am I on the library website?": A LibGuides usability study. Information Technology and Libraries, 38(3), 49-81.

Coombs, B. (2015). LibGuides 2. Journal of the Medical Library Association, 103(1), 64–65.

Cuevas-Cerveró, A., Colmenero-Ruiz, M.-J., & Martínez-Ávila, D. (2023). Critical information literacy as a form of information activism. The Journal of Academic Librarianship, 49(6), 102786.

Dotson, D. S. (2021). LibGuides gone viral: A giant LibGuides project during remote working. Science & Technology Libraries, 40(3), 243–259.

Downey, A. (2016). Critical information literacy: Foundations, inspiration, and ideas. Library Juice Press.

Emanuel, J. (2013). A short history of LibraryGuides and their usefulness to librarians and patrons. In A. Dobbs, R. L. Sittler, & D. Cook (Eds.). Using LibGuides to enhance library services: A LITA guide (pp. 3-20). ALA TechSource.

Gephart, R. P., Jr. (2004). Qualitative research and the Academy of Management Journal. Academy of Management Journal, 47(4), 452–462.

German, E. (2017). Information literacy and instruction: LibGuides for instruction: A service design point of view from an academic library. Reference & User Services Quarterly, 56(3), 162-167.

German, E., Grassian, E., & LeMire, S. (2017). LibGuides for instruction: A service design point of view from an academic library. Reference and User Services Quarterly, 56(3), 162–167.

Guest, G., Bunce, A., & Johnson, L. (2006). How many interviews are enough? An experiment with data saturation and variability. Field Methods, 18, 59–82. doi:10.1177/1525822X05279903

Halupa, C. (2019). Differentiation of roles: Instructional designers and faculty in the creation of online courses. International Journal of Higher Education, 8(1), 55–68.

Hare, S., & Evanson, C. (2018). Information privilege outreach for undergraduate students. College & Research Libraries, 79(6), 726–736.

Hennink, M., & Kaiser, B., (2019). Saturation in qualitative research, In P. Atkinson, S. Delamont, A. Cernat, J.W. Sakshaug, & R.A. Williams (Eds.), SAGE Research Methods Foundations.

Hulen, K. (2022). Quality assurance drives continuous improvements to online programs. In S. Kumar & P. Arnold (Eds.), Quality in online programs: Approaches and practices in higher education. (pp. 3-22). The Netherlands: Brill. 

Kathuria, H., & Becker, D. W. (2021). Leveraging course quality checklist to improve online courses. Journal of Teaching and Learning with Technology, 10(1).

Kincheloe, J. (2012). Critical pedagogy in the twenty-first century: Evolution for survival. Counterpoints, 422, 147–183.

Koehler, J. W. & Pankowski, J. M. (1996). Continual improvement in government tools & methods. St. Lucie Press.

Lauseng, D. L., Howard, C., Scoulas, J. M., & Berry, A. (2021). Assessing online library guide use and Open Educational Resource (OER) potential: An evidence-based decision-making approach. Journal of Web Librarianship, 15(3), 128–153.

Lechtenberg, U. & Gold, H. (2022). When all you have is a hammer, everything looks like a LibGuide: Strengths, limitations, and opportunities of the teaching tool [Conference presentation]. LOEX 2022 Conference, Ypsilanti, MI, United States. 

Lee, Hayden, K. A., Ganshorn, H., & Pethrick, H. (2021). A content analysis of systematic review online library guides. Evidence Based Library and Information Practice, 16(1), 60–77.

Lee, Y. Y., & Lowe, M. S. (2018). Building positive learning experiences through pedagogical research guide design. Journal of Web Librarianship, 12(4), 205-231.

Mann, B. J., Arnold, J. L., and Rawson, J. (2013). Using LibGuides to promote information literacy in a distance education environment. In A. Dobbs, R. L. Sittler, & D. Cook (Eds.). Using LibGuides to enhance library services: A LITA guide (pp. 221-238). ALA TechSource. 

May, D. & Leighton, H. V. (2013). Using a library-based course page to improve research skills in an undergraduate international business law course. Journal of Legal Studies Education, 30(2), 295-319. doi:10.1111/jlse.12003

OSCQR – SUNY Online Course Quality Review Rubric (n.d.). About OSCQR.

Ouellette, D. (2011). Subject guides in academic libraries: A user-centred study of uses and perceptions. Canadian Journal of Information and Library Science, 35(4), 436–451. doi:10.1353/ils.2011.0024

Pickens, & Witte, G. (2015). Circle the wagons & bust out the big guns! Tame the "Wild West" of distance librarianship using Quality Matters™ benchmarks. Journal of Library & Information Services in Distance Learning, 9(1-2), 119–132.

Quintel, D. F. (2016, January/February). LibGuides and usability: What our users want. Computers in Libraries Magazine, 36(1), 4-8. 

Smith, E. S., Koziura, A., Meinke, E., & Meszaros, E. (2023). Designing and implementing an instructional triptych for a digital future. The Journal of Academic Librarianship, 49(2), 102672–106277.

Sonsteby, A. & DeJonghe, J. (2013). Usability testing, user-centered design, and LibGuides subject guides: A case study. Journal of Web Librarianship, 7(1), 83–94.

SpringShare (n.d.). LibGuides.

Stone, S. M., Lowe, M. S., & Maxson, B. K. (2018). Does course guide design impact student learning? College & Undergraduate Libraries, 25(3), 280-296.

Tewell, E. (2015). A decade of critical information literacy: A review of the literature. Comminfolit, 9(1), 24-43.

Unal, Z. & Unal, A. (2016). Students matter: Quality measurements in online courses. International Journal on E-Learning, 15(4), 463-481. Association for the Advancement of Computing in Education (AACE).

Wakeham, M., Roberts, A., Shelley, J. & Wells, P. (2012). Library subject guides: A case study of evidence-informed library development. Journal of Librarianship and Information Science, 44(3), 199-207. 

Building a simple IIIF digital library with Tropy, Tropiiify and Canopy / Raffaele Messuti

Creating and maintaining an online digital collection can be a complex process involving multiple components, from organizational procedures to software solutions. With many moving parts, it's no surprise that building and curating a digital collection can be costly, time-consuming, and demanding to maintain. When dealing with cultural heritage, maintenance and long-term preservation should be our primary concerns. The approach we should always consider is minimal computing.

In this tutorial, I'll show you how to create and maintain a simple IIIF collection using Tropy and Canopy, two powerful tools that can help you build static sites requiring zero maintenance.

There are many other libraries and applications, including free software, that can achieve the same result. However, they often require at least minimal programming knowledge or the maintenance of server-side applications.


Tropy

Tropy is a desktop application designed to organize and manage archival research photos, though it's also great for managing almost any kind of image, including invoices or handwritten notes. It doesn't require any online service; you can work offline on your desktop without needing to upload anything.

Although it's yet another Electron application, the UI is very pleasant, minimal, and fast to use. You will quickly notice a significant improvement in your offline workflow compared to using online applications in a browser.

There's an extensive user guide for learning Tropy, so I won't cover all the details here. Instead, I want to highlight some features I consider important:

  • A Tropy project is saved into an SQLite database. This is a huge advantage because your data won't be locked inside the application. If you have programming knowledge, you can build a workflow to manage the data of a Tropy project and integrate it into any external application.
  • Tropy can import many image formats, including PDFs and multi-page TIFFs.
  • You can describe images with standard templates (a default Tropy template and a Dublin Core one) or create your own.
  • Tropy can be extended with plugins.
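
Because a project is a plain SQLite file, any SQLite client can read it. As a minimal sketch of that first point (the file name is an example, and Tropy's actual schema should be inspected rather than assumed), here is how you might start exploring a project from Python:

```python
import sqlite3

def list_tables(db_path: str) -> list[str]:
    """List the tables in a Tropy project, which is a plain SQLite file."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        con.close()

# Point this at your project file, e.g.:
# print(list_tables("my-project.tpy"))
```

From there you can query the tables you find and integrate the project data into any external application.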

IIIF Plugin: tropiiify

One plugin that stands out is tropiiify. With this plugin you can export a Tropy collection to a static IIIF collection: images will be saved in tiles (no IIIF server required), and manifests and collection files will be generated. You simply need to move the exported output to a static HTTP server (remember to configure CORS).


  • Every document needs an identifier. For small collections, sequential numbers are sufficient. Alternatively, use UUIDs or other unique identifiers such as Nanoid (if you don't want to script a Nanoid generator, point your browser to UUID Nanoid Generator and get a new identifier with each reload).

  • You can create multiple export configurations. Set the IIIF base id to the full public URL where you will publish the export.
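
If you'd rather script a generator than reload a web page, here is a minimal Nanoid-style sketch in Python (it mimics Nanoid's default alphabet and length; it is not the official nanoid package):

```python
import secrets
import string

# Nanoid's default URL-safe alphabet (letters, digits, underscore, hyphen).
ALPHABET = string.ascii_letters + string.digits + "_-"

def nanoid(size: int = 21) -> str:
    """Return a random URL-safe identifier, Nanoid-style."""
    return "".join(secrets.choice(ALPHABET) for _ in range(size))

print(nanoid())  # prints a fresh 21-character identifier
```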

Here is an example collection (just some book covers shot with a smartphone) that can be opened with any IIIF viewer (tify or mirador).

There are many other libraries or applications that can help you achieve the same result (vips, iiif-tiler, iiif-prezi), but they require knowledge of the shell and some scripting/programming to put everything together.


Canopy

An IIIF export from Tropy is ready to be used with any IIIF viewer out there. But there is another interesting application: Canopy. It's a static site generator for IIIF collections that includes a browsing interface (with facets), a search engine, and an IIIF image viewer (with annotations), all bundled into a static site that doesn't need any server-side technology to be served.

Here is a short guide to using Canopy (see also their documentation).

Clone the repository

git clone

Install dependencies

npm i


Edit .env with the full public URL where you will publish the static exported collection.


Edit config/canopy.json with the IIIF collection manifest

  "collection": "",
  "devCollection": "",
  "featured": [
  "metadata": [


Build the static site

npm run build:static

Deploy online: copy the contents of the out directory to your HTTP server.

Here is a complete demo

#ODDStories 2024 @ Ningi, Nigeria 🇳🇬 / Open Knowledge Foundation

The MUMSA Initiative, a youth-driven non-profit organization, successfully held a two-day hackathon titled “Hacking for Healthy Food & Green Futures” at Ningi, Nigeria on March 6th and 7th, 2024. This event aligned perfectly with Open Data Day 2024 and empowered young people in Ningi to address critical local challenges through the power of open data. 

Thematic Focus: open data for advancing sustainable development goals (SDGs) – specifically, SDG 2 (Zero Hunger), SDG 3 (Good Health and Well-being), and SDG 13 (Climate Action)

The hackathon brought together passionate young minds from different schools and from within the community to tackle the interconnected issues of food security, mental health, and climate change. Participants leveraged local and national open datasets on agriculture, nutrition, weather, mental health resources, and environmental indicators.

Over the two days, teams worked to develop groundbreaking solutions that directly impact their community. These solutions included:

  • Data-driven strategies for identifying areas with food insecurity and optimizing crop selection based on climate data. This aims to empower local farmers to make informed decisions and improve food production.
  • Development of interventions to address local mental health needs and awareness campaigns based on real-time data, in essence increasing access to resources and promoting mental well-being.
  • Promotion of climate-smart agricultural practices through data analysis. This approach helps reduce food waste and fosters progress towards environmental goals.

MUMSA Initiative ensured a well-rounded experience by offering:

Participants were equipped with the skills to access, analyze, and utilize open data effectively and were organized into teams, with facilitators providing guidance and support throughout the hackathon. Participants were also encouraged to share their ideas for wider impact, maximizing the reach and potential of their solutions.

The “Hacking for Healthy Food & Green Futures” hackathon is a testament to the power of engaged youth. This event serves as a model for other organizations and communities seeking to empower young people to use open data and tackle real-world challenges.

About Open Data Day

Open Data Day (ODD) is an annual celebration of open data all over the world. Groups from many countries create local events on the day where they will use open data in their communities.

As a way to increase the representation of different cultures, since 2023 we have offered organisations the opportunity to host an Open Data Day event on the best date within a one-week period. In 2024, a total of 287 events happened all over the world between March 2nd and 8th, in 60+ countries, using 15 different languages.

All outputs are open for everyone to use and re-use.

In 2024, Open Data Day was also a part of the HOT OpenSummit ’23-24 initiative, a creative programme of global event collaborations that leverages experience, passion and connection to drive strong networks and collective action across the humanitarian open mapping movement.

For more information, you can reach out to the Open Knowledge Foundation team by emailing You can also join the Open Data Day Google Group to ask for advice or share tips and get connected with others.

We Are Makers – how we work at Artefacto / Artefacto

While we wear many hats, building things has always been central to our work at Artefacto. We’ve always described ourselves as both makers and librarians and our consultancy, training and design work fits within these identities.  We are especially excited when we can build and share tools, resources and platforms that deliver a user-friendly experience.  [...]

Continue Reading...


Cross-Searching Simplified & Traditional Chinese / Library Tech Talk (U of Michigan)

Catalog Search showing Chinese-language search results
Image Caption

A search for the traditional-character Chinese phrase "戶籍", which shows the same results as the equivalent search for the simplified characters "户籍".

The U-M Library recently added the capability to search across Chinese-language materials in our catalog, regardless of which Chinese character set was used in the query or the record. This improvement expands access to our large collection of materials and improves the user experience.
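
The post doesn't detail the implementation, but a common technique for this kind of cross-searching is to normalize both the query and the indexed text to a single character set before matching. A toy sketch (the two-character mapping table here is purely illustrative, not U-M's actual data; real systems use a full mapping, e.g. derived from Unicode's Unihan database):

```python
# Illustrative traditional -> simplified mapping covering only the
# example query; a production mapping has thousands of entries.
TRAD_TO_SIMP = {"戶": "户", "籍": "籍"}

def normalize(text: str) -> str:
    """Map traditional characters to simplified so both forms match."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)

# The two queries from the screenshot normalize to the same string:
print(normalize("戶籍") == normalize("户籍"))  # True
```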

On learning to code without mathematics / Jez Cope

Why is it that so many beginner programming tutorials assume that the learner is both a) comfortable with maths; and b) motivated to learn by seeing simple arithmetic? Go look at your favourite tutorial (I’ll wait) and I’ll give you good odds it starts with some variation of “look, you can use this as a calculator!”

Seriously, there must be so many people who would be great programmers but who never get past the thought:

“Why would I want to install and configure this whole programming environment just to see that 5×7 is 35?”

A lot of people are instantly put off when they see mathematics. That's not a personal failure; it's a natural consequence of the (inaccurate) way maths is seen in our culture: as a binary thing where everything is a question with only one “right” answer and a lot of wrong ones, when it's actually a beautiful, creative and constantly evolving language. There's a weird little duality, often instilled in us before we even reach school, whereby on the one hand not being able to “do maths” is seen as shameful, while on the other it's far more acceptable to make a general statement like “oh, I don't really do maths” than to say “oh, I don't really do reading”.

That perceived double bind must put off loads of people who would otherwise be very creative solvers of problems involving some maths, and when we make it look like being “good at maths” is a prerequisite of being able to program a computer we put those same people off that too. In any case I believe this is all based on a false premise: maths is much more accessible than many have been led to believe, and learning to code by manipulating text or images has a lot of potential for demystifying mathematical concepts through familiarity.

Aside: there is some evidence1 that natural language skill is a better predictor of programming aptitude than mathematical skill, though I don’t know whether that study has been replicated.

Anyway, that’s why I really like the tutorial for the Racket programming language: it focuses 100% on drawing pictures and introduces programming concepts as ways of composing simple shapes (like squares and circles) into more complex images.

Building on this idea, Christine Lemmer-Webber has a tutorial for Digital Humanities folk called “Building a snowman with Racket” which takes the learner through making a little picture of a snowman using Racket’s slideshow module.

I don’t know about you, but I think that’s pretty cool and I’d like to see more tutorials taking a similar approach.

This post was prompted by:

  1. Prat, C.S. et al. (2020) “Relating natural language aptitude to individual differences in learning programming languages,” Scientific reports, 10(1), p. 3817. Available at:↩︎

DLF Digest: June 2024 / Digital Library Federation

DLF Digest logo: DLF logo at top center "Digest" is centered. Beneath is "Monthly" aligned left and "Updates" aligned right.

A monthly round-up of news, upcoming working group meetings and events, and CLIR program updates from the Digital Library Federation. See all past Digests here

Greetings, DLF Community! Planning is in full swing for the in-person DLF Forum coming up next month in Michigan. The program is available and registration is open while space remains. We’re also looking ahead to next year and have issued a Call for Hosts for next year’s Forum – if your organization might be a good fit, we hope you’ll submit an application or get in touch. Our Working Groups are also remaining busy this month as regular group meetings continue; check out all they’re up to below and on the Community Calendar. We wish you a wonderful June, and we hope to see you at one of our events in the coming weeks!

– Team DLF


This month’s news:

This month’s DLF group events:

DLF Pedagogy Group Slack Chat

The June 2024 DLFTeach Slack chat, “Incorporating political turmoil into your digital pedagogy,” will be a conversational space to share how you’re coping with ongoing global crises in your instructional spaces. While this is not a stage for debating political stances or issues, we do invite you to share the resources and reflections you’ve collected in wrestling with current events as a digital pedagogist.

Join the discussion via Slack, starting June 3rd.

To see the questions, which will be posted in the channel the 1st Friday of the month, click on the pinned document labeled 2023-2024 chats at the top of the channel.


This month’s open DLF group meetings:

For the most up-to-date schedule of DLF group meetings and events (plus NDSA meetings, conferences, and more), bookmark the DLF Community Calendar. Can’t find meeting call-in information? Email us at Reminder: Team DLF working days are Monday through Thursday.

  • Born-Digital Access Working Group (BDAWG): Tuesday, 6/4, 2pm ET / 11am PT.
  • Digital Accessibility Working Group (DAWG): Wednesday, 6/5, 2pm ET / 11am PT.
  • Assessment Interest Group (AIG) Metadata Working Group: Thursday, 6/6, 1:15pm ET / 10:15am PT.
  • AIG Cultural Assessment Working Group: Monday, 6/10, 2pm ET / 11am PT.
  • AIG Cost Assessment Working Group: Monday, 6/10, 3pm ET / 12pm PT.
  • Assessment Interest Group (AIG) Metadata Working Group: Thursday, 6/20, 1:15pm ET / 10:15am PT.
  • AIG User Experience Working Group: Friday, 6/21, 11am ET / 8am PT.
  • Committee for Equity and Inclusion: Monday, 6/24, 3pm ET / 12pm PT.
  • Climate Justice Working Group: Wednesday, 6/26, 12pm ET / 9am PT.  
  • Digital Accessibility Working Group: Policy and Workflows subgroup: Friday, 6/28, 1pm ET / 10am PT. 

Interested in joining a current group, reviving a past one, or do you have a general question? Let us know at DLF Working Groups are open to all, regardless of whether or not you’re affiliated with a DLF member organization. Team DLF also hosts quarterly meetings with working group leaders and occasionally produces special events or resources for members. Learn more about working groups on our website and check out our Organizer’s Toolkit

Get Involved / Connect with Us

Below are some ways to stay connected with us and the digital library community: 

The post DLF Digest: June 2024 appeared first on DLF.

Run your Rails gem CI on rails main branch / Jonathan Rochkind

attr_json is basically an ActiveRecord extension. It works with multiple versions of Rails, so it definitely runs CI on each version it supports.

But a while ago on attr_json, I set up CI to run on Rails' unreleased main branch. I was already using appraisal to test under multiple Rails versions.

(which I recommend; sure it seems easy enough to do this ‘manually’ with conditionals in your Gemspec or separate Gemfiles and BUNDLE_GEMFILE — but as soon as you start needing things like different extra dependencies (version of rspec-rails anyone?) for different Rails versions… stop reinventing the wheel, appraisal just works).

So I added one more appraisal block for rails-edge, pretty straightforward. (This example also uses combustion which I don’t necessarily recommend, I think recent Rails dummy app generated by rails plugin new is fine, unlike Rails back in 5.x or whatever).
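
The post doesn't reproduce the file itself, but as a sketch (an assumed Appraisals entry, not copied from attr_json), the extra block might look like:

```ruby
# Appraisals -- one extra entry alongside the released-Rails ones
appraise "rails-edge" do
  gem "rails", git: "https://github.com/rails/rails.git", branch: "main"
end
```

Running `appraisal install` then generates a gemfile for the rails-edge variant alongside the released-version ones.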

The “edge rails” CI isn't required to pass for PRs to be merged. I put it in its own separate GitHub Actions workflow, in part so I can give it its own badge on the README. (The way things are currently set up, I think you don't even get “edge rails CI” feedback on the PR — it would be ideal to get that feedback, but make it clear it's in its own category and failures aren't a blocker).
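
A sketch of such a separate workflow (the file name, Ruby version, and test command are assumptions, not the gem's actual configuration):

```yaml
# .github/workflows/rails-edge.yml (assumed name) -- its own workflow,
# so it gets its own README badge and stays out of the required PR checks.
name: CI (Rails edge)
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    env:
      BUNDLE_GEMFILE: gemfiles/rails_edge.gemfile  # generated by appraisal
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: "3.3"
          bundler-cache: true
      - run: bundle exec rspec
```

Keeping this workflow out of the branch-protection required checks is what makes its failures informative rather than blocking.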

I intend this to tell the person looking at the README, considering using the gem and evaluating its health, its maintenance level, and its effective cost of ownership: hey, this maintainer is continually testing on unreleased Rails edge. That's a pretty good sign! Especially that it's green, meaning it's working on unreleased Rails edge. And when the next Rails release happens, we already know the gem is in a state to work with it; I won't have to delay my Rails upgrade for this dependency.

And if a change happens on Rails edge main branch that breaks my build — I find out when it happens. If you don’t look at whether your code passes the build on (eg) Rails 7.2 until it’s released, and you find a bunch of failures — it turns out that was basically deferred maintenance waiting for you.

I find out about breakages when they happen. I fix them when I have time, but seeing that red build breakage on “Future Rails Versions” is a big motivator to get it green. (I might have called that “edge Rails” in retrospect, I think that’s a generally understood term?). And when Rails 7.2 really is released — I just need to change my gemspec to allow Rails 7.2, and release attr_json, I don’t have deferred maintenance on compat with latest Rails release piling up for me, and I can release an attr_json supporting the new Rails release immediately, and not be a blocker for my users upgrading to latest Rails release on their schedule.

This has worked out very well for me, and I would really encourage all maintainers of Rails plugins/engines to run CI on Rails edge.

One Heck Of A Halvening / David Rosenthal

The fundamental idea behind Bitcoin is that, if you restrict the supply of something, its price will rise. That is why the system arranges that there will only ever be 21 million Bitcoin by halving the reward paid for mining the next block every 210,000 blocks (about every four years), an event called the "halvening" (or more recently just the halving). It is an article of faith among the crypto-bros that, after the halvening, the price will rise. For example:
In the image below, the vertical blue lines indicate the previous three halvings (2012-11-28, 2016-7-9, and 2020-5-11). Note how the price has jumped significantly after each halving.
The most recent halvening happened on Friday, April 19th. It was eagerly awaited so, six weeks later, it is time to go below the fold and look at the effects.
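
The 21 million cap is just the geometric series implied by that schedule: 210,000 blocks per era, a 50 BTC starting subsidy, halved each era (with the subsidy rounded down to whole satoshis). A quick check:

```python
BLOCKS_PER_ERA = 210_000
SATS_PER_BTC = 100_000_000

reward_sats = 50 * SATS_PER_BTC  # initial block subsidy, in satoshis
total_sats = 0
while reward_sats > 0:
    total_sats += BLOCKS_PER_ERA * reward_sats
    reward_sats //= 2  # the halvening: integer halving, floor-rounded

print(total_sats / SATS_PER_BTC)  # just under 21,000,000 BTC
```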


Let's start back in October, when the Bitcoin "price" was in the high $20Ks. This was a problem, because only the most efficient miners could make a profit at that "price". They could look ahead six months to the looming halvening, which would mean that even the most efficient, with their income halved, would be losing a lot of money. Something needed to be done. By April the price needed to be at least in the mid $50Ks, or the Bitcoin mining industry would suffer a bloodbath.

Coincidentally, October was when Tether started printing USDT at a rate seen before only during the heady days of 2021. At the start of October there were 83B USDT; by the halvening there were 110B, 32.5% more.

The extra supply of 27B USDT, whose primary function is to buy cryptocurrency, certainly created demand for Bitcoin, whose supply over the same period increased by only 0.95%. This excess of demand over supply, as the crypto-bros would have predicted, increased the price. But by 149%, which might have been more than they predicted. Note that it peaked at $73,094 on March 13th, 35 days before the halvening, and has trended lower since.

This huge pump in the BTC "price" meant that the most efficient miners' margins went from scant to extortionate, and sucked in a whole lot of mining rigs that had been turned off as uneconomic. In the graph of the 30-day moving average hash rate, the rate of increase is lower until around the start of the pump in October, then it increases as the pump takes hold. The earlier rate may represent the rate at which Bitmain can ship the latest rigs; the additional later rate probably represents re-activated older rigs.

David Pan's Bitcoin Miners Power Up Rigs to Record Levels Ahead of ‘Halving’ reported the mining surge a week before the halvening:
Bitcoin mining companies are boosting their cash reserves to cope with the negative impact from the halving through a variety of channels including running their operations at full capacity or expanding them to produce and sell more coins amid soaring Bitcoin prices in the latest rally.
Omkar Godbole's Crypto Miners Run Down Bitcoin Inventory to 3-Year Low in a Strategic Pre-Halving Move had more detail on how they were "boosting their cash reserves":
The number of bitcoin held by miners ... declined to 1.794 million BTC this week, the lowest since early 2021, according to data source CoinMetrics.

The so-called miner balance has fallen by 27,000 since November, implying steady selling in the months leading up to the quadrennial mining reward halving.
The rally has allowed miners to take profits at higher prices and fund equipment upgrades to prepare for the reduced rewards rate, according to algorithmic trading firm Wintermute.

"With miners' holdings still near an all-time high in USD terms ($124 billion), this sell-off appears to be a strategic move for profit-taking and operational upgrades, marking a behavioral change from the last cycle," Wintermute said in a weekly newsletter.
Some miners, having ridden the hype around the halvening and cashed in their winnings, have pivoted to the probably more durable AI hype, as David Pan reported in Bitcoin Miner Core Scientific Converts Data Infrastructure for AI:
The Austin, Texas-based company deployed 16-megawatt data-center capacity for AI startup CoreWeave Inc. and plans to convert more going forward. The move comes amid a slump in Bitcoin mining revenue and growing demand for data centers to host graphics processing units to power AI applications.


What has happened since the halvening? Ten days after the event that was supposed to send the BTC "price" soaring, Sidhartha Shukla wrote in Bitcoin Faces Worst Month Since FTX Crash With ETF Demand Cooling:
The much-anticipated Bitcoin halving, a quadrennial event that reduces the supply of new coins in the market and historically acted as a price tailwind, had minimal impact since it happened on April 19.

As the table shows, the flood of new Tethers dried up, and so the BTC price trended down. And, as Olga Kharif reported three weeks later, Trading on Crypto Exchanges Fell for First Time in Seven Months:
Spot trading volume on so-called centralized exchanges such as Coinbase Global, Binance and Kraken tumbled 32.6% to $2 trillion last month, according to data from researcher CCData. Derivatives trading volume also declined for the first time in seven months, falling by 26.1% to $4.57 trillion.
Kharif's graph shows the trading frenzy triggered by the flood of new Tethers peaking in March.

As might have been expected, halving the miners' income even from its pumped-up level has started to force the less efficient players out. The hash rate graph is extremely noisy, with routine daily swings of more than 20%, but the graph of the 30-day moving average hash rate shows that it peaked about a week after the halvening, and a week later started a downward trend.

It isn't just that the miners' income has been halved. It is also that their costs are rising, as David Pan's Bitcoin ‘Halving’ Will Deal a $10 Billion Blow to Crypto Miners explains:
“Power in the US is extraordinarily constrained,” said Adam Sullivan, chief executive officer at Austin, Texas-based Core Scientific Inc., one of the largest public Bitcoin mining companies. “Right now, miners are competing against some of the largest tech companies in the world, who are trying to find space for data centers, which are high energy consumers too.”

The nascent AI industry is drawing in massive amounts of capital, which is making it harder for miners to secure favorable electricity rates with utility companies. Inc. is set to spend almost $150 billion on data centers, while Blackstone is building a $25 billion empire of centers. Google Inc. and Microsoft Corp. are also making hefty investments.

“The artificial intelligence crowd is willing to pay three or four times what Bitcoin miners were paying last year” for electricity, said David Foley, co-managing partner at Bitcoin Opportunity Fund, which has made investments in both public and private miners. That is happening across the globe, he said.
Of course, if the miners' break-even last year was around BTC=$25K, the halvening would put it at $50K. So now, at around $67K, the miners can afford to pay more for power, but not "three or four times" more. Pan points out another problem for the miners:
The tech giants also have an edge in acquiring power from utilities, given their consistent revenue stream, whereas crypto mining revenue fluctuates with the rise and fall in Bitcoin prices. Utilities consider tech companies as more reliable purchasers given their strong balance sheets, said Taras Kulyk, CEO at crypto-mining services provider SunnyDigital.

With that competition in place, low-cost power contracts could be tougher to renew when existing agreements expire. Large-scale Bitcoin miners tend to lock in energy prices, typically for a few years, said Greg Beard, CEO of public Bitcoin miner Stronghold Digital Mining Inc.

Transaction Fees

One might have expected a gradual decrease in the hash rate to start as soon as the supply of mining rewards was halved. Why did miners increase the hash rate after their block reward was halved? The answer is that there was a massive spike in transaction fees. A week before, the average fee had been $2.82. The day before, it was $19.09. On halvening day the average fee was $127, a 565% rise. A week later it was $4.42. These massive fee spikes, caused when everyone wants to transact at once, are inevitable given that Bitcoin is limited to around 7 transactions per second.
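
The 565% figure is just the day-over-day jump in the quoted averages:

```python
# Average fees quoted above, in USD.
day_before, halvening_day = 19.09, 127.00

jump = (halvening_day - day_before) / day_before * 100
print(f"{jump:.0f}%")  # 565%
```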

The day of the halvening the miners made a total of $108M. This bonanza completely reversed the normal state of the Bitcoin blockchain, in which transactions are around 95% subsidized by inflating the currency with block rewards, to a state in which the block reward was less than 20% of the miners' income. As I discussed in Fee-Only Bitcoin, this is close to the state Bitcoin will be in all the time after three more halvenings in 2036. I predict that the average fee then will be around $90/transaction, and that spikes could easily exceed $540/transaction.

The explanation for the spike in fees lies in something else that deliberately coincided with the halvening which caused a huge surge in demand for transactions. Joel Agbo's What Are Bitcoin Runes? Bringing Memecoins to Bitcoin explains:
Casey Rodarmor, the creator of the Ordinals protocol that lets users create NFT-like inscriptions on the Bitcoin blockchain, is releasing Runes, a new protocol that lets users easily create tokens on top of Bitcoin like Solana and Ethereum. While the BRC-20 and SRC-20 token standards already exist, these are based on Ordinals theory, which can result in UTXO proliferation that spams Bitcoin.

In an attempt to create a healthier way of etching tokens on the Bitcoin network, Rodarmor announced Runes in September 2023 and has been working on this since then. As the protocol nears its launch date, there is an increase in interest around Runes and what it could mean for Bitcoin.
“Creating a good fungible token protocol for Bitcoin might bring significant transaction fee revenue, developer mindshare, and users to Bitcoin.”
- Casey Rodarmor, creator of Runes
Bitcoin Runes launched on Block 840,000, following the Bitcoin halving in April 2024. The protocol itself does not impact Bitcoin, although it adds to the excitement that follows the fourth Bitcoin halving. Rodarmor states that Bitcoin Runes' simplicity and overall architecture will aid the primary reason for its development: the creation of fungible tokens on the Bitcoin blockchain.
TheMinerMag's Bitcoin Miners Bag $109M in Halving Day Rewards as Hashprice Soars to Two-Year Highs summarizes what happened:
The soaring transaction fees resulted from on-chain users rushing to create meme-like tokens on bitcoin using the Rune protocol, which is similar to the ERC-20 standard for creating tokens on Ethereum.

Although tokens created on Rune are fungible, the values are subject to speculations on various measures such as how early they were created, the uniqueness and quality of their symbols, and the potential of being listed on exchanges.

According to a Rune explorer, 1,750 Rune projects have been created as of writing, or as the protocol terms it, “etched.” For instance, some of the early “etched” projects are called “MASSIVE•PILE•OF•SHIT” or “DOG•GO•TO•THE•MOON”.

Because the protocol utilizes bitcoin’s UTXO (unspent transaction output) model, it creates a mechanism where a user’s transaction for a token issuance will first enter bitcoin’s mempool and will be successfully created after the transaction is confirmed.

As bitcoin influencer Jimmy Song put it here, explaining the current fee market dynamics, this mechanism also creates room for “squatters” to snipe a user’s creation by outbidding them using a higher fee.

“Whichever comes first gets the symbol and the asset issuance. But if you want to squat on a good symbol name, you can just look for mempool transactions that are attempting to create a new asset and create your own with a bigger fee,” Song wrote.
This is another form of Ethereum's Miner Extractable Value. But the sugar high didn't last, as Muyao Shen and María Paula Mijares Torres reported in Bitcoin Miner Boosting Memecoins Allure Already Begins to Wane:
Just as the April 19 “halving” reduced the amount of tokens awarded to miners for validating transactions, network transaction fees jumped as users rushed to mint the speculative coins on Bitcoin for the first time. The process is enabled by the Rune protocol, through which people can create their own fungible tokens. Dune Analytics’ data shows that the total transactions in Runes dropped to around 45,700 on Monday from its peak at above 750,000 on April 23.
Since the protocol launched, Runes have generated 2,169 Bitcoin, or approximately $137 million, in fees, according to data compiled by Dune Analytics user cryptokoryo. The share of transactions related to Runes peaked on April 23, when it accounted for 81% of all Bitcoin transactions.

$139M in 9 days to the miners for 71,381 memecoins, whose "market cap" is 55,765 BTC ($3.8B), down 99.4%. If that is right, at one point the runes were "worth" $635B! What an amazing innovation that creates $635B out of thin air in a few days!


There are two possible, not mutually exclusive, reasons for the flood of 27B USDT:
  1. Speculators, believing that the halvening would send BTC moonwards, bought 27B newly created USDT with $27B USD. They used the 27B USDT to buy BTC, more than doubling the "price".
  2. Holders of BTC pledged it as collateral for loans of 27B newly created USDT from Tether. They used the 27B USDT to buy BTC, more than doubling the "price". They then sold half the newly doubled BTC for USDT, with which they repaid the loan. At the peak, anyone who had bought BTC between the beginning of the pump and November 11th had more than doubled their money.
The second case is Tether's "magic money pump"; I wrote about it in 2020's Stablecoins, pointing out that, among other research:
Is Bitcoin Really Untethered? by John Griffin and Amin Shams shows that:
Rather than demand from cash investors, these patterns are most consistent with the supply‐based hypothesis of unbacked digital money inflating cryptocurrency prices.
Their paper was originally published in 2018 and updated in 2019 and 2020.
Ponzi Funds by Philippe van der Beck, Jean-Philippe Bouchaud and Dario Villamaina describes a similar "magic money pump" in certain mutual funds whose holdings are concentrated, as BTC hodlers' are:
Flow-driven trading in these securities causes price pressure, which pushes up the funds' existing positions resulting in realized returns. We decompose fund returns into a price pressure (self-inflated) and a fundamental component and show that when allocating capital across funds, investors are unable to identify whether realized returns are self-inflated or fundamental. Because investors chase self-inflated fund returns at a high frequency, even short-lived impact meaningfully affects fund flows at longer time scales. The combination of price impact and return chasing causes an endogenous feedback loop and a reallocation of wealth to early fund investors, which unravels once the price pressure reverts.
This effect mirrors that in Bitcoin. If HODL-ers can start a BTC price spike, for example by wash trading or by borrowing loan-backed USDT, other speculators will join in and create an "endogenous feedback loop". In ETFs, van der Beck et al. estimate that this effect reallocates $500M/day to earlier investors.
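The loop van der Beck et al. describe can be illustrated with a toy simulation: inflows push the price up (price pressure), and speculators chase the resulting returns with further inflows. All parameters below are illustrative, not calibrated to BTC or to any fund.

```python
# Toy model of the "endogenous feedback loop": flow-driven price pressure
# creates self-inflated returns, and return chasing attracts more flow.
price, flow = 100.0, 1.0
impact, chasing = 0.05, 0.8  # price impact per unit flow; flow response to returns
history = [price]
for _ in range(10):
    ret = impact * flow            # flow-driven price pressure creates a "return"
    price *= 1 + ret
    flow += chasing * ret * flow   # return chasing: higher returns attract more flow
    history.append(price)
print(round(history[-1], 1))       # price has risen with no fundamental news at all
```

The price rises every step even though nothing fundamental has changed; it unwinds only when the inflows stop, which is the "reallocation of wealth to early fund investors" the paper describes.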

I returned to the idea of "magic money pumps" in 2021's Stablecoins Part 2 as news came out that the DoJ had a criminal investigation into Tether. Now, Amy Castor and David Gerard's Tether and sanctions: what’s coming for Paolo’s beautiful launderette discuss law enforcement's increasing pressure on Tether because of its use in rampant sanctions evasion. They list Islamic Jihad, Hamas, Venezuela's oil industry and Russia's war on Ukraine as being funded via Tether:
Chainalysis found that stablecoins like Tether were used in the vast majority of crypto-based scam transactions and sanctions evasion in 2023.

TRM Labs concurred, saying that Tether was the most used stablecoin in illicit crypto flows in 2023. Tether on the Tron blockchain in particular had “cemented its position as the currency of choice for use by terrorist financing entities.”

The role of AI and machine learning in the future of libraries / HangingTogether

With thanks to Vincent Jordaan, OCLC, for translating the original English-language blog post into Dutch.

The following article is part of an ongoing series on the OCLC-LIBER program "Building for the future."

User walking through library stacks, with transparent imagery of thoughts, data, and more arching across shelves. Image generated using Adobe Firefly AI

The OCLC Research Library Partnership (RLP) and LIBER (an association of European research libraries) hosted a facilitated discussion on artificial intelligence (AI) and machine learning on 17 April 2024. The session was part of the ongoing Building for the future series, in which we explore how libraries are developing innovative services, as described in the LIBER Strategy 2023-2027.

Members of the OCLC RLP team worked with the LIBER working group to develop discussion questions and facilitate the group discussions, the same approach used in earlier conversations on research data management and data-driven decision making.

Participants from 31 institutions in 12 countries across Europe and North America took part in the session. This article summarizes the key points from the group discussions.

Curiosity, confusion, and uncertainty

We opened the session by asking participants how they felt about the use of AI and machine learning in libraries. Responses varied: participants were curious about and interested in the use and future of AI, but there was also skepticism and concern.

Word cloud reporting librarian feelings about AI, with interest, curiosity, uncertainty, and skepticism dominating

During breakout discussions, the following concerns came up:

  • Environmental impact from significant energy use;
  • Privacy of user data;
  • Use of copyrighted material in large language models (LLMs) and uncertainty about intellectual property;
  • Misinformation from the inaccuracies and hallucinations of generative AI;
  • Risks of malicious manipulation, particularly of voice recordings;
  • Dominance of the English language in LLMs;
  • Information overload.

Upskilling happens in isolation. Most people are developing their AI knowledge independently by experimenting with different tools. Nearly everyone in this discussion felt they were still in a phase of experimentation and learning, and there is a pressing need for more structure and support. A few participants emphasized how they benefited from a team approach, for example by establishing an AI working group in the library or by participating in facilitated discussions like this one.

Which AI tools are they using? We asked participants about the tools they use. ChatGPT, unsurprisingly, topped the list, followed by Microsoft Copilot. Other tools mentioned included Transkribus, eScriptorium, and DeepL, reflecting library interest in text and image transcription, analysis, and translation. Other products mentioned, such as Elicit, Gemini, ResearchRabbit, Perplexity, and Dimensions AI, align more with information discovery and research analysis.

Institutional context shapes library discussions about AI. Many participants noted a strong institutional focus on academic integrity. Organizations are developing policies and guidelines at different levels: locally, in consortia, or through associations. One example is the set of principles for the use of generative AI tools in education from the Russell Group, a consortium of research universities in the United Kingdom. These principles emphasize academic integrity and the role of universities in promoting AI skills and ensuring equal access to those skills for all members.

Research universities are beginning to offer enterprise services. Several US institutions are introducing local chatbots for use by faculty, staff, and students. Participants from the University of California, Irvine shared their experiences with ZotGPT, an institutionally supported tool built on the Microsoft Azure platform and offered to campus users at no cost. The tool can broaden access to experimentation while addressing privacy concerns, because data is processed locally. This is clearly an area where we can expect further growth.

Potential applications of AI in libraries

We asked participants to reflect on how libraries might make use of AI. They offered many ideas, which I have grouped into six use categories:

  • Metadata management;
  • User support;
  • Search functionality and content evaluation;
  • Transcription and translation;
  • Data analysis and user research;
  • Communication and outreach.

Metadata management topped the list. Several participants expressed interest in using machine learning models to generate MARC records, and we heard about a number of explorations in this area.

The National Library of Finland, for example, has experimented with automated subject indexing, resulting in the development of the Annif microservice. In the United States, LC Labs at the Library of Congress set up the Exploring Computational Description (ECD) project to test how effectively machine learning models can generate MARC record fields for ebooks. You can learn more in the recording of this OCLC RLP webinar.

Other participants described local efforts to use textual information to generate subject headings, as well as experiments with tools such as Gemini. Early results were disappointing because of an abundance of "fictitious data," but participants remain optimistic about the possibilities.

Beyond creating metadata, participants are also interested in how AI and machine learning technologies can be used to improve metadata quality, for example by detecting anomalies and duplicate records, or perhaps incorrect language coding in records. OCLC described its use of machine learning to identify duplicate records in WorldCat, with input from the cataloging community.

User support. Several attendees were interested in using artificial intelligence to build a library chatbot to help users: one that answers questions directly based on information from local web pages.

A participant from the University of Calgary briefly described how their library implemented a multilingual chatbot called T-Rex. The chatbot uses an LLM together with retrieval-augmented generation (RAG), a method in which a model is given additional information from a retrieval system in order to produce better text. The model draws on the library's website content, including LibGuides (online guides curated by librarians to help users find information on specific topics), opening hours, and more. The system has been in operation for over a year, and staff rate it positively because it reduces the need for human support on simple questions.[i]
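The RAG pattern described here can be sketched in a few lines. This is an illustrative toy, not the University of Calgary's actual system: the keyword retriever and prompt format are assumptions, and a real deployment would use embeddings and an LLM API.

```python
# Minimal sketch of retrieval-augmented generation (RAG): retrieve local
# content relevant to the question, then ground the model's answer in it.
docs = {
    "hours": "The library is open 8am-10pm on weekdays.",
    "printing": "Print from any campus computer; pay with your campus card.",
}

def retrieve(question, corpus):
    """Pick the document sharing the most words with the question (toy keyword retriever)."""
    q = set(question.lower().split())
    return max(corpus.values(), key=lambda d: len(q & set(d.lower().split())))

def build_prompt(question, corpus):
    """Build a prompt that tells the model to answer only from retrieved context."""
    context = retrieve(question, corpus)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("When is the library open?", docs))
```

The point of the design is that the generative model never answers from its training data alone, which is what keeps hallucinations down for questions about local facts like opening hours.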

Search functionality and content evaluation. Participants are also interested in how AI technologies can improve search, for example by enabling natural-language queries instead of keywords. We heard about some innovative projects at national libraries to improve the search experience, such as a chatbot that answers questions based on the digitized newspaper collection of the National Library of Luxembourg.

Researchers use a variety of tools, such as Scite, Consensus, ResearchRabbit, Perplexity, and Semantic Scholar, to summarize relevant findings from a wide range of sources, receive citation recommendations, and visualize research landscapes. The Generative AI Product Tracker, compiled by Ithaka S+R, serves as a handy guide to this sprawling ecosystem.

Participants also described how researchers are taking advantage of new AI features integrated into existing research indexes such as Scopus and Dimensions. Like the chatbot example above, these tools appear to take a hybrid approach, using retrieval-augmented generation (RAG) to query local indexes and generative AI to turn the retrieved information into an accurate answer to the original question, with a minimum of hallucinations.

Transcription and translation. Library staff are very interested in transcription tools that can increase the accessibility and use of cultural heritage collections. In the discussions we heard about speech-to-text experiments at the National Library of Norway and the Royal Danish Library using automatic speech recognition (ASR) technology.

Several participants mentioned using the Transkribus and eScriptorium platforms to support text recognition and image analysis of digitized historical documents. There is also interest in how these tools can support researchers working in languages they know only poorly.

Data analysis and user research was not at the top of the list, but several participants expressed interest in using data science and AI tools to learn more about user behavior, in the hope of improving library management.

Communication and outreach. One participant described how their library uses ChatGPT, with human review, to generate content for the library's social media feeds. This seems like a common use case that I expect to hear more about.

Supporting responsible practices

Participants discussed the need for responsible AI practices, with a particular focus on the importance of transparent, accountable, and inclusive AI. There was considerable attention to the need for transparency about the data sources of LLMs, including scrutiny of the legality of data scraped for use in training sets.

In addition to earlier reports such as Responsible Operations: Data Science, Machine Learning, and AI in Libraries from OCLC Research, many other research projects, workshops, and events are helping libraries make ethical decisions about AI. Some examples include:

Participants shared how they think libraries can take the lead, for example by chairing campus discussions on AI literacy, applications, and good academic practice. The LIBER Data Science in Libraries Working Group (DSLib) has discussed how libraries can respond to AI-generated disinformation and fake news.

A leading role for libraries in AI literacy

Libraries can play an important role in supporting education and training in AI literacy, and many libraries are already doing so. Many participants see this as an essential component of information literacy. Library staff will need to upskill quickly in order to train others.

What do libraries need to succeed?

During these conversations, participants described what libraries need in order to move forward successfully. At a minimum, library staff need access to the right tools and enough time to practice and experiment. Only then can they develop the expertise required to act as experts and lead their library's AI activities.

One participant emphasized, for example, that librarians need to be familiar with the limitations of LLMs, such as the generation of fake citations, in order to properly support users working with chatbots. There is also a need for more professionals with data analysis skills within the library, and these experts should be part of a multidisciplinary team. This echoes comments from an earlier session on the importance of data-driven decision making.

At the moment, skills development happens independently and ad hoc. Participants said they need more training, external support, and example use cases. They also want to collaborate meaningfully with others in communities of practice: groups of people with a shared interest who meet regularly to exchange knowledge and experience and so improve their skills and expertise.


Word cloud about "what excites you about the future of AI and libraries?" with answers like efficiency, accessibility, and collaboration. Librarians see a hopeful future for libraries and AI

These group discussions are valuable because they connect library professionals across different time zones. Some participants were reassured to find that others were experiencing the same uncertainties in the early stages of discovery and experimentation. Overall, participants reported feeling excited and hopeful about AI's potential to increase efficiency in libraries and save time.

Join us on Thursday, 6 June for the closing plenary event of the OCLC-LIBER series Building for the future. In this session we will summarize the key takeaways from the earlier discussions, followed by a panel discussion in which leading library practitioners share their views on how research libraries can collaborate in these challenging times. Registration is free. See you then.

[i] Julia Guy et al., “Reference Chatbots in Canadian Academic Libraries,” Information Technology and Libraries 42, no. 4 (December 18, 2023),

The post De rol van AI en machine learning in de toekomst van bibliotheken appeared first on Hanging Together.

Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 28 May 2024 / HangingTogether

The following post is one in a regular series on issues of Inclusion, Diversity, Equity, and Accessibility, compiled by a team of OCLC contributors.

Making presentations accessible for everyone

Image of a crowd of people raising their hands in front of a stage where a person is giving a speech. Photo by Jaime Lopes on Unsplash

June is a busy month for library conferences, and many librarians, including myself, are working on conference presentations. One aspect of preparation is making a presentation accessible. The accessibility checker in presentation software is a good place to begin with creating an inclusive presentation, but there are many accessibility issues, especially those not related to vision, that these tools do not address. Many universities and professional organizations have “accessibility tips” that provide useful but incomplete and sometimes vague information. For example, the 2024 ALA Annual Conference has a Presenter Resources page with some good advice like describing images verbally, but also says, “be inclusive of all attendees by avoiding jargon, slang, and assumed knowledge.” The natural question I asked myself is, “How do I avoid jargon in a presentation about cataloging?”

I found more specific guidance about language and other accessibility issues on various websites, so I’m sharing that information in this post. Jargon is part of my profession so I cannot avoid it, but I can avoid losing my audience by spelling out acronyms, as the National Digital Rights Network’s Accessibility Guidelines suggest. These guidelines also suggest using slide numbers in presentations, which was new guidance for me. I also discovered several helpful tips on the Web Accessibility Initiative’s Making Events Accessible Checklist. This checklist contains tips useful for both in-person and virtual presentations, such as avoiding blinking or flashing animations that could cause seizures and describing all relevant visual information in the environment, e.g., describing that half the audience raised their hands in response to a question. An audience member may have a disability that impacts their vision, hearing, movement, speaking, or understanding, so all of these should be considered in making presentations accessible. Contributed by Kate James.

Support for survivors of domestic and sexual violence

Because libraries so often serve as a central place for gathering, sharing, and communicating, they can also help to identify, support, and refer community members who may be experiencing various forms of domestic and sexual violence. Victim Witness Advocate Miranda Dube, who is a former academic librarian and co-editor of “LIS Interrupted: Intersections of Mental Illness and Library Work,” will present a free ninety-minute webinar “Supporting domestic and sexual violence survivors at your library” 6 June 2024, 3:00-4:30 p.m. Eastern. Those who attend the WebJunction webinar will learn about recognizing signs of such abuse, when and how to offer support, and identifying state and local resources for assistance and collaboration.

Miranda Dube combines her firsthand experience helping survivors in the AmeriCorps Victim Assistance Program with her library work to promote services and resources for a too-often overlooked population. As she has written, her intention is to foster library environments that both offer real-life help and enable survivors to avoid “revictimization.” Contributed by Jay Weitz.

University of Michigan extends borrowing privileges to Native and Indigenous people

Earlier this month, the University of Michigan Library (OCLC Symbol: EYM) announced it would extend free borrowing privileges to Native and Indigenous people.  This is related to a new territorial acknowledgement put forward by the library.

The extension of borrowing privileges to Native and Indigenous people by the University of Michigan Library marks an effort to acknowledge and redress the failure of the university to honor an 1817 treaty which ceded land to the state in part for the future education of Anishinaabe people. The territorial acknowledgement also notes that not all tribes in Michigan benefit from state recognition, something that is also common in California where I live. Acknowledging institutional harms and pairing that acknowledgement with meaningful action are both examples of steps that libraries can take in moving forward. Contributed by Merrilee Proffitt.

Queer Liberation Library makes diverse, LGBTQ+ literature accessible throughout U.S.

On 3 May 2024, the Windy City Times reported that the Queer Liberation Library (QLL, pronounced “Quill”) launched a free online LGBTQ+ library with more than a thousand ebooks and audiobooks that anyone in the United States can access. Users sign up for a virtual library card at the website. Once their application is approved, they can access items from the library’s collection on Libby, a free app that libraries use to distribute online materials to their patrons. QLL was founded by a small group of volunteers who wanted to ensure queer reading materials were accessible to people throughout the country, regardless of what is available at their local libraries. It took nearly two years to raise funds, create a website, and build out the resource. Organizers chose to create a digital library because it required fewer resources to launch and the collection would be more widely accessible. More than 40,000 people currently use the library. Organizers are “committed to curating a collection that reflects the diversity of queer lives and imaginations,” according to the library’s website. “It is a simple fact that more books have been published about cis gay men than aromantics or intersex people, for example. Knowing this, we will actively seek out materials from all parts of the LGBTQ+ community, to resist replicating the historical and ongoing bias within the publishing world.”

As public and school libraries find it more difficult to collect and share LGBTQ+ literature due to bans and threats, librarians are using non-traditional ways to get these materials to patrons. “With the current climate and book bans and lack of access, there’s a need that we’re happy to fulfill,” said volunteer Amber Dierking. “But there’s also just such a delight and joy to be able to do something like this. So, we’re not just filling a need but also having fun with it along the way.” Contributed by Morris Levy.

The post Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 28 May 2024 appeared first on Hanging Together.

Library of Congress: Designing Storage Architectures 2024 / David Rosenthal

I participated virtually in the 2024 Library of Congress Designing Storage Architectures for Digital Collections meeting. As usual, there were a set of very interesting talks. The slides from the presentations are now online so, below the fold, I discuss the talks I found particularly interesting.

NAND, HDD and Tape Storage Technology Trends

As has become traditional, IBM's Georg Lauhoff and Gary M Decad presented a detailed overview of the storage market. Five slides are of particular interest.

The first is their log-linear graph of the progress of areal density of hard disk (HDD), tape and NAND flash over the last three decades:
  • Tape has improved its areal density at a very consistent ~27%/year. This has been possible partly because of improvements in both media and heads but, as the graph shows, primarily because the bits on tape are currently about the size that bits on hard disk were two decades ago. This gives tape a lot of headroom before it runs into the physical limits.
  • HDD was improving its areal density at around 35%/year until, around 2010, it got too close to the physical limits. Since then it has been growing at around 6%/year, despite continual Panglossian predictions from the industry.
  • NAND, in which category IBM includes everything from enterprise SSDs to SD cards, saw explosive growth from the late 1990s to the late 2000s, growing about 3 orders of magnitude in around a decade. It then slowed to a mere 32%/year, which nonetheless is significantly faster than the other media.
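Tape's headroom can be quantified with a quick compound-growth calculation. The 100x density gap below is an illustrative figure for the "two decades behind" headroom, not a number from the slides:

```python
import math

def years_to_close_gap(density_ratio, annual_growth):
    """Years of compound areal-density growth needed to close a given density ratio."""
    return math.log(density_ratio) / math.log(1.0 + annual_growth)

# At ~27%/year, a 100x density gap between tape and HDD closes in roughly
# two decades, which is why tape is not yet near the physical limits:
print(round(years_to_close_gap(100, 0.27), 1))  # ~19.3 years
```

The same function shows why HDD's situation is different: at 6%/year even a single doubling takes about twelve years.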
IBM's second pair of graphs shows (a) a log-linear plot of the Exabytes shipped each year since 2008, and (b) the percentage of the total represented by each medium:
  • The graph of Exabytes shipped shows NAND growing three times faster than the other media, that tape's growth rate is again very consistent, and that drops in HDD shipments for 2022 and 2023 have made the total shipments for HDD and NAND about the same.
  • The market share graph shows tape maintaining a small share slightly eroded by HDD. But NAND is rapidly eroding HDD's share.
The next slide shows (a) a log-linear plot of the cost of a terabyte over time (the Kryder rate), and (b) the ratio between NAND and HDD, and HDD to tape:
  • The Kryder rate graph shows both NAND's and HDD's rate slowing in the late 2000s. Tape again has a more consistent rate.
  • The cost ratio graph shows why NAND is rapidly eroding HDD's market share, as its cost disadvantage is decreasing exponentially. The cost ratio between tape and HDD has been fairly constant, which likely explains why tape's market share has suffered slight erosion.
Their table of 2023 costs shows that NAND is more than 3 times as expensive as HDD. But this is an average across the whole range of NAND and HDD markets. Much NAND goes into market segments where it does not compete with HDD, such as SD cards, USB drives and phones. Whereas most HDD is 3.5" drives in PCs and data centers, where it competes only with enterprise flash. So in markets where they compete, the cost differential will be substantially higher.

They have two slides that echo topics I have posted about fairly often. The first points out that industry projections of future areal density (and thus cost) routinely exaggerate the rate of growth by at least 10%. This is why I frequently report on how happy the good Dr. Pangloss is with the storage industry.

The second of them expresses skepticism about the prospect of DNA storage impacting the market. Their cost graph shows that the cost to write a terabyte of DNA is around 10 orders of magnitude more than for HDD, and the cost to read it once is around 3 orders of magnitude more than buying a terabyte of HDD. They write:
  • Very slow to read and write.
  • Very expensive.
  • Large Market size needed to develop technology; a sub-tier below tape storage is small.
I pointed this out in 2018's DNA's Niche In The Storage Market. Even if projections about the increase in demand for archival storage pan out, it will be hard for new media to compete with tape.

They also write something mystifying — "Not so stable", citing DNA Data Storage by Tomasz Buko, Nella Tuczko, and Takao Ishikawa of the University of Warsaw. This paper contends that DNA is vastly more stable than tape, or any current medium:
For years, a DNA specimen collected from a 700,000-year-old horse was considered to be the oldest extracted DNA. However, in 2021, this record was pushed to 1 million years. DNA extracted from mammoth teeth was successfully extracted and sequenced. Additionally, scientists managed to sequence 300,000-year-old mitochondrial DNA from humans and bears. These examples perfectly illustrate the longevity of DNA and proves its usefulness for archeological purposes or data storekeeping. If stored in optimal conditions and dehydrated, DNA can possibly endure for millions of years.

Seagate Storage Update

Jon Trantham's look at the hard disk market was interesting. First, his graph of demand for nearline HDDs explains why the industry suffered hard times recently and why they believe the future looks brighter. A period of rapid growth led to an inventory buildup from exuberant demand forecasts, but the inventory has been depleted and demand is now rising. One might be skeptical of the rate of demand recovery in the graph, but at least the US economy suggests rising demand is plausible.

His slide on IDC's market forecast supports my contention that IBM's cost ratio between NAND and HDD is misleading. The non-cloud HDD market is projected to be a small proportion of the total, and to grow slowly. The vast bulk of the HDD market is for cloud storage, and thus the effective cost ratio is between HDD and enterprise SSDs. This would be much greater than IBM's overall estimate of 3.

Seagate projects that the capacity per platter will rise from today's 2.4TB to over 5TB by 2026 (see Dr. Pangloss and IBM) using their Mozaic HAMR technology. Current 16TB drives have 9 1.78TB platters. Seagate recently started shipping 30TB drives with 10 3TB platters. If they are right a 10 platter drive in 2026 would be 50TB, or around 3x the current drive capacity. This would certainly help maintain HDD's market share.
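The arithmetic behind that projection, using only the capacities quoted in the paragraph above:

```python
# Drive capacity = platters x per-platter capacity (figures from Seagate's talk).
platters = 10
tb_per_platter_now = 3.0    # current 30 TB drives
tb_per_platter_2026 = 5.0   # projected Mozaic HAMR capacity

drive_2026 = platters * tb_per_platter_2026
print(drive_2026)                 # 50.0 TB
print(round(drive_2026 / 16, 2))  # ~3.1x a common current 16 TB drive
```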

Some additional topics:
  • Seagate is working to move HDDs to the NVMe interface, to simplify and speed up datacenter systems.
  • Their sustainability efforts include trying to avoid customers shredding drives at their end-of-life. Shredding is driven by customers' data-security requirements. But Seagate encrypts the data on the platters and securely erases it by overwriting the key; the problem is to persuade customers that this satisfies their requirement. Seagate also wants to recover the rare earth magnets by industrial-scale disassembly of the drives.
  • Given a target of 98% drive reliability, as the number of platters increases the number of heads increases and thus the required reliability of the heads increases. Seagate is advocating that, in the case of head failure, the drive remains in use with reduced capacity. By provisioning a pool of spare drives, and avoiding failing an entire drive if a single head fails, the cost of ownership can be significantly reduced. 40-60% of all drive failures are single head failures.
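The relationship between head count and required head reliability can be made concrete. This sketch assumes heads fail independently, which is a simplification:

```python
def required_head_reliability(drive_target, n_heads):
    """Per-head reliability needed so that n independent heads jointly meet the drive target."""
    return drive_target ** (1.0 / n_heads)

# A 98% drive-reliability target, with two heads per platter:
for platters in (9, 10):
    heads = 2 * platters
    print(heads, round(required_head_reliability(0.98, heads), 5))
```

Each extra platter tightens the per-head requirement, which is why tolerating a single head failure (rather than failing the whole drive) becomes increasingly attractive as platter counts grow.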
Western Digital's presentation also focused on sustainability.

Design and Operation of Exascale Archives in Azure

The first slide of Microsoft's Aaron Ogus and Shashidar Joshi's talk shows why cloud systems like Azure are dominating the industry.

Their observations from the experience of running Exabytes of archival storage containing trillions of objects and servicing billions of requests each month are:
  • Writes dominate
  • Reads are infrequent
  • Small reads require low latency random access
  • Large reads require good throughput
  • Archive Storage system needs dynamic provisioning to account for workload variances
All the early studies showed that archival storage workloads are write-dominated, because most archival data is written and very rarely accessed; that is why it has been archived. It is reassuring to know that this is still true in the cloud era. Given this, and that the Service Level Agreements for cloud archives do not require low latency, it isn't clear why Microsoft thinks low latency is important.

The challenges Microsoft sees with their current technologies are:
  • Mechanical overheads lead to latencies
  • Environmental conditions limit deployment capabilities
  • Uncertainty with roadmaps, capacities and costs
  • Need for media migrations at EOL
  • Opportunity for new storage technologies
The environmental requirements of current technologies were one reason for Facebook's use of optical storage: it only needed warehouse space, not a data center. Anecdotal reports at the meeting revealed serious issues with moving tapes between environments.

The uncertainties are always with us, not least because the economics of long-term storage depend strongly on interest rates. The "need for media migration at EOL" is one major motivation for the use of quasi-immortal media, such as Project Silica. But this is illusory. As I wrote about Facebook's optical cold storage:
No-one expects the racks to sit in the data center for 50 years, at some point before then they will be obsoleted by some unknown new, much denser and more power-efficient cold storage medium.
Earlier this year I discussed Microsoft Research's view of the "opportunity for new storage technologies" in Microsoft's Archival Storage Research. They are pursuing two technologies, DNA storage, and Project Silica. Ogus and Joshi discuss Project Silica, claiming that it provides "Performance (Random IO) per TB metric better than currently available Archive technologies". But part of the design of Project Silica is that the write and read drives are different, and thus:
This allows independent scaling of read and write throughput.
Their claim depends upon the system being configured with enough write (if they meant I) or read (if they meant O) drives.
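A toy model makes the dependence explicit: when read and write drives are separate, random-IO-per-TB is whatever the operator chooses to provision, not a property of the medium. The drive counts, per-drive IOPS figure, and capacity below are illustrative assumptions, not published Project Silica numbers.

```python
def io_per_tb(drives, iops_per_drive, capacity_tb):
    """Aggregate random IOPS per stored TB for a pool of identical drives."""
    return drives * iops_per_drive / capacity_tb

# Hypothetical 10 PB archive with 50 random reads/s per read drive:
capacity = 10_000  # TB
print(io_per_tb(20, 50, capacity))    # 0.1 reads/s per TB
print(io_per_tb(200, 50, capacity))   # 1.0 -- 10x the drives, 10x the metric
```

The "Performance (Random IO) per TB" comparison thus depends entirely on the assumed drive-to-capacity ratio in each configuration.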

The Data Storage Industry Gets Ready for AI

Fred Moore made the important point that archival storage faced a significant risk of a Vertical Market Failure (VMF):
  • The Zettabyte scale secondary storage market (cold, archive) has become the exclusive domain of few suppliers.
  • IBM is the only (1) tape drive developer/supplier controlling the entire tape ecosystem specifications.
  • Fujifilm and Sony are the only (2) LTO tape media suppliers.
  • HPE, IBM, Quantum and Spectra are the primary large-scale tape and library suppliers.
  • Seagate, Toshiba and WD are the only (3) remaining HDD suppliers.
  • (Tape - WD), (HDD - Seagate and TDK) are the only R/W head manufacturers.
  • Will current HDD and tape development roadmaps keep pace with demand?
  • HSDCs leverage their bargaining and buying power to drive down prices impacting vendor margins, R&D investments.
  • In the event of a secondary storage VMF, sustainability challenges will become insurmountable for HDDs to address.
  • As supplier profit margins become insufficient, future R&D funding, roadmaps, will place innovation at significant risk.
Archival storage is a very small part of the total storage industry. NAND and HDDs pay for their R&D in the much bigger online and nearline markets. Tape leverages a little of that, in that its head technology is based on HDD head technology, but in general has to fund R&D solely from the archive market. And so will potential novel archival storage technologies, such as DNA and Project Silica. Fundamentally, companies don't want to invest in archival storage because it doesn't generate income, so the market isn't just small, but under significant margin pressure. This is why I wrote Archival Media: Not a Good Business six years ago. If and when NAND eventually displaces HDD for nearline storage, the reduced market will definitely cause a VMF, with knock-on effects on tape.

Join us for the Inaugural Webinar of OPEN GOES COP, a movement advocating for openness in the UN Climate Change Conferences / Open Knowledge Foundation

OPEN GOES COP is a coalition of organisations and individuals aiming to advocate for openness in the context of the UN Climate Change Conferences (COP). We aim to overcome the lack of discussion on the role of ‘openness’ as a necessary condition for addressing the climate crisis and to build the capacity of open movement activists and stakeholders from civil society and academia to influence high-level decisions on related issues.

In this Inaugural Webinar, we call on all those interested in joining the coalition to develop common strategies and work together to make information and materials freely available.

There will be an introduction to the COP processes followed by a brief introduction to the aims of the coalition and a short statement from each participating organisation. The meeting will then be open and conversational to decide together on the next steps. 


Speakers:
  • Adam Yakubu – Institute for Energy Security (IES)
  • Sara Petti and Maxwell Beganim – Open Knowledge Network
  • Monica Granados – Open Climate Campaign
  • Otuo-Akyampong Boakye – Wiki Green Initiatives

The Open Goes COP Inaugural Webinar will be held in English. Future meetings in other languages are in the planning stage.

If you’re working at the intersection of openness and climate change, please come! It will only work if more people and organisations get involved.

At this early stage, the coalition is convened by Wiki Green Initiatives, Open Knowledge Network, Open Climate Campaign, and Open Data Charter.

Event details:

  • 🗓 5 June 2024 (World Environment Day)
  • 🕒 3 pm UTC/GMT
  • 📍 Online (Zoom)

European Tour / Jonathan Brinley

Several years ago, Modern Tribe invited me on the annual team trip. Instead of the “usual” Central American/Caribbean destination, we were meeting in Tuscany. After the trip, Stephanie would join me in Rome, where we would have a few days to explore before heading home. This adventure was scheduled for May 2020—it didn’t happen.

Jump forward four years. The kids are teenagers, finances are more flexible, the world is open again. We decided it’s time to try again, this time as a big adventure for the whole family. In May 2024, we went on the Brinley Family Grand European Tour, a 17-day trip with stays in Rome, London, Paris, and Munich.

I will not herein attempt to capture the entire journey. Rather, I want to highlight a handful of experiences and create a space to share selected photographs. (Stephanie has her own selection of photos over at Laughter & Dance.)

Notable Sights

Great Food

Everything was delicious. Our first evening in Rome included a food tasting tour, and a few friends recommended restaurants. Otherwise we just searched Google Maps for nearby restaurants with 4.5+ ratings.

Magnificent Organs

We encountered a plethora of churches/cathedrals/chapels filled with beautiful organs, although we only had the opportunity to hear two of them: in St. Paul’s Cathedral for the Sunday eucharist, and a short concert in the Salzburg Cathedral that featured three of the seven(!) organs therein.

Odds & Ends

"Sufficiently Decentralized" / David Rosenthal

Mining Pools 5/17/24
In June 2018 William Hinman, the Director of the SEC's Division of Corporate Finance, gave a speech to the Yahoo Finance All Markets Summit: Crypto entitled Digital Asset Transactions: When Howey Met Gary (Plastic) in which he said:
when I look at Bitcoin today, I do not see a central third party whose efforts are a key determining factor in the enterprise. The network on which Bitcoin functions is operational and appears to have been decentralized for some time, perhaps from inception.
Over time, there may be other sufficiently decentralized networks and systems
Below the fold, thanks to a tip from Molly White, I look at recent research suggesting that there is in fact a "central third party" coordinating the enterprise of Bitcoin mining.

I have been pointing out that the crypto-bros' claims of decentralization are false for more than a decade, most recently in Decentralized Systems Aren't. In that talk I quoted Vitalik Buterin from 2017 in The Meaning of Decentralization:
In the case of blockchain protocols, the mathematical and economic reasoning behind the safety of the consensus often relies crucially on the uncoordinated choice model, or the assumption that the game consists of many small actors that make decisions independently. If any one actor gets more than 1/3 of the mining power in a proof of work system, they can gain outsized profits by selfish-mining. However, can we really say that the uncoordinated choice model is realistic when 90% of the Bitcoin network’s mining power is well-coordinated enough to show up together at the same conference?
Blackburn et al Fig. 5c
Coordination among Bitcoin miners has a long history. In 2022 Alyssa Blackburn et al's Cooperation among an anonymous group protected Bitcoin during failures of decentralization showed that Bitcoin's centralization problem dated back to its earliest days. They were able to:
estimate the effective population size of the decentralized bitcoin network by counting the frequency of streaks in which all blocks are mined by one agent (bottom-left) or two agents (bottom-right). These are compared to the expected values for idealized networks comprising P agents with identical resources. The comparisons suggest an effective population size of roughly 5, a tiny fraction of the total number of participants.
Bitcoin started in 2009 with one miner (Nakamoto) and two years later it was dominated by five miners. It has been dominated by 5 or fewer mining pools ever since.
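Blackburn et al's streak statistic is straightforward to reproduce in miniature: simulate an idealized network of P equal miners and count how often k consecutive blocks come from a single agent. An observed streak frequency well above the idealized value implies a smaller effective population. This is a minimal sketch of the statistic, not their actual pipeline.

```python
import random

def streak_fraction(miners, k):
    """Fraction of k-block windows mined entirely by a single agent."""
    n = len(miners) - k + 1
    return sum(len(set(miners[i:i + k])) == 1 for i in range(n)) / n

# Idealized network: P equal agents, each equally likely to mine any block.
# The chance a k-window belongs to one agent is (1/P)**(k-1).
random.seed(0)
P = 5
chain = [random.randrange(P) for _ in range(200_000)]
print(streak_fraction(chain, 2))   # close to (1/5)**1 = 0.2
```

Comparing real-chain streak frequencies against these idealized curves for various P is what yields the "effective population size of roughly 5".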

In 2014's Economies of Scale in Peer-to-Peer Networks I wrote:
When new, more efficient technology is introduced, thus reducing the cost per unit contribution to a P2P network, it does not become instantly available to all participants. As manufacturing ramps up, the limited supply preferentially goes to the manufacturers best customers, who would be the largest contributors to the P2P network. By the time supply has increased so that smaller contributors can enjoy the lower cost per unit contribution, the most valuable part of the technology's useful life is over.
In December 2021 Alex de Vries and Christian Stoll estimated that:
The average time to become unprofitable sums up to less than 1.29 years.
It has been obvious since mining ASICs first hit the market that, apart from access to cheap or free electricity, there were two keys to profitable mining:
  1. Having close enough ties to Bitmain to get the latest chips early in their 18-month economic life.
  2. Having the scale to buy Bitmain chips in the large quantities that get you early access.
And it wasn't just Buterin that noticed that the big mining pools were "well-coordinated". In 2021's Blockchain Analysis of the Bitcoin Market Igor Makarov & Antoinette Schoar wrote:
Six out of the largest mining pools are registered in China and have strong ties to Bitmain Technologies, which is the largest producer of Bitcoin mining hardware.
Protos provides much better evidence of just how "well-coordinated" the big pools are in New research suggests Bitcoin mining centralized around Bitmain:
A sleuth found a clue in Antpool’s block template: A manually prioritized transaction immediately after the 6.25 BTC block reward or ‘coinbase’ transaction. This new research by pseudonymous Bitcoin developer 0xb10c seemingly confirms long-rumored practices by Antpool hiding its massive operation under the names of ostensibly independent pools.

In short, it warns that despite tens of thousands of decentralized nodes, Bitcoin might actually be quite centralized from a mining perspective.
There are two sleuths involved, discovering two kinds of evidence. First:
0xb10c detected that Pool, Binance Pool, Poolin, EMCD, and Rawpool show signs of using Antpool’s method for prioritizing the post-coinbase transaction.

Antpool might also use a sixth pool, Braiins, but 0xb10c was still analyzing its merkle branches as of the research publication time. Nearly identical merkle branches might indicate that these five or six pools often use the exact same template as Antpool for selecting transactions to include in a block.

In other words, all of these pools often use Bitmain’s machines, often assemble transactions according to Bitmain’s block template, often prioritize the same manually-configured post-coinbase transaction as Bitmain, and often send coinbase and transaction fees to the same custodian as Bitmain.
Second, mononaut discovered:
A single custodian now controls the coinbase addresses of at least 9 pools, representing 47% of total hashrate.
Mononaut traced coinbase rewards from mining pools AntPool, F2Pool, Binance Pool, Braiins, BTCcom, SECPOOL, Poolin, ULTIMUSPOOL and 1THash, and Luxor. He found suspicious levels of cooperation from these supposedly competitive entities in allocating coinbase rewards to a shared — possibly Antpool-controlled — custodian.

0xb10c couldn’t confirm that SECPOOL and SigmaPool entirely cloned AntPool’s template, although they seemed to share a similar template. In all, it seems unlikely that up to nine major bitcoin mining pools use a shared custodian for coinbase rewards unless a single entity is behind all of their operations.
Thus it appears that, instead of being controlled by 3 large mining pools, Bitcoin's blockchain is actually controlled by a single huge mining pool operating through a set of subsidiaries. And that this pool is controlled by Bitmain.
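Mononaut's custodian analysis reduces, in outline, to inverting the pool-to-payout-address mapping and flagging addresses that receive coinbase rewards from several supposedly competing pools. A minimal sketch follows; the pool names are from the research, but the addresses are made up for illustration.

```python
from collections import defaultdict

# Hypothetical coinbase payout addresses observed for each pool:
payouts = {
    "AntPool":      {"bc1qshared", "bc1qant1"},
    "Binance Pool": {"bc1qshared", "bc1qbin1"},
    "F2Pool":       {"bc1qshared"},
    "Foundry USA":  {"bc1qfoundry"},
}

# Invert the mapping: which pools pay into each address?
custodians = defaultdict(set)
for pool, addrs in payouts.items():
    for addr in addrs:
        custodians[addr].add(pool)

# An address collecting rewards from several "competing" pools is a
# candidate shared custodian.
shared = {addr: pools for addr, pools in custodians.items() if len(pools) > 1}
print(sorted(shared))  # ['bc1qshared']
```

On the real chain the hard part is clustering addresses in the first place; once that is done, the shared-custodian signal falls out of a grouping like this one.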

From Bitmain's point of view, this makes a lot of sense. They have essentially one product, mining rigs. Controlling the mechanism through which the bulk of their customer base is "well-coordinated" would be a big help in generating consistent excess profit.

The image shows a 4-day mining history. Extracting the pools mentioned in the research, we have this table:
Blocks Mined 5/13-17 by suspects:
  • Binance Pool: 3.880%, 22, 1,474
  • Braiins Pool: 2.293%, 13, 871
In 4 days there should be 576 blocks. 40.744% of 576 is 235 blocks, so close enough. There are some pools mentioned that don't appear in the history (SECPOOL, ULTIMUSPOOL, 1THash, EMCD, Luxor). Equally, there may be "well-coordinated" pools missing from the research. So Bitmain does appear to control significantly more power than the biggest single pool. Foundry USA controls 31.746%, and together with Bitmain's collaborators controls 72.49% of the hashing power. The Bitmain pools are mining almost $4M/day at today's "price".
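The arithmetic behind the "576 blocks" check is just Bitcoin's 10-minute block target:

```python
# Bitcoin targets one block every 10 minutes: 144 per day, 576 in 4 days.
blocks_per_day = 24 * 60 // 10
window_blocks = 4 * blocks_per_day
suspects_share = 0.40744  # combined share of the pools named in the research
print(window_blocks)                          # 576
print(round(suspects_share * window_blocks))  # 235
```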

But we should not worry that the Bitcoin blockchain is even less decentralized than it has been all along. It is in safe hands. Bitmain isn't going to kill the goose that lays the golden eggs.

It is only fair to point out that the Ethereum community has actually improved decentralization slightly. A year ago the top 5 staking pools controlled 58.4%, now they control 44.7% of the stakes. But it is still true that block production is heavily centralized, with one producer claiming 57.9% of the rewards.

No-one really cares that cryptocurrencies are actually centralized; they care that they are seen as decentralized. In Deconstructing ‘Decentralization’: Exploring the Core Claim of Crypto Systems Prof. Angela Walch explains why this appearance is important:
the common meaning of ‘decentralized’ as applied to blockchain systems functions as a veil that covers over and prevents many from seeing the actions of key actors within the system. Hence, Hinman’s (and others’) inability to see the small groups of people who wield concentrated power in operating the blockchain protocol. In essence, if it’s decentralized, well, no particular people are doing things of consequence.

Going further, if one believes that no particular people are doing things of consequence, and power is diffuse, then there is effectively no human agency within the system to hold accountable for anything.
In other words, it is a means for the system's insiders to evade responsibility for their actions.

In Decentralized Systems Aren't I pointed out that:
The fact that the coins ranked 3, 6 and 7 by "market cap" don't even claim to be decentralized shows that decentralization is irrelevant to cryptocurrency users. Numbers 3 and 7 are stablecoins with a combined "market cap" of $134B. The largest stablecoin that claims to be decentralized is DAI, ranked at 24 with a "market cap" of $5B.
I rest my case.

Imagining library futures using AI and machine learning / HangingTogether

The following post is part of an ongoing series about the OCLC-LIBER “Building for the future” program. A Dutch version of this blog post is also available.

User walking through library stacks, with transparent imagery of thoughts, data, and more arching across shelves. Image generated using Adobe Firefly AI.

The OCLC Research Library Partnership (RLP) and LIBER (Association of European Research Libraries) hosted a facilitated discussion on the topic of AI and machine learning on 17 April 2024. This event was a component of the ongoing Building for the future series exploring how libraries are working to provide state-of-the-art services, as described in LIBER’s 2023-2027 strategy.

As with the previous sessions in the series, on the topics of research data management and data-driven decision making, members of the OCLC RLP team collaborated with LIBER working group members to develop the discussion questions and support small group discussion facilitation.

The virtual event was attended by participants from 31 institutions across twelve countries in Europe and North America, and this post synthesizes key points from the small group discussions.

Curiosity, confusion, and uncertainty

We kicked off the event by asking participants how they feel about the use of AI and machine learning in libraries, and they responded with a range of complex emotions. While curious about and interested in the uses and future of AI, librarians are also skeptical and apprehensive.

Word cloud reporting librarians’ feelings about AI, with interest, curiosity, uncertainty, and skepticism dominating.

In the small group discussions, participants expressed significant concerns about:

  • Environmental impacts due to significant energy usage
  • Privacy of user data
  • Use of copyrighted materials in LLMs and uncertainty about intellectual property ownership
  • Misinformation created by the inaccuracies and hallucinations delivered by generative AI
  • Risks of nefarious manipulations, particularly of voice recordings
  • English language dominance in LLM models
  • The ability to acquire relevant and usable information amidst intense information overload

Upskilling is lonely work. Most people are acting independently to develop their own AI knowledge through experimentation with an array of tools, and virtually everyone participating in these discussions reported being in the experimentation and learning phase. More structure and support are sorely needed, and a few participants described how they had benefited from a team approach, such as through the establishment of an AI interest group in their library, or by participating in facilitated discussions such as this one.

What’s in their AI tool kit? We asked participants about the tools they are using, and ChatGPT unsurprisingly dominated the list, followed by Microsoft Copilot. Mention of tools like Transkribus, eScriptorium, and DeepL reflect library interests in text and image transcription, analysis, and translation, while a long tail of products like Elicit, Gemini, ResearchRabbit, Perplexity, and Dimensions AI reflect an interest in research discovery and analysis.

Discussions about AI in libraries are strongly influenced by their institutional contexts. Many participants described a pervasive institutional focus on concerns about academic integrity. Policies and guidelines are emerging at local, consortial, and association levels, such as the principles on the use of generative AI tools in education from the Russell Group of research universities in the United Kingdom, which emphasize not only academic integrity but also the role of universities in supporting AI literacy and equitable access among their affiliates.

Research universities are beginning to provide enterprise services. A few US institutions are launching local chatbots for use by faculty, staff, and students. Participants from the University of California-Irvine shared about the institutionally-supported ZotGPT, built upon the Microsoft Azure platform, which is provided to campus users at no cost. By providing a local tool, the institution can equalize access to experimentation while also overcoming privacy concerns, as the data inputs remain local. This is almost certainly an area we will see more growth in.

AI use cases in libraries

We asked participants to consider the ways libraries can leverage AI, resulting in a rich mine of potentialities, which I have organized into six high-level use case categories:

  • Metadata management
  • Reference support
  • Discovery and content evaluation
  • Transcription and translation
  • Data analytics and user intelligence
  • Communications and outreach

Metadata management topped the list. We heard several participants mention an interest in using machine learning models to create MARC records, and indeed, we heard numerous examples of exploration in this area. For example, the National Library of Finland has experimented with automated subject indexing, resulting in the Annif microservice. In the United States, LC Labs at the Library of Congress has undertaken a project called Exploring Computational Description (ECD) in order to test the effectiveness of machine learning models in the creation of MARC record fields from ebooks. You can learn more via this recorded OCLC RLP webinar. Other participants described local efforts to use textual information to generate subject headings as well as experiments to use tools like Gemini. Participants found their early results disappointing, as they mostly offered a lot of “fictional data,” but still remained optimistic about the potentialities.

In addition to metadata creation, participants are interested in how AI and machine learning technologies may be used to improve metadata quality. This could include anomaly and duplicate record detection or perhaps detection of incorrect coding of languages in records. OCLC has shared about its use of machine learning to identify duplicate records in WorldCat, with input from the cataloging community.

Reference support. Several participants expressed an interest in leveraging AI to create a library reference chatbot that can instantly answer questions covered by information on local web pages. A participant from the University of Calgary briefly shared how their library has implemented a multilingual reference chatbot called T-Rex, which leverages both an LLM and retrieval-augmented generation (RAG), and is trained on the library’s own web content, including LibGuides, operating hours, and much more. In operation for over a year, the effort has been successful and appreciated by librarians, as it has reduced the amount of human support required for simple questions.[1]

Discovery and content evaluation. Participants are also interested in how AI technologies can enhance discovery, for example, by enabling searching with natural language phrases in addition to keywords. We heard about some innovative projects at national libraries to support discovery use cases, such as a chatbot answering questions from the digitized newspaper collection at the National Library of Luxembourg.

Researchers are using a number of freestanding tools like scite, Consensus, ResearchRabbit, Perplexity, and Semantic Scholar in order to summarize relevant findings from aggregated content, receive citation recommendations, and visualize research landscapes. The Generative AI Product Tracker compiled by Ithaka S+R offers a useful guide to this ecosystem. In addition, participants described researcher uptake of new AI functionality being built into existing research indexes like Scopus and Dimensions. Like the reference chatbot example above, it appears that these tools use a combination of retrieval-augmented generation (RAG) to query only the local index and generative AI, which processes the returned information into an answer to the original question while minimizing hallucinations.
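The RAG pattern described here, for both the reference chatbot and the index-querying tools, can be sketched in a few lines: retrieve the most relevant local documents, then build a prompt that constrains the model to that context. The word-overlap scoring below is a deliberate simplification; real systems use embedding search, and the final step would be a call to a hosted LLM.

```python
def retrieve(query, docs, k=2):
    """Rank docs by naive word overlap with the query; return the top k."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)[:k]

def build_prompt(query, docs):
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Stand-ins for a library's scraped web pages (hours, LibGuides, etc.):
library_pages = [
    "The library is open 9am to 9pm on weekdays.",
    "Interlibrary loan requests take 3 to 5 days.",
    "Study rooms can be booked online.",
]
prompt = build_prompt("what are the library opening hours", library_pages)
# The hours page lands in the prompt; a hosted LLM would then generate the
# answer from this constrained context, which is what limits hallucination.
```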

Transcription and translation. Librarians are keenly interested in transcribing tools, which can increase the accessibility and use of cultural heritage collections. In the discussions, we heard about speech-to-text experimentation (using automatic speech recognition (ASR) technology) taking place at the National Library of Norway and the Royal Danish Library. Several participants mentioned using the Transkribus and eScriptorium platforms to support text recognition and image analysis of digitized historical documents. There’s also interest in how these tools can support researchers working in languages where they have poor proficiency.

Data analytics and user intelligence. While not at the top of the list, more than one participant expressed an interest in using data science and AI tools to learn more about patron behaviors in order to support improved library management.

Communications and outreach. One participant described how their library is using ChatGPT to generate content for library social media feeds, with human review. This seems like a general purpose use case that I expect to hear more about.

Supporting responsible operations

Participants discussed the need for responsible AI practices, particularly the need for AI to be transparent, accountable, and inclusive. There was considerable focus on the need for transparency about LLM data sources, including examination of the legality of data scraped for use in training sets. In addition to earlier reports like OCLC Research’s Responsible Operations: Data Science, Machine Learning, and AI in Libraries, many other research projects, statements, workshops, and events are emerging to guide libraries in ethical decision making about AI.

Participants shared their thoughts on how libraries can lead, which included chairing campus discussions about AI literacy, uses, and good academic practices. The LIBER Data Science in Libraries Working Group (DSLib) has been discussing how libraries can interact with AI-generated misinformation and fake news.

Leadership roles for libraries in AI literacy

A principal way that libraries can and are leading is in supporting AI literacy education and training, which many participants described as the newest component of information literacy training. To guide students and researchers, librarians must quickly upskill in order to teach others.

What do libraries require for success?

Through these conversations, I heard participants describe many things that libraries need to successfully move forward. At the most basic level, librarians need access to tools and the time to practice and experiment. Only through these preconditions will librarians gain the content mastery necessary to both serve as campus experts to users and to lead library-based efforts. For example, one participant described how librarians must be familiar with LLM hallucinations, including the creation of fake citations, in order to have the knowledge and confidence to work with patrons using chatbots. Another local need is for more professionals with data analytics skills to be situated in the library, working as part of a cross-functional team, consistent with comments we heard in a previous session about data-driven decision making.

Skills development is independent and ad hoc at this point. Participants want more training guides, external support and sample use cases, and they also want to engage meaningfully with others in communities of practice.

Looking ahead

Word cloud about "What excites you about the future of AI and libraries?", with answers like efficiency, accessibility, and collaboration. Librarians see a hopeful future for libraries and AI.

These small group discussions are valuable for connecting library professionals across many time zones. Some participants reported feeling reassured that others were grappling with the same uncertainties at early stages of discovery and experimentation. Overall, participants reported feeling excited and hopeful about the opportunities for AI to support greater efficiency and time-savings in libraries.

Join us on Thursday, 6 June for the closing plenary event of the OCLC-LIBER Building for the Future series. This session will synthesize the high level takeaways from the previous small group discussions, followed by a panel discussion by library thought leaders, who will respond with their perspectives on how research libraries can collaboratively plan in these challenging times. Registration is free and open to all. I’ll see you then.

[1] Julia Guy et al., “Reference Chatbots in Canadian Academic Libraries,” Information Technology and Libraries 42, no. 4 (December 18, 2023),

The post Imagining library futures using AI and machine learning appeared first on Hanging Together.

AI Sauna quick reflection / Open Knowledge Foundation

During the early days of May, AvoinGLAM hosted a co-creation event AI Sauna that brought practitioners of Open Culture from Wikimedia, Creative Commons, Open Future, Flickr Foundation, Meemoo and others together with representatives from Finnish memory institutions and research projects.

National Archives of Finland, AI Sauna event. By Fuzheado – Own work, CC0.

The event kicked off on Monday morning, the 6th of May, with inspiring talks that led participants from listening to idea creation in the magnificent old reading room of the National Archives of Finland. The speakers brought inspiring perspectives from many directions to the discussion around the impact of AI on shared online culture.

→ If you missed the event, you can still watch the inspire talks and the following panel discussion on the event playlist on AvoinGLAM YouTube channel or read the recap in This Month in GLAM.

Before the hacking/co-creation/brainstorming could begin, we invited all guests to enjoy a sauna and a swim in the 9°C seawater at Allas Sea Pool at the Helsinki harbor.

On the following Tuesday morning, the work started at URBAN3 at Maria01, which is the home base for Open Knowledge Finland, AvoinGLAM and Wikimedia Finland. The roughly 4 hours of work was enough to create a plethora of outstanding projects. 

The stream containing these presentations will be available on the AvoinGLAM YouTube channel at a later point.

Further ideas from the Ideas page:

  • Authorship of political artists and Embodied creative process in the context of glassblowing by Liisi Soroush
  • Hot topics in the Finnish local letters of the 1860s by TuulaP
  • GenAI for Moroccan Arabic by Ideophagous
  • History of the Basque Country in 100 objects
  • Summary of all knowledge by Susanna

The documentation is forever on AI Sauna pages on Wikimedia Meta, so you can ping the creators and continue work on interesting topics. The project ideas can be found on the Project ideas page, and contacts to most of the participants on the People page.

Check out the slides for Monday and Tuesday that are also available, or the image category on Wikimedia Commons.

Let’s bathe on!

Announcing the COLD French Law Dataset / Harvard Library Innovation Lab

COLD French Law Banner

There is a new addition to the Collaborative Open Legal Data collection: a set of over 800,000 articles extracted from the LEGI dataset, one of France’s official open law repositories, that were programmatically identified as “currently applicable French law” by our pipeline.

This dataset—formatted into a single CSV file and openly available on Hugging Face—contains original texts from the LEGI dataset as well as machine-generated French to English translations thanks to the participation of the CoCounsel team at Casetext, part of Thomson Reuters.

COLD French Law was initially compiled to be used in a forthcoming experiment at the Lab. We are releasing it broadly today as part of our commitment to open knowledge. We see this dataset as a contribution to the quickly expanding field of legal AI, and hope it will help researchers, builders, and tinkerers of all kinds in their endeavors.
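As a sketch of what consuming the CSV might look like: the column names below (`article_id`, `texte`, `texte_en`) and sample identifiers are assumptions for illustration, not the dataset's documented schema, and a real load would read the published file with `pandas.read_csv`.

```python
import pandas as pd

# Stand-in rows mimicking the shape of a combined original/translated CSV:
df = pd.DataFrame({
    "article_id": ["LEGIARTI_hypothetical_1", "LEGIARTI_hypothetical_2"],
    "texte":      ["Texte en vigueur...", "Autre article..."],
    "texte_en":   ["Text in force...", "Another article..."],
})

# Example query: keep only rows that carry a machine translation.
translated = df[df["texte_en"].notna()]
print(len(translated))  # 2
```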

The Process

As part of these release notes, we would like to share details about the process used to translate the articles contained in the dataset.

In a field where the volume of data is so important, it’s useful to understand the plausibility of working with a dataset in one language with an LLM trained in another. This process revealed some techniques for not only reliably translating a large set of documents, but also for doing so efficiently. We do not plan to maintain this dataset outside of the needs of our experiments, and are therefore sharing the details of the pipeline so that others may update the data in the future if needed.

Over the course of two months, the CoCounsel team ran all ~800,000 articles through a translation pipeline that translated each individual entry from its original French into English using OpenAI’s GPT-4 large language model. One hurdle was that much of the important metadata for each entry was also in French, and we wanted to retain each article in its fullest form.

Via GPT-4’s function-calling feature, the pipeline was able to translate the full entries, allowing each column of an entry to be translated in a single call (or a couple of calls in the limited cases where entries were longer than 2,500 tokens). This saved weeks of processing. Additionally, this technique outputs an individual JSON file for each law article.

With this approach, we were able to run the pipeline for just a few hours each night, and the structure of the dataset remained intact.

Over the course of this process, adjustments were made to the prompt based on the expertise of the CoCounsel team and on feedback from Timothée Charmeil, an LL.M. candidate at HLS, who quality-tested samples of the initial outputs.

The final prompt that was engineered by our colleagues is shared below.

The Prompt

COLD French Law dataset on Hugging Face

COLD French Law CLI pipeline on Github

See also: COLD Cases Dataset

Empowering Digital Citizenship: Unlocking the Power of Open Knowledge with Participants of the LIFE Legacy / Open Knowledge Foundation

In today’s digital landscape, understanding open knowledge and digital citizenship is crucial for navigating the online world effectively and responsibly. A recent session delved into these vital topics, equipping participants with the knowledge and tools necessary to thrive in the digital age.

The session commenced with an introduction to open knowledge, highlighting its significance in the digital space. Open knowledge refers to the free and unrestricted access to information, ideas, and resources. This concept is essential in promoting collaboration, innovation, and progress.

Maxwell Beganim, lead of Open Knowledge Ghana and coordinator of the Open Knowledge Network Anglophone Africa Hub, facilitated an interactive discussion on digital citizenship, exploring its various elements and internet knowledge. Digital citizenship encompasses the rights, responsibilities, and skills required to navigate the digital world safely and ethically. The discussion covered critical aspects such as online privacy, security, and etiquette, empowering participants to become responsible digital citizens. This work is part of Open Knowledge Ghana’s mandate, aligned with the Open Knowledge Foundation’s vision, to help build a world open by design in which all knowledge is accessible to everyone.

Kiwix Tool: A Game-Changer for Accessing Knowledge

Ruby D. Brown, Project Coordinator at Open Knowledge Ghana, took participants on a journey through the Kiwix tool, demonstrating its usage and importance. Kiwix is an offline Wikipedia reader, providing access to a vast repository of knowledge even without internet connectivity. This tool is particularly valuable for individuals with limited or no internet access, bridging the knowledge gap and promoting digital inclusivity.

Ruby also gave participants an overview of Wikipedia and addressed critical information literacy needs.

The session culminated with participants installing the Kiwix tool on their laptops, ensuring they have a valuable resource at their fingertips. With Kiwix, users can access a vast library of knowledge, including Wikipedia articles, books, and educational resources, even without internet connectivity.

Read more about Kiwix implementation and Environmental sustainability by Maxwell Beganim and Otuo Boakye Akyampong:

Ghanaian Wikimedian empowers students with offline educational app

Beginning in February 2020, Ghanaian Wikimedian Maxwell Beganim and a community volunteer Boakye Otuo Acheampong started using Kiwix and offline Wikipedia

The session successfully empowered participants of the LIFE legacy Project with a deeper understanding of open knowledge and digital citizenship. By embracing these concepts and leveraging tools like Kiwix, individuals can navigate the digital landscape with confidence, responsibility, and a commitment to lifelong learning. As we continue to evolve in the digital age, we must prioritize digital literacy, inclusivity, and access to knowledge, ensuring that everyone can thrive in the online world.

The Life Legacy project in Ghana is by Paradigm Initiative with Internet Society Ghana Chapter as the country implementation partner. LIFE is an acronym for Life Skills, ICTs, Financial Readiness, and Entrepreneurship. The project is aimed at building the capacity of underserved youth in communities. Paradigm Initiative implements this program through its partners in countries across Africa.

Pew Research On Link Rot / David Rosenthal

When Online Content Disappears by Athena Chapekis, Samuel Bestvater, Emma Remy and Gonzalo Rivero reports results from this research:
we collected a random sample of just under 1 million webpages from the archives of Common Crawl, an internet archive service that periodically collects snapshots of the internet as it exists at different points in time. We sampled pages collected by Common Crawl each year from 2013 through 2023 (approximately 90,000 pages per year) and checked to see if those pages still exist today.

We found that 25% of all the pages we collected from 2013 through 2023 were no longer accessible as of October 2023. This figure is the sum of two different types of broken pages: 16% of pages are individually inaccessible but come from an otherwise functional root-level domain; the other 9% are inaccessible because their entire root domain is no longer functional.
Their results are not surprising, but there are a number of surprising things about their report. Below the fold, I explain.

The Web is an evanescent medium. URLs are subject to two kinds of change:
  • Content drift, when a URL resolves to different content than it did previously.
  • Link rot, when a URL no longer resolves.
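Pew’s 25% figure splits into the two failure modes quoted above: pages that fail on an otherwise functional domain (16%) and pages whose entire root domain is gone (9%). A minimal sketch of that distinction (the status-code threshold here is my own simplification, not Pew’s exact methodology):

```python
def classify_url(dns_resolves, status_code=None):
    """Classify a URL along the lines of the Pew breakdown:
    - 'domain_rot': the root domain no longer resolves at all;
    - 'page_rot':   the domain works but the page returns an error;
    - 'accessible': the page still responds successfully.
    A simplification: a real methodology must also handle
    redirects, timeouts, and soft-404s."""
    if not dns_resolves:
        return "domain_rot"
    if status_code is None or status_code >= 400:
        return "page_rot"
    return "accessible"

def summarize(checks):
    """Tally a sample of (dns_resolves, status_code) checks into the
    percentage breakdown Pew reports (25% total = 16% page + 9% domain)."""
    total = len(checks)
    counts = {"accessible": 0, "page_rot": 0, "domain_rot": 0}
    for dns_ok, status in checks:
        counts[classify_url(dns_ok, status)] += 1
    return {k: round(100 * v / total, 1) for k, v in counts.items()}
```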
The Pew team found link rot in Common Crawl's collections:
  • A quarter of all webpages that existed at one point between 2013 and 2023 are no longer accessible, as of October 2023. In most cases, this is because an individual page was deleted or removed on an otherwise functional website.
  • For older content, this trend is even starker. Some 38% of webpages that existed in 2013 are not available today, compared with 8% of pages that existed in 2023.
And in news sites, government sites and Wikipedia:
  • 23% of news webpages contain at least one broken link, as do 21% of webpages from government sites. News sites with a high level of site traffic and those with less are about equally likely to contain broken links. Local-level government webpages (those belonging to city governments) are especially likely to have broken links.
  • 54% of Wikipedia pages contain at least one link in their “References” section that points to a page that no longer exists.
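Measuring the Wikipedia figure amounts to extracting the external URLs from each page’s “References” section and probing them. A rough sketch of the extraction step, using only the standard library (the wikitext handling is deliberately simplified, and the sample page in the usage note is invented):

```python
import re

# External URLs stop at whitespace or wikitext link delimiters.
URL_RE = re.compile(r"https?://[^\s\]|<>\"]+")

def reference_urls(wikitext):
    """Pull external URLs out of a page's References section.
    Very rough: real wikitext link syntax has many more edge cases
    (templates, bare refs, archive links) than this handles."""
    # Keep only the text after the == References == heading, if present.
    parts = re.split(r"==\s*References\s*==", wikitext, maxsplit=1)
    section = parts[1] if len(parts) > 1 else ""
    return URL_RE.findall(section)
```

Each extracted URL would then be checked for link rot, e.g. with a HEAD request, to reproduce the “at least one broken reference” statistic.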
There is a long history of research into both phenomena. Content drift is important to Web search engines: to keep their indexes up to date, they need to re-visit URLs frequently enough to capture changes. Thus studies of content drift started early in the history of the Web, with examples dating from more than two decades ago.

Link rot is a slower process than content drift, so research into it started a bit later, but examples likewise date from more than two decades ago.
I have written about this topic many times, first in 2008's Persistence of Poor Peer Reviewing:
I like to cite an example of really bad reviewing that appeared in AAAS Science in 2003. It was Dellavalle RP, Hester EJ, Heilig LF, Drake AL, Kuntzman JW, Schilling LM: Going, Going, Gone: Lost Internet References. Science 2003, 302:787, a paper about the decay of Internet links. The authors failed to acknowledge that the paper repeated, with smaller samples and somewhat worse techniques, two earlier studies that had been published in Communications of the ACM 9 months before, and in IEEE Computer 32 months before. Neither of these is an obscure journal. It is particularly striking that neither the reviewers nor the editors bothered to feed the keywords from the article abstract into Google; had they done so, they would have found both of these earlier papers at the top of the search results.
The first surprise is that the Pew report lacks any acknowledgement that the transience of Web content is a long-established problem; like Dellavalle et al., it treats the phenomenon as if it were a new revelation.

Even before published research had quantified it, link rot and content drift were well understood and efforts were underway to mitigate them. In 1996 Brewster Kahle had founded the Internet Archive, the first of several archives of the general Web. Two years later, the LOCKSS Program was the first effort to establish a specialized archive for the academic literature. Both were intended to deliver individual pages to users. A decade later, Common Crawl was set up to deliver Web content in bulk to researchers such as the Pew team; it is not intended as a mitigation for link rot or content drift.

Although Common Crawl was a suitable resource for their research, the second surprise is that the Pew report describes and quantifies the problem of link rot, but acknowledges none of the multiple, decades-long efforts to mitigate it by archiving the Web and providing users with preserved copies of individual pages.

#ODDStories 2024 @ Goma, DRC Congo 🇨🇩 / Open Knowledge Foundation

The ongoing war between the M23 rebel group and the regular army in eastern Congo’s war-torn regions is causing massive displacement of people who lack essential aid and basic needs.

On March 7-9, 2024 in the city of Goma (North Kivu province) in the eastern Democratic Republic of the Congo, Media Sensitive to Disasters – MSD Network held an Open Data Day event. MSD is a media network aiming to increase the media’s coverage of risk and disasters.

Entitled “Forced Displacement Open Mapping”, the event was organized amid ongoing tensions: on March 7, 2024, M23 launched two bombs on Mugunga (on the western edge of the city of Goma, up to 5 km from the battlefields).

The overall goal of the event was to identify newly established displaced camps in eastern Congo war-torn regions for humanitarian assistance.

By developing a timeline feature to illustrate the evolution of forced displacement over the course of the ongoing war, and how displacement patterns change in response to recent developments, the project aimed to produce spatial data that helps humanitarian groups, media, local organizations, and Congolese officials design and implement evidence-based interventions for displaced people deprived of basic needs.

Up to 15 participants were involved in the displaced-camp identification and mapping activities across North Kivu province, coordinated by Rachel KIYUNGI and facilitated by Cleophas Byumba, an expert in mapping and fact-checking.

Project’s activities

Displaced camp in the battlefields (March 07-08, 2024)

This activity aimed to identify displaced camps in the FARDC-M23 battlefields, including the surroundings of the capital city of Goma and throughout North Kivu province.

Surveyors drawn from MSD Network members took part in the process. After the activity’s completion, up to 10 cities with large concentrations of displaced people had been identified, including Mweso, Kitchanga, Nyanzale, Rugari, Kanyarutshinya, Rusayo, Kanyabayonga, Sake, Mugunga, Bulengo, and the provincial capital of Goma (which hosts up to 6 displaced camps).

Displaced Camps Open mapping (March, 09, 2024)

This activity aimed to map forced displacement in the FARDC-M23 battlefield while strengthening participants’ capacity in open mapping with uMap (an OpenStreetMap tool). Up to 10 participants took part in the mapping activity. Results included:


  • Up to 30 participants have been directly involved in the event activities,
  • Up to 10,000kms have been covered by the event,
  • Increased knowledge in open mapping of up to 20 participants,
  • Up to 20 cities and villages were added to the map
  • Etc.


The event’s activities were threatened by ongoing hostilities between the Congolese army and the M23 rebels, and by numerous restrictions on the freedom to organize events near the battlefields.

About Open Data Day

Open Data Day (ODD) is an annual celebration of open data all over the world. Groups from many countries create local events on the day where they will use open data in their communities.

As a way to increase the representation of different cultures, since 2023 we offer the opportunity for organisations to host an Open Data Day event on the best date within a one-week period. In 2024, a total of 287 events happened all over the world between March 2nd-8th, in 60+ countries using 15 different languages.

All outputs are open for everyone to use and re-use.

In 2024, Open Data Day was also a part of the HOT OpenSummit ’23-24 initiative, a creative programme of global event collaborations that leverages experience, passion and connection to drive strong networks and collective action across the humanitarian open mapping movement.

For more information, you can reach out to the Open Knowledge Foundation team by email. You can also join the Open Data Day Google Group to ask for advice, share tips, and get connected with others.