Discovery Systems in 2020: Issues and Trends

I think it’s about time to get started. My name is Diane Goldenberg-Hart; I’m with the Coalition for Networked Information (CNI), and you have reached a webinar that is part of CNI’s Spring 2020 Virtual Membership Meeting. We’re so glad that you made some time out of your day to join us here today. Today’s webinar will be a panel discussion including conversation about current trends and issues in discovery systems, talking about various kinds of features, user behaviors, identifying needs by analyzing transaction logs, and also using artificial intelligence and machine learning for discovery. Our panelists will also be talking about the recent OhioLINK and Ithaka white paper on user-centered library systems and the concept of full library discovery. You may have been fortunate enough to catch a talk on that report that we also had at CNI in April, and we will chat out a link to that video if you didn’t get a chance to see that.

Our talk today is entitled “Discovery Systems in 2020: Issues and Trends,” and we’ll be hearing from four speakers: Lorcan Dempsey of OCLC, Tom Cramer of Stanford University, and Bill Mischo and Michael Norman of the University of Illinois at Urbana-Champaign.

Before I hand it over to our speakers, I just want to orient you very quickly to a few features of the webinar environment. One is that we have a Q&A box: if you look at the bottom of your screen, there’s a little button that says Q&A. If you click on that, a box will pop up; you can simply type your questions or your comments in that box at any time, and after our panelists have completed their entire presentation, I’ll come back on to moderate those questions. We also have a chat box; as I alluded to earlier, we’ll be sharing some information with you there, but you should also feel free to use that chat box to communicate with us and with the other attendees on this webinar. So without further ado, I want to thank everyone one more time for being with us here today, and a
special thank-you to our panelists for their presentation today. And with that, it’s over to you, Lorcan.

Thank you very much, Diane. Pleased to be here. When we were talking about this session, Bill suggested that I say a few things, maybe by way of general introduction, but also say a little bit about the BTAA report on operationalizing the collective collection and about full library discovery, as was mentioned. Given that times have changed somewhat since that discussion and we’re in unusual circumstances, it seemed to me that it would be sensible to say a little bit about rediscovering discovery in a changed environment. I’m going to talk about three things very, very briefly; they’re really quite high level, and I think they will impact and change the way we think about discovery over the next while, as they accelerate current trends.

My comments are partly based on three sources that I thought I’d put up very quickly. One is Resource Discovery for the 21st Century Library; I have an introduction in this volume, and it’s coming out next month from Facet Publishing in the UK. Then there is the BTAA report that was mentioned, and then I released a blog entry a couple of days ago talking about the ways in which collections have changed in the current environment: the way in which we think about collections differently, and the way in which the relationship between the library and the collection has changed, as library collecting activity, as it were, has peeled away from the locally managed collection in various ways.

So I’m going to talk about three things, and I’m going to relate each of the three to a pandemic effect. One of the things we’re seeing at the moment is a lot of discussion about what will change, what will persist, what will be accelerated, and what might go away. Three pandemic effects have been very pronounced in the library context. First, obviously, there’s been a forced migration online, but I think the effect of that is that, as people begin to interact with services online, we really begin to think about what a holistic online experience means: what does it mean to provide the full library experience in an online environment? The second thing is really a focus on mission. Universities and colleges are now very focused on their distinctive impact, on where they should be putting emphasis, and on strategic directions when they come out of their current situation. I think for libraries there is going to be an increased focus on alignment with evolving institutional priorities. And this means that they will want to optimize: there will be pressure on budgets, and there will be pressure on institutional alignment, on being seen to contribute to institutional priorities at a critical time, at a difficult time.

So: online, on mission, and optimize. I’m going to say a little bit about a discovery effect of each of those three pandemic effects: a little bit about full library discovery; a little bit about the discoverability of institutional assets in the context of a focus on research; and then, under optimize, stretching a little bit, thinking about D2D, discovery to delivery, but suggesting that increasingly we’re going to think about D2D in
the context of decision support or dashboards. Increasingly we’re going to have data-driven decisions; we’re going to want to think about how to optimize things in the context of data: usage data and traffic that suggest behaviors, choices, and so on. So: discovery to delivery to dashboard.

OK, so first of all, the holistic online experience. Now, I am using PowerPoint in the way it was supposed to be used here, with lots of bullet points; I’ve shifted back to bullet points for this presentation. I think one of the pandemic effects we see is this: the library identity is a strange hybrid between a set of services and an actual building, a physical manifestation, symbolically manifest on campuses, and I think one of the effects of the pandemic will be to make that fully online experience very real, with the experience of the library and the identity of the library manifest in that online experience. This means that, as a new target on the horizon moving forward, we’re going to have more pressure to think about how to deliver fully online, how to deliver the full range and richness of the library experience online. So you have things like consultation and expertise: how do you substitute for the face-to-face interaction that creates the relationships that allow you to develop research support or other services? Clearly there are information integration issues: integration into learning management, interaction, and programming. There is a big focus on personal interaction. We’ve had discussion over the years about customer relationship management systems, but also about profiling, how to get into the user flow, and how to use social more. So I think all of these things, and mixes of these things, are going to become more prevalent as we think about that fully online library experience.

At the same time, we’re seeing a new relationship between the library and collections. Increasingly we’re thinking about
facilitating access to collections that may not be locally owned and curated; we’re thinking about collective access to collections across groups of libraries, across a consortium. Our model of collections was of the careful construction of a locally acquired collection, but even though that is still quite central to library organization and operation, a lot of what we do has moved away from that, because now what we’re really thinking about is how you optimally satisfy research and learning needs from a facilitated network of resources. There are resources that are acquired locally, resources that are collaboratively provided, open resources, and commercial resources. We still provide literature search and so on through that discovery layer, but you don’t own everything in the discovery layer.

Resource guides are a really interesting phenomenon. They’re like tribbles in Star Trek: you looked in the room once and there were two or three of them, and now all libraries have all of these resource guides. This is a signal, if you like, of various things, but one thing is that you’re facilitating access to a range of things that are arrayed around the needs of that particular course, that particular subject, that particular area. Clearly there is a big emphasis on open access, thinking about how to deploy and array open access resources and how to access them more effectively, and on open educational resources. We facilitate access to a whole array of free network resources and try to tie library resources into those.

At the same time, discovery connects to the ways in which we acquire materials. Through demand-driven acquisition we’re offering spot acquisitions, the ability to order a document. Increasingly we’ll see more smart fulfillment around resource sharing: the integration of acquisition, resource sharing, and discovery to develop a richer view onto what is available to the person. You know, we will buy the professor the book from Amazon rather than request it, if it’s not available within a certain amount of time. So this whole set of services is beginning to provide facilitated access to a network of resources. And interestingly, we used to be in a situation where the collection drove
discovery: you had a collection, and you wanted to discover what was in the collection. In a demand-driven or facilitated environment, that is sort of flipped a little bit: discovery, what somebody has access to, sometimes tends to influence or drive the collection. What comes out of this, as an early manifestation, is a focus on full library discovery: thinking about not just access to a literature search, but access to the range of materials that you facilitate access to.

I think over the next while we might see a sort of trend emerge here where you have different levels. Historically, the library provided access to the acquired library collection, what was bought and licensed, and it provided discovery to that. We’re beginning to see, through bento box displays and through a focus on full library discovery (and Bill will talk about this in a while), access to the website, to events, to programs; access to expertise, by pulling up relevant librarians or experts in response to a particular query; and access to a broader facilitated collection. Yes, you have the articles, you have the library catalogue, but you also have a web search, and you have potentially access to Google Scholar and various other things. So we’re beginning to see this broader array of things pulled in. Beyond that, over the horizon again, there is the question of how we think about that full library experience. So I think we’re seeing a move, as the library and library discovery are peeled away from that library collection, to think about this broader array. Currently we’re sort of thinking about an array of services across wider aspects of the library, and I think that trend will continue as we think about that full library experience.

The second thing I said was thinking about the mission. We did some work a while ago with Ithaka S+R where we pulled out a model saying that universities tend to have three poles, three emphases, and these will vary depending on
the institution that you’re in. Clearly there’s a distinctive research focus, with doctoral research and scholarship; then there’s a focus on liberal education, broad undergraduate education; and for many institutions there’s quite a strong focus on preparation for professions, on credentialing, on moving forward. Most institutions will have a combination of these, but they will lean one way or the other. Consortia quite often contain a mix of these: the BTAA less so, because it’s a consortium of peers; OhioLINK very much so. You have Case Western Reserve University and Ohio State, very strong in research and very strong in undergraduate education; you have somewhere like Franklin University, very strong in career preparation; and other institutions very strong in undergraduate education. But universities are going to get much more purposeful, much clearer about where their distinctive value resides, and will sharpen what they do to deliver value in that context.

I think one result of that, in the context of a CNI audience, is thinking about research institutions, and I’m thinking about what has happened over the last while, where research itself has been affected by the pandemic. So here is a strong pandemic effect: the impact on the research culture that we’re experiencing all around us now. There is a temporary ceasing of some types of laboratory research, but at the same time a really big focus on short-circuiting processes and practices to get research outputs out earlier; a big focus on collaboration across disciplines and across institutions; urgency about reporting results; much greater use of open channels; and then concern about assessing validity and relevance, which has really come up in the last week or two. At the same time, we could look at the way in which publishers are making deals for temporary access and various other things that are changing the way in which we think about how research is communicated. Some of that will rebound, some of that might stick, but we’re in this period of questioning. There is also a stronger desire to showcase expertise and potential contributions within the institution. So I think, coming out of this, we’ll see research libraries much more purposefully curating, managing, and making more discoverable research outputs like preprints and research data, and also
becoming maybe more involved in the disclosure, the discoverability, of expertise on campus, in a way that’s already quite common in other parts of the world, and there is a lot of activity in the US as well. We’re very familiar with how this manifests itself, but I think this focus on discoverability of institutional resources will grow. Currently, when we talk about discovery, quite often we talk about discovery of outside-in resources: the ability to find articles, the ability to find books, the ability to find resources more generally. I think one of the things that we’ll see in research institutions now is this focus on discoverability of inside-out resources, discoverability of institutional assets and institutional materials: things like research data, preprints, and the institutional repository.

I quite like the way that the Purdue web page is organized, because it shows very clearly that at the top you have the discovery of materials, and at the bottom you have discoverability of Purdue assets, Purdue intellectual outputs. So you have Purdue e-Pubs, the institutional repository for publications; you have special collections and archives; and then you have PURR, which gives access to datasets. These are all institutional assets that you’re interested in sharing with the world. The dynamic is very different here, because, sure, you want your local population to see these and understand what you have, but from a reputational point of view, from a scholarly point of view, from a dissemination point of view, you want to share these materials with the rest of the world. You want to push them out; you want to make them discoverable. So that inside-out focus is becoming more interesting. At the same time, we have seen quite a few libraries becoming involved in the development of expertise systems on campus. This is the one at Minnesota, where you’re looking at faculty profiles and outputs and sharing those with the world.

Number three: optimize. Everybody is
going to be very focused on optimizing against particular goals, optimizing their collections and their services, and really you have to choose the goals. In a teaching and learning institution, you will be very much optimizing for immediate support for learning, student success, and retention, thinking about that new experience; in some other environments you may be optimizing differently. Just thinking about collections, there’ll be much more optimizing for value, and the discussion with publishers is very interesting in that regard; probably more optimizing for open, certainly for curricular support, with some emphasis on regional and local affairs and a big push towards collaboration. I’d suggest there will be more collaboration, and I think one area that will maybe get a bit more emphasis is pluralizing collections, diversifying collections: representing and respecting communities that are overlooked and ignored in the context of collections that have been developed according to certain characteristics or criteria.

All of this means, though, that there will be an increased emphasis on decision support, because to optimize you need data. You need to understand how things are being used and what’s not used in order to really focus in on making choices. So data is required to support choices.

That leads to what Bill had suggested talking about: the report on operationalizing the collective collection, which we did for the BTAA last year. Really, what we were looking at here was discovery to delivery: the complex array of services within a major library consortium that allows them to share materials across those libraries. Discovery in that context is part of a very complex ecosystem, because Illinois is part of the BTAA, and it’s also part of Illinois infrastructure; Ohio State is in OhioLINK; Rutgers is in PALCI, but potentially also in a variety of other organizations. So we have a very rich, very diverse, very complex ecosystem within which libraries are sharing material and making decisions about their collections. At the same time, these are large, relatively autonomous, self-standing, relatively wealthy institutions that are building large collections, Illinois being a good example. What we recommended at a very high level is that the libraries begin to think about the optimal distribution of collections: how do you begin to manage your collections at the level of the consortium as well as at the level of the institution? That might mean that libraries can specialize, because the network will take care of things they’re not specializing in, or that shared print
facilities are in particular areas, or that you do eventually move to a prospective coordination model where you share interest in subjects. But this optimal distribution of collections needs to be supported by efficient network fulfillment, tying together the various requesting, delivery, and discovery systems that currently exist in a smarter way. And all of this depends on system-wide awareness: on knowing what’s available, knowing where it is, and knowing what terms it’s available under. A lot of the inefficiency of the current system is that you lack forward knowledge of those things; the systems have to go and look, or people have to make joins between systems, so it’s a very fragmented environment.

From a system-wide awareness point of view, what we were saying was that increasingly we will see the need to think about more data-driven decisions, more data-driven systems, more data-driven choices around integrations, and choices around where collections go, the distribution of collections. This will mean greater integration between discovery, resource sharing, and acquisition, because they’re all about making choices about materials. It will mean greater coordination between shared print, digitization, and specialization at individual institutions. All of this depends on better system-wide awareness and better data; it depends on dashboarding, on pooling transaction data, holdings data, and acquisitions data into a way of looking at things. Now, this is very aspirational, it’s on the horizon, but we can see that we’re gradually moving in this direction, where we want to have ways of making decisions about collections that discovery can contribute data to, but which, because we’re in this facilitated environment, might then be influenced by what happens. We can see in the licensing arena Consortia Manager in Rome, from the same company, but with a lot of data about what is being bought; Unsub, recently renamed, with a lot of coverage around that at the moment; we
provide GreenGlass in the monographs area. So I think increasingly we’re going to see dashboards and decision support systems that help manage this increasingly fluid way in which we look at collections. The discovery choices will be made in data-driven environments that they help shape, because the choices that people make in discovery can be factored in (you have downloads, you have the whole way in which they play in this ecosystem), but which in turn they’re shaped by, because you want to offer for discovery things that you recognize are valuable. So in this more collective, more facilitated, more fluid environment, we’re really beginning to see data play a bigger part.

So that was what I wanted to say: rediscovering discovery, three examples of how the current changes may make us think a little bit differently about discovery, very much in a library environment. Thank you.

Thank you very much, Lorcan. I assume everybody can see my screen here. Michael and I are going to talk a little bit about what is really an extension of what we did in 2017 at a CNI briefing on discovery trends. In particular, we’re going to talk about bento systems, what we learned from transaction log analysis, and then some of the other elements that are going into what we’re seeing as really a transformation in discovery. Lorcan touched on a lot of those points, and I think that’s a very good introduction. Again, we’ve also opted here for putting together some fairly dense slides, the idea being that these can be useful for later reference, so let me go through these fairly quickly.

We’re still thinking that library discovery is at a crossroads. Roger Schonfeld wrote a nice briefing in 2014 about academic libraries reconsidering their vision for discovery. Tom and his group have done a lot of work at Stanford, and there’s a quote here from Catherine Coleman about a revolution in discovery; he’s going to talk about that later. And then there is the OhioLINK white paper, which was referenced by Diane, and she also gave you a link to an earlier presentation on it. This is this OhioLINK
manifesto, essentially, which is proposing that we rethink how we’re designing ILS systems and discovery systems: that typically they’ve centered on the collection and not on the user, and that what we need to do is look at systems that are much more user-centered. They talk a lot about some of the things that Lorcan mentioned: full library discovery, inside-out libraries, and providing modern business intelligence to library systems.

Again, I’m going to go over this quickly. Full library discovery is something that a lot of people have tried to incorporate now into discovery systems. We’ve spent a lot of time working on this: moving beyond the retrieval of collection materials and including local information, local services, and local content. In our case we’ve integrated websites, LibGuide information, subject specialist lists, course management content, et cetera, and the goal is really to bundle and interconnect these related information services, this interoperability.

Briefly, historically: there are a few of you who can still remember what we used to call super catalogs, with the loaded abstracting and indexing services. You will remember when we in libraries were loading the BRS software locally and loading indexing services. We moved from there to federated search systems, and then to web-scale discovery systems, which are now used in literally thousands of academic libraries. Web-scale discovery systems are characterized by having the metadata of full-text content aggregated into a single consolidated index, so you’re searching one large system. More recently we’ve seen the introduction of really hybrid bento-style systems that do utilize some broadcast searching or federated searching techniques, but they’re typically done over the top of web-scale discovery systems. Bento systems are characterized by having result displays presented in a zoned or partitioned screen display, with content grouped by type of material. So typically there’s a search for articles, a search for books, a search of the library website, a search for journals by title, et cetera.

There’s a very rich literature on web-scale discovery services; there’s a nice bibliography from the University of Liège. The literature centers around a number of issues that are connected with web-scale discovery services. One is a general confusion with blended result displays. A lot of people have moved into bento displays or bento-style systems basically in reaction to users’ concerns with seeing displays that blend book results, monographic results, journal article results, dissertations, and newspaper articles all into one result display. This affects known-item retrieval and the relevancy rankings: you might be doing a known-item search, and it would be on the second or third page of a blended result display in a web-scale discovery system. We are also seeing concerns about the lack of full library discovery, the lack of access to local services, and concerns about better addressing known-item searching.

The advantage of a bento-type system, again, is that it does partition results by material type and format. One of the things that we and others have found is that a large percentage of the searches that are done by users are known-item searches. Bento systems address the web-scale discovery issues of blended results, relevancy ranking, and known-item access; they incorporate full library discovery features; and they are able to provide nice one-click links out to full text. This is something that a number of us have been working on, to try to expedite full-text retrieval and bypass the link resolver.

Here is an example of our page, and we’re going to talk about some of the specifics here; I can show a slide of this also. If you look at the results in the articles pane on the upper left-hand side, you’ll see links to open access
articles, links to a table of contents, PDF links, and links to article data. If you look at article number two on the upper left, you’ll see links to datasets. These are all done by essentially integrating a number of what are basically siloed services into the bento-style display. And here’s a sort of model of our display and the elements in the bento display: we have a suggestion box and articles on the left, catalog items in the center, and subject suggestions on the right, and we have a place to do some advertising. This all makes up our bento-style display.

Now, features in our bento system. We provide a lot of context-specific adaptive search assistance: spelling suggestions, links to LibGuides, direct links for frequently performed searches, and limit suggestions. We identify DOIs, so that if someone types in a DOI we put up a link directly to the dx.doi.org or doi.org site. There’s a link to our Ask a Librarian online chat, journal title links, and direct links to PDFs if available: DOI, publisher, OpenURL, and custom value-added links; the next slide talks about that. We also try to recommend several relevant subject A&I services and provide links to those that, when clicked on, open up at the point of a completed search. And then there are librarian and departmental library subject contacts. Again, these are all following the philosophy of providing full library discovery.

Specifically, in terms of our system, we’ve added a number of value-added links over the top of the article APIs. We use the EBSCO EDS and Scopus APIs for article results. We take these results, take the DOIs out of these results, and then go out and provide links: clickable Altmetric badges to give the Altmetric attention scores; using Scholix, we pull out the dataset and article data links; Unpaywall pulls out the open access links; BrowZine pulls out the direct PDF links and the issue table-of-contents links. The PDF links complement the links we get from the EBSCO Discovery Service.

Here’s a list, and I don’t expect you to memorize this, of the bento libraries that we’re following; go back and refer to this later if you want to look at some of the examples. We’re looking at about 42 different bento academic libraries right now, and interestingly enough, we’ve found in the last year that about 10 libraries have dropped the bento approach, so this used to be a figure that was over 50 libraries. Some of those are Primo installations, so we’re going to talk a little bit about the problems with the Primo APIs later. But we have a nice spreadsheet that characterizes the feature sets of all of these bento-style instances. They all have the books and articles areas, so everybody is recommending monographs from the online catalog and articles from, typically, a web-scale discovery service. As for the web-scale discovery services that are being used for articles, you’ll see that a good number of them are using Summon, but a number of them are using Primo and EBSCO Discovery Service. In addition to the articles and books, we’re finding that website search is probably the most popular in terms of what else is being provided: there are 34 of the 42 providing that. Research guides, some 24; then journal title links or journal title searches, databases, digital collections, and the institutional repository. Contacts, which is something that a number of us are providing, has grown dramatically; there are about 18 libraries providing them.

So, in terms of observations about these bento systems: feature sets vary. A lot of bento versions do not do spell checking and do not provide top-level direct links. Spell checking turns out to be critically important, as you’ll see later when we look at an analysis of our click-throughs. Only three employ the one-click to full text without
going to the link resolver option, which our users at Illinois find very, very useful. The OPAC varies: sometimes it’s a separate application like VuFind or Blacklight, and sometimes it’s part of the web-scale discovery service. So we are seeing systems where the catalog results are from the web-scale service, the article results are from the web-scale service, and perhaps the archives and manuscript results or digital collections might also be from the web-scale service, but they’re just being separated into windows. Bento provides a lot of local control and customization; you can see that by looking at the various options that people have been using. But it does require programming, service staff, maintenance, and a fairly significant amount of work to locally maintain a particular system. There are a couple of systems now that are being used in more than one institution, but a lot of the systems are still homegrown.

We’ll talk a little bit about custom logs. We think it’s very important to look at users’ search behaviors and what we learn from them. We have a very heavily instrumented transaction log program: it records all user actions, the suggestions the system makes, and all the search reformulations, identifies sessions, and records all the click-throughs. Click-throughs are actually routed through one of our websites, where they’re recorded and then redirected, so we know about click-throughs into external resources. We have a lot of transaction logs, going back something like 11 or 12 years. This study goes up to April 2018, where we looked at a billion and a half searches and a million and a half click-throughs, and then took out a sample of about 5,400 searches. We redid the searches and analyzed these for types of search, success rate, and particular user behaviors.

Two or three important points here. One of the things that we’ve seen is that the average number of words per query is going up dramatically, so we’re now at 6.1 words per query. There’s a lot of
copy-and-paste searching, where people are taking results. We also know this from our focus group interviews and from the user survey we did last year, where people are searching Google or Google Scholar, pulling out references, and pasting them into our system. There are only a little over two searches per session; 60% of the sessions are one search. If we look at the use of our suggestions, about 20% of the searches have suggestions made, and in almost a third of those the person follows a suggestion, particularly the "did you mean" spelling suggestions and direct links. We're seeing a lot of DOI searches. This has been growing over the years; this is the Sci-Hub phenomenon, where people love to put a DOI into a system and pull up the full text.

The other really important point is that in the sample of these 5,400 searches, we found that about 64% of them now, which is up from the last time we looked at this, are known-item searches. Often these are title-word searches, or author and a couple of words from the title, and in many cases a full citation. In fact, when we looked at the sample, a percent and a half of these searches were full citations, which isn't a lot of the sample, but this extrapolates to 245 per day where people are literally copying and pasting a full citation into our system.

If we look at the usage within the bento, article links are 57% or 58% of the usage; books and monographs, the OPAC, less; and there is a lot of use of our suggestions and added links. Even though some things, like the librarian links, the contacts, are one tenth of one percent, it still means it's happening more than once a day. In fact, if you look at the click-through actions here for last month, the month of April: 141,000 searches, and 3,700 or almost 4,000 clicks per day. Full-text clicks are 1,600 per day, with a lot of clicks into BrowZine and a lot of clicks into our Bing API results. The "did you mean" spelling suggestion is 55 per day, so for those systems that don't offer spelling suggestions or don't offer "did you mean", you can see how heavily that's used: literally 55 times a day in our system. The open access links:
20 per day. This has been increasing since we started doing more teaching about that at Illinois. The direct link suggestions, which are for the commonly frequent searches, are about 20 per day; the A-to-Z journal list, about 20 a day; Ask a Librarian, four times a day; and again, even emailing a subject librarian, twice per day. We have a lot of library services that are not used every day, so a service used twice a day we would consider successful.

We did an Easy Search user survey in November 2018 and got 483 responses, a good number of them with comments. 24% of the respondents were daily users of the system, and 40% of them were daily or weekly users. We got a lot of nice suggestions, and there was a high level of satisfaction with the entire bento approach.

But I put up one slide here, which is a question about discovery in general, and I think a lot of the literature kind of centers around this. It has to do with whether or not the library should even be the starting point for users seeking content. There's a nice Ithaka S+R survey question on how important the gateway function is, which dipped over the years and now has gone back up. So there's a growing consensus, I think, among users that the library is in fact a valid starting point. You might argue that library systems have always played a supplementary role, and that's true in many cases. You might also argue, and a lot of people have, that our focus should be on known-item discovery. We've done an awful lot of that in our system, and you can see, with the known-item searches at 64%, the importance of providing access for known-item searches.

There are a number of plans here for next steps within bento systems; you can take a look at this. There are a lot of mega-indexes and specialty discovery services; these are Dimensions and Lens. Tom is going to talk about machine learning and AI techniques. We still have a lot of complementary digital services, such as open data, data management services, publication metrics, visualizations, course management content, faculty
profile systems, that we can now add in. I'm going to turn this over to Michael now, who is going to talk a little bit about our Primo implementation.

Thank you, Bill. Hopefully you can hear me; I was on mute there. But yeah, Bill and I thought it would be a good idea to just give you a little introduction to the implementation that we're going through for Primo. In June 2020 we're going to be migrating to Ex Libris's Alma and Primo VE systems, and that's really a new deployment model that combines the back-end processes of both Alma and Primo into one integrated platform. Currently we work with separate systems: we're with Voyager, and our catalog discovery is VuFind. So that will be a new process for us, really going into these combined back-end processes and then, you know, almost in real time, having that reflected in the Primo VE library catalog. We're transitioning with 90 other libraries in the I-Share consortium.

So, what kind of systems are Alma and Primo VE? Alma really is a unified resource management system that allows libraries to manage their print, but really what we're looking forward to is help managing our electronic resources and services in a single environment. That's going to be a big change for us, so we're looking forward to that. For an electronic title, just a reminder to everybody, Alma really creates an electronic inventory, so a portfolio, that then permeates Alma and associates the electronic access in Primo. So in the Primo catalog, in all instances, it really can match on identifiers.

Alma also has a network zone, which we're really getting accustomed to, that takes over the function of the union catalog. We've encountered some problems with the network zone within the consortium. The network zone is really built on a first-in premise: the first copy that comes in to the catalog, the union catalog, this network zone, really becomes the master record for the consortium. This master record is a shared bibliographic record that is linked to our local holdings, or the local holdings of each of the libraries that have that title, that material. And much of our local data, and we do have a lot of it, particularly in our rare books and special collections materials, really didn't
migrate over from Voyager into that master record, and so we've been doing a lot of customization work to reintroduce that local information. There are ways to do it, but we're having to put some time and effort into that. And then one of the big issues that we've encountered, and Bill can talk more about this, is really these localized URLs to e-resources populating the master record. This would be a link to an electronic resource that one of the other I-Share libraries may have; it then pops into the master record, but we may not have access to it, or it has their proxy appended to it. So we've run into some issues with that.

Just importantly, even though we do have access to Primo through the consortium (we actually had access to Primo six or seven years ago; we had a pilot where we had access to it for about three years, and then we went away from it, and now we've come back to it), we do want to emphasize that the Easy Search bento will remain the library's primary and default discovery service, available basically in the single search box on the library web page. Primo will replace VuFind as the catalog search; we'll talk a little bit more about how we're doing that in a later slide. But we are really testing certain features of the Primo Central Index, particularly the ProQuest collections. When we had Primo previously, six or seven years ago, we did not have access to the ProQuest collections, a lot of newspaper collections, at that time, and now we do, so we're testing a lot of those features with the Primo Central Index. And we do still continue to see the benefits of these separate bento zones.

Here Bill has got the image of our library's front gateway. So there's the single search box; when you do a search within it, you get results. And again, as Bill was showing earlier and Lorcan was showing earlier, you've got your articles section there, and in the middle there is our library catalog, and so here in a
few weeks Primo will be populating that. And then one of the things we're looking forward to, that we've been testing and may be able to offer since we do have access to the Primo Central Index, the Primo Central Discovery Index, is newspaper articles. One of the comments from some of our users, as Bill was doing some of these surveys, is that newspapers are one of the things that maybe they'd like to see emphasized a little better, and so we do have access to that.

What we've run into that's really painful right now, that we're trying to work through, is these API issues. Bill mentioned this, and I mentioned it a little earlier: we are currently using the Primo Search API to pull results from the library catalog into Easy Search. We were very pleased with how quickly results came in from VuFind; many times it's less than a second to pull in those results, and it averages maybe one to two seconds response time. With Primo, we're encountering really some slowness in the performance of those APIs and of the results coming back through those API calls. It's really averaging about ten to eleven seconds per search, and we were getting a lot of comments from testers about just how slow that is. So API performance for the Primo catalog definitely needs to be optimized and improved to gain the full benefits we have with Easy Search right now. We've been working with Ex Libris, and we're hoping that maybe some improvements are coming here in a few weeks, but those APIs are just really critical. You saw all the APIs that Bill mentioned that he's pulled into the articles, being able to pull in some of this other information. So I think that's the last slide there, and then we can turn it over to Tom.

Yeah. One second, to find the right screen to share. All right, that appears to be working correctly on my end, so I'm going to proceed. Hello, everyone. It's a pleasure to almost be in San Diego with you; I kind of wish I were there right now after two and a half months at home. I'm going to talk a little bit about a different facet of discovery, which is some of the things that seem to be emerging right now on the leading edge, in some cases the bleeding edge, but which may soon become core parts of all of our discovery environments in
one way or another.

The first of them is linked data. I used to do a lot of work with Dean Krafft of Cornell, and he had this beautiful slide showing Eden and the Tower of Babel, and how linked data would be able to basically be a Babel fish and allow us to work collectively across all of the different schemas, vocabularies, ontologies, and domains where we have our data. I think we still haven't achieved that, but we are making some progress. One of the areas where I think the progress is most notable, and where linked data is most important, is when linked data serves as the bridge or the gateway between library data and our ways of representing knowledge and resources, and that of the web in general. We really look at this in two ways: getting library data out onto the web for discovery, reuse, and linking, but also pulling data from the wider web into our environments.

As beautiful and as much fun as it is to talk about ontologies and RDF and the Semantic Web, really where most libraries are interested in linked data is because it's going to enhance discovery. So through the LD4P and the LD4L projects, the series of projects that have been funded by the Mellon Foundation, in partnership primarily between Cornell, Stanford, Harvard University, and the University of Iowa, we've been focusing most recently on seeing if we can augment existing discovery environments with linked data features. One thing that we do know about linked data is that it's not going to appear quickly, and it's not going to appear magically, where one day everyone is using MARC-based systems in the current environment and then the next day we'll be in this new, wonderful, and somehow different world where everyone is using different interfaces, different systems, and different feeds. So we've focused, as part of the LD4P grants and the LD4 community, on trying to do incremental enhancements to existing discovery environments, and we've identified five areas where we think
this might be most fruitful. Those are represented here, and I'll give you a sample for each one of these, and where I think we're seeing some progress and where we're seeing some challenges.

The first represents a way to get a lot of library data out onto the web in general. In many cases our catalogs are being indexed by Google and other harvesters and search engines. One way to accelerate that is by doing better and more rigorous schema markup, so schema.org markup in the catalogs. This exposes the data to harvesters, where it can be incorporated into the web of data and then eventually emerge through search engine optimization and higher search rankings. We have, over the course of the LD4P project, seen some progress on this. We're by and large focusing on Blacklight as an open source application, using Solr underneath, for a couple of reasons. One is that lots of the LD4P institutions are using Blacklight. Two, it's open source and uses common technologies, so even if you're using VuFind or a commercial search engine, a lot of the lessons and techniques are portable.

The second area, and this really exemplifies bringing external data from the web into library discovery environments, is knowledge panels. I think people are generally familiar with this from Google, and we're now beginning to see this more and more commonly within library discovery interfaces. I believe it's now a feature that is included in at least some Primo instances. This is a great example from the University of Wisconsin-Madison, which has a home-built discovery environment where they put a lot of work into integrating the knowledge panel but also into building a service around it, including what the ethical and service considerations are for where they find that data.

We're doing more work and giving more consideration to how to get better forms of browse. I think it's telling that most browse interfaces from this decade look like they may have been coded or designed ten years ago or even 20 years ago. Spatial browse seems to be the one exception to that, and we're seeing some good breakthroughs there.

Then semantic search: so instead of searching just on
the text, can you search on the meaning of the text? So if I search for heart attack, could I actually get search results for myocardial infarctions? The best example of this is using MeSH from the National Library of Medicine and searching there. This has yet to become a common technology or a common appearance in most library search engines, but perhaps this is in our future.

And then finally, as Bill was suggesting, and I was really glad to see this: type-ahead, autosuggest, and spell-check suggestions really have a chance of helping people with a known-article search but also with general browse, and progress on this, especially semantically aware progress, would be a great advance. We've seen some very good work on this from the University of Ghent in Belgium, recently presented at a Blacklight linked data workshop that we held at Stanford last September and October.

Of all of these techniques, the one that seems the most relevant and the most visible is the knowledge panel. At the same workshop that I just referenced, about 25 people came together and began to crowdsource a document on how to add a knowledge panel to your existing discovery environment. This document is linked; you can see the bitly at the bottom, so bitly ld4-KP-recipe. It's written as a how-to document. It's currently a draft, but it covers everything from what a knowledge panel is and why you might want to use it, to identifying data sources, considering the minimum amount of data that you need to have a quality display, and then the technical strategies for actually implementing one.

If you are interested in any of these techniques and any of these advances, or if you have your own advances that are working with linked data, we would welcome you in the LD4 Discovery Affinity Group. This is a set of biweekly calls that are open to anyone. There's an extensive knowledge base that has been built up over about the last, it must be eight months or one year at this point, and the co-chairs are Huda Khan from Cornell University
and Jessie Keck from Stanford. You can see the link at the bottom, or if you do a search on "LD4 Discovery Affinity Group", it is, not surprisingly, the first result that you find.

The other area that I would like to forecast just briefly is artificial intelligence and its potential to impact library discovery. I believe there are two big opportunities emerging specific to discovery. Now, there are a lot of places and a lot of ways that artificial intelligence is going to affect, and has already affected, the library, library services, and the information environment. This includes back-of-house operations, this includes digital curation, this includes things like chatbots and reference services. But specifically on discovery, I think we're seeing that artificial intelligence just changes the equation in terms of the kinds of metadata that are possible to derive and produce, and it also can set up new opportunities around new interfaces.

So artificial intelligence is something that might scale in a way that our libraries' technical services departments, and even the networks of data exchange, have been unable to do, or at least not at the same pace, at the same scale, and at the same expanse. Looking at the common techniques that are really emerging as, now, not even state of the art but very commonplace in many different industries: the ability to process lots of different formats of information, whether it's text, images, or time-based media, means we're really able to generate lots of what looks like traditional metadata. So this could be techniques for named entity recognition or text classification for texts; for images, doing aboutness or labeling for descriptive metadata generation, or recognizing objects that are within an image, which might be for description, or just extracting text, like better OCR. For time-based media, speech-to-text is really a game changer, not only in accessibility but just in overall discovery, and there are lots of examples where structural analysis of things like video allows people to pick apart these complex time-based objects and get just the captions or just the segments that are interesting. I think all of these techniques are going to become standard for library processing in the not too distant future, I think actually within the next five years, if not 10.

The second opportunity that we're really seeing emerge with artificial intelligence is new interfaces. I think one of the best examples of this, and there are actually some academic ones that
I could cite, but none that I could find right before this presentation running in production, is Google, one that you might be familiar with, and its reverse image search. So by entering an image, can you find other images that are like this? Here's an example of a killer rabbit from medieval times. If you paste this into Google Images, you'll actually find lots of different images that are like this. Now, this can be done completely without any text and without any descriptive metadata. It's a potentially revolutionary approach.

We also see different types of interfaces for different types of recognition. This is an example of a knowledge graph exploration from Yewno, which might be familiar to people in this audience. This is actually interesting because there are two ways to enter it. One is by keyword search of a concept, but another is by uploading or examining existing documents. Yewno will understand what the concepts are within that document and then draw links to other documents and other concepts within this environment. It's a completely different kind of discovery and search, very different from known-item searching or even keyword searching.

At Stanford, I must admit, we are still getting our feet under us for AI, and I think the exciting thing is that's probably true almost everywhere. Even for people who are far ahead of us, we're still at the very beginning of a revolution. We have two sets of projects going. One is around theses and dissertations: can we do more and richer descriptive metadata and descriptive work on these to advance discovery? We happen to have the full text for the theses and dissertations that were deposited electronically at Stanford, so this is a rich proving ground, and currently we're looking at comparing multiple different models and seeing how we might expose the richer metadata through various interfaces. We're doing the same thing with images, with image labeling, with object recognition, and with image-based search. So my
colleague at Stanford, Claudia Engel, has done a great case study comparing what happens when you use commercial machine learning engines, Clarifai and Google's Cloud Vision and AutoML, and how good these are for academic purposes. At the bottom of this you can see some of the results. They've been working with basically an underdescribed set of 50,000 images taken from an archaeological dig over the last 20 years.

As I said, I think we are all collectively at the beginning of this, and there's a real opportunity for libraries, archives, and museums to better understand and to build our own capacity for leveraging artificial intelligence for discovery, but for other areas as well. With the National Library of Norway, the British Library, the Smithsonian Institution, and the Bibliothèque nationale de France, Stanford is helping form an open community, which looks very similar to others in the space for things like IIIF and is in fact patterned after it: can we collectively work together to understand what the use cases are, and build common technologies, models, and capacity? So if anyone is interested or is already active in artificial intelligence, I invite you to start participating in this as well. And with that I will end and see if there is any time for Q&A and discussion, I think for any of the panelists. Thanks.

Thanks so much, Tom, and thank you to all of our panelists. What a wonderful, sweeping overview of discovery systems, the demands on these systems, and the potential. Really extraordinary. Thank you so much for that great collection of presentations. Given the hour, I'm mindful of folks' time, and we already do have some questions, so let me just dive right in, beginning with Rob Cartolano, who comments regarding Primo: "With 10 to 11 seconds per search? What are reasonable performance expectations for a search? Shouldn't it be sub-second response for API queries, as we already have today via Solr?"

Yes, that's what I would anticipate. Bill can probably talk better on this than I can, but yeah, we did a little study to get that information back to Ex Libris about API calls and just the speed at which those were coming back, and so that's what the time is on those. They've actually done some modifications to the Primo Search API to see if they can speed that up, and we're also going to be on a production server farm here in a few weeks, so we're hoping that that might speed up some of the activity too. But Bill, you would know more about the performance of APIs than I would.

In a typical search for us, we will use an EBSCO API, Scopus, Unpaywall, Scholix, Altmetric, and, thinking of the other ones, BrowZine and Bing. So we'll send out 10 or 15 asynchronous searches at one time, and we get things back in a matter of seconds. VuFind is the other very fast one. So yeah, with the response times that we're seeing, you know, anything above four seconds is really unacceptable, or unusual.

Thank you. Thanks very much. Thanks for that question, Rob. All right, the next question comes from Stephen Bell: "When you talk about AI and discovery, could that include voice bots that can search the discovery layer and return results on a screen or send them to an email, and work with common search assistants people have on their phones, in homes, etc.?"

Absolutely. I'll answer that, though others might want to jump in as well. At the Fantastic Futures conferences that we've held for the last two years, much of the activity around AI has been around natural language understanding and conversational agents, so clearly this is a different modality for conducting queries. One of the interesting things to me is the way AI stacks upon itself. Karen Cariani from WGBH has been very active in using AI for better understanding of and access to videos, and what she and her colleagues at Brandeis University have demonstrated is that actually deeply parsing a video is about 9 or 11 different functions that stack on top of each other. So you do segment analysis and find out where the captions are, you extract the captions, you do the OCR on the captions, and then you do entity extraction on the extracted text. So I think conversational agents are just one more link in this chain.

Interesting. I'd also add that Eric Frierson from EBSCO EDS does a demo of the use of Alexa with EBSCO EDS for voice search, where the results are spoken back to you.

Interesting. So it's a very rich area. Lorcan, you seem to be on mute, if you're talking.

Well, we're all experimenting with Alexa, certainly. I mean, there still is a gap, just thinking
from my own experience, you know, asking for Irish names or various other things. But it's definitely worth exploration.

Yeah, interesting. Thanks, Stephen. Now another question: "Thanks for the wonderful and rich presentation. The complexity of the sources that need to be managed is unbelievable. I don't mean this question to come across as a kind of 'well, what have you done for me lately' question, but many libraries are struggling also to integrate museum objects into their discovery systems. Can someone comment on that aspect of discovery?" Does someone want to take that one?

I can. We've looked at doing this with the Archaeology Center at Stanford and also the museum. One of the challenges is just the differences in the level of description and also the schema. The interfaces are hard, and what the indices expect and what the patrons might expect, given the current layouts for library-based environments, are very challenging. It seems to me that bento actually could be a good approach to this, and also approaches like knowledge panels, or hyperlinks that link out to other types of environments, are potentially successful ways of dealing with this.

We've been part of an immersive scholarship grant that was awarded to North Carolina State University. We've been looking at virtual reality implementations and, I guess, a lot of different emerging immersive technologies. These are all very fertile areas, and at some point we need to try to integrate some of these things, moving forward on a lot of fronts.

I just thought I would put that question out to our attendees as well. If there's anyone in the audience who has any experience with this or thoughts about it, if you just raise your hand I can turn your microphone on and you can participate live here in the conversation, since we still do have quite a large audience. But that's a great question, and thank you so much for bringing that to our panel today. And just to relay from Rob Cartolano, who asked about the search
times: he just thanks you for your responses, and his comment is that these were all great presentations. So thank you so much. And just to reiterate what I said... I'm sorry, Tom, go ahead.

I was just going to say, if there aren't other questions, I was wondering if I could ask Bill one of the things I'm curious about: has your research uncovered where library patrons are typically starting their search? The library home page, the bento, or the catalog? I don't know if you have any data on that.

Yeah, in fact that's interesting, because we just had a recent discussion about this and we kind of looked at the literature. The Ithaka S+R survey asks people where they start their searches. There have been a couple of other large surveys of users asking, you know, do you start your search at an A&I service, or at the catalog, or at a search engine, or in the library? And of course there are varied answers, and people are starting their searches in all these different places. Some of this, for us, comes down to this idea that we want to be able to provide the best delivery services we can. So, you know, we know that people are starting at Google or Google Scholar. If they're coming in from off campus, which now everybody is, and they don't have the proxy link in front of the Google Scholar search (and in fact it's impossible to even do this within Google, to put a proxy in front), they're being asked to pay for articles, they're being asked to log in to articles. So we're seeing that a lot of people are doing this copy-and-paste searching, where they're literally taking something from Google or Google Scholar and pasting it into our system, and we're really encouraging that. We're hearing that in user surveys; we saw that in the survey we did in November. And again, I think better integration of all these different silos is what we're all trying to do.

Yeah, I mean, that's the Holy Grail. The museum question was interesting. Just going back to Tom's answer and some of what Bill was saying, it seems to me that, you know, for a long time we were very obsessed with Google-like searches, but even Google doesn't do Google-like searches anymore. You do a Google search and you get back a river of results; I mean, there's a load of advertising at the top, but, you know, they have the knowledge card; if there are scholarly articles, they pull them out; if there are images, they pull them out; if there's news, they pull that out. So effectively they're giving you a sort of bento-style result without the boxes. So, as I say, even Google doesn't do a Google-like search anymore.

So I think, you know, on Tom's point about library, archive, and museum shared searching, cross-domain searching as I called it in a previous life: it has been a big aspiration for many years, and because of the different curatorial traditions, the different metadata traditions, and the different orientations of the services, it's difficult to pull together. But I do think that sort of more bento-style approach might be interesting there, and also linked data. I mean, one of the things we're doing, and we're working with Tom and his colleagues in this context, is, with Mellon support, developing entity backbones. I think over time, if we have persistent identifiers for people, for places, for a variety of things, and we begin to share that infrastructure and have those identifiers in our descriptions in our discovery layers, it gives you a way of connecting things together at some level. But it's still some way out in the future. Interfaces then can take advantage of some of those ways of doing things, where different contexts or domains can link to the same entity infrastructure for particular shared entities. I think in the future that's something that we should see more often,
and we see it happening already with Wikidata and so on.

Lisa has a good point here in the chat, that the "where do you start" question differs a little bit depending on whether you're doing a known-item search or topical searching. And the relevancy ranking of these services, I think, is really critically important. For classes I teach, I typically do a demo of Google Scholar: I pull it up and do a search for the term "federated search", and what I find typically is that in the first handful of results is an article that I co-wrote in 1999. Well, if I was going to give people information on the topic of federated search, a 1999 article is not going to be very useful; I've even written 20 articles since then. So we do a lot of analysis of relevancy ranking in these web-scale systems and in A&I services, and I think that's a really critical element.

Thank you. Thank you all for your thoughts. Thanks, Lisa, for that comment. If we have any other questions, please feel free to type those in, and I'm inviting any attendees who wish to to stick around and have a chat. With that, final sincere thanks to our panelists for sharing your time, and to our attendees for making time to be with us here today. So thank you, everyone. Thanks to everyone for attending. Thank you.