“The barriers to website and social media archiving solutions in the heritage, museums and higher education markets are limited budget, time constraints and lack of resource.” Paula Keogh, Arkivum.
Following the recent announcement of the MirrorWeb and Arkivum partnership that delivers a hybrid, end-to-end digital archiving solution, we sat down with both organisations to discuss the barriers to website and social media archiving solutions in the heritage, museums and higher education markets.
Joining us were MirrorWeb’s CEO, David Clee, along with Arkivum’s Head of Marketing, Becks Hicks, and VP in Higher Education, Archives, Libraries & Heritage, Paula Keogh. We would also like to thank the DPC and Sean Rippington, Digital Archives Officer at the University of St Andrews, for their support and contributions to this article.
Read on to learn more about the seven barriers to website and social media archiving solutions in the heritage, museums and higher education markets.
1. Lack of an Archiving Solution
A lack of available solutions has made website and social media archiving difficult for the heritage, museums and higher education markets, as highlighted by Arkivum’s Paula Keogh: “I’ve heard stories of librarians having to archive their tweets by copying and pasting them into their library management system, and then reference them back on their online catalogue. Because what else could you do when Twitter first emerged, when there wasn’t really a solution to know about, to hear about, and to implement?”
The lack of available digital archiving solutions has been influenced by a community in which there have been many discussions about problems but a lack of tangible solutions. “The danger, and I’ve heard this from archivists, IT managers and record managers, is that the conversation only gets as far as talking. It’s easy, especially in this market, to talk about problems and to identify problems, even to analyse some of the problems, without knowing quite how to solve them,” says Arkivum’s Paula Keogh.
And it is only recently that state-of-the-art, secure and cloud-native web and social media archiving solutions have become available in the market - with MirrorWeb and Arkivum’s hybrid, end-to-end digital archiving solution now the most comprehensive data lifecycle management solution and service available for the heritage, museums and higher education markets.
2. Lack of Awareness of Archiving Solutions
There are emerging solutions, thanks to partnerships like MirrorWeb and Arkivum’s, but there remains a lack of knowledge about how to get website or social media archiving solutions. As Arkivum’s Paula Keogh states, “When we’re talking about something as new in archival reference terms as social media and web archiving, people don’t think there is a solution.”
MirrorWeb’s David Clee confirms this: “From the ARA Conference, something apparent was that archivists didn’t know it was possible to archive Instagram, or Facebook, or Twitter, and to capture the metadata around each of those platforms and each of those interactions.”
He adds that MirrorWeb and Arkivum’s hybrid, end-to-end digital archiving solution makes this easier than ever: “Whether it’s websites, blogs, surveys, behind a firewall, social media, Instagram, or even Snapchat that lasts a matter of minutes, we can capture it, and this will make up the rich tapestry of history.”
To try the hybrid, end-to-end digital archiving solution, and see how it can benefit your organisation, request a demo.
3. Meeting Big Data Requirements
Organisations in the heritage, museums and higher education markets are complex and information-driven, and they hold large volumes of research data. A common concern is therefore whether a digital archiving solution can meet their big data requirements.
Sean Rippington from the University of St Andrews raises common questions within such organisations. These include whether available solutions can cope with large websites, how regularly crawls can run, what storage will cost, and whether the archived data can then be made available for practical use.
This is demonstrated in the museum sector. “The thing everybody’s talking about in the museum sector is huge image libraries that are massively difficult to manage,” says MirrorWeb’s Phil Ogden, as he goes on to outline the solution: “To me, that needs a cloud solution, and that’s basically where you need to be, to make them indexed, to make them searchable, to make them useful.”
We have found cloud-native archiving to be a scalable, cost-effective solution for our customers. No matter the size of your website or social media accounts, whether you are The National Archives with over 5,000 websites and an archive of 120 TB or have a small website or social account, we can meet your data requirements.
4. Lack of Budget and Resources
Arkivum’s Paula Keogh says that, amongst organisations in the heritage, museums and higher education markets, “The initial reaction is that web and social archiving is hard, that this is expensive, that this is going to need external funding and an expert.”
It is an assumption held by many organisations, but cost will depend on the size of the dataset (e.g. a single website or multiple websites) and the frequency of archiving. For example, the cost for financial clients is likely to be higher, as they have to archive every day. If archiving takes place once a month or once a year, depending on your requirements, the price will be lower.
The availability of scalable, cloud-native archiving solutions means organisations can now reduce the overheads needed to maintain infrastructure, including outgoings on space for local servers, power and hardware upgrade cycles, and can cut costs by meeting storage requirements on a project-by-project basis.
A lack of budget, however, can also mean a lack of capacity to identify and engage with digital archiving services. This is why MirrorWeb now provide the option for users to get research access to the hybrid, end-to-end digital archiving solution, giving organisations the opportunity to get ‘under the bonnet’ and see how easy it is to use.
5. Convincing Senior Management
“The people with the budget and the problem is the IT Director, the information governance officer, the data protection officer, the freedom of information officer, and the board of trustees. It’s those people with the pain problem,” says Arkivum’s Paula Keogh.
As highlighted by Sean Rippington, Digital Archives Officer at the University of St Andrews, it can be difficult to raise awareness of the need for web and social archiving amongst people in such senior management positions. However, work is being done in the community to support practitioners in convincing senior management to take digital preservation action.
For example, MirrorWeb are contributing to the DPC Executive Guide, due for release early in 2019, which will act as a helpful resource for practitioners to persuade senior executives to take digital preservation action.
6. Sensitive Data
Sean Rippington, Digital Archives Officer at the University of St Andrews, raises questions about the legal restrictions on archiving social media, including copyright and GDPR, and about the ethical concerns of capturing so much data on students.
This is for the most part due to the complications of personal data regulations and platform restrictions, which can make it difficult for institutions to produce a useful policy for their researchers and research support staff.
These form the basis of ongoing discussions within the heritage, museums, higher education, and wider markets, with calls for more open data at one end of the spectrum and calls to protect personal data at the other.
7. Support Moving to a New Provider
Whether organisations are managing digital archiving in-house or using an external service provider, they want to know that there is a clear strategy, what the associated costs are and how best to proceed should they need to move to a different digital archiving provider.
This is something MirrorWeb has enjoyed great success with for The National Archives, when we helped them move to a new web and social media archive provider in two weeks. The move covered over 5,000 websites dating from 1996 to the present, as well as tweets and videos from government social media accounts, with the data footprint of the archive being over 120 TB in 2018.
So, whether you need to move a large or a small amount of archive data, the principles of moving the data remain the same, and no matter your requirements, we will be able to support your move to a new provider.
MirrorWeb and Arkivum’s Hybrid, End-to-End Digital Archiving Solution
Arkivum had increasingly been asked about website and social media archiving by their existing customers, but were unable to provide a satisfactory solution. This is because the concept was still cutting-edge and the technology was still being conceived and developed within the community at large.
Then, in 2016, MirrorWeb entered the market in conjunction with The UK National Archives, with the launch of a new web and social media crawler tech stack, with automated QA features, built on AWS cloud technology.
Taking a lead from other successful technology sectors, both companies identified that concentrating on what they do well and specialise in is best practice - but collaborating to bring the specialisms together to improve and satisfy the customer need is the way forward.
MirrorWeb and Arkivum’s hybrid, end-to-end digital archiving solution is the most comprehensive data lifecycle management solution and service for website and social media archiving in the heritage, museums and higher education markets.
The portal provided by MirrorWeb is user-friendly and light-touch, needing minimal user input to set up, crawl and replay web and social media archives in high fidelity. It is also cost-effective, as it harnesses the power of the cloud. This means cost is no longer a barrier to capturing and archiving the digital content we need today, and will need tomorrow once we understand how to use it and where it fits in the archivist’s overarching asset bank.
The WARCs created are passed seamlessly and automatically via an API to Arkivum’s Perpetua system, preserving and future-proofing the web and social media content for all time.
The partnership now covers safeguarding, digital preservation, compliance, records management and integration, and data discovery, filling a major gap for institutions seeking digital preservation as a comprehensive service with open standards and a no-vendor-lock-in philosophy.