A university’s digital content is a precious commodity that can deliver value in the short term, in the medium term and for generations to come.
But even though organisations are creating and storing more data than ever before (2.5 quintillion bytes of data are created digitally every day), it can all easily be lost forever without the right planning and foresight.
Ultimately, for digital content to drive long-term success, we need to take proactive steps to capture, protect and future-proof web and social media data.
"Capture it all today, and work out how we might use it tomorrow"
If only, you might say… and it’s actually something of a counter-intuitive statement for the university archivist confronted by today’s challenges.
This is because archivists usually face the unenviable task of deciding what should and what cannot be archived, due to limited storage capacity, a lack of resources and diminishing budgets.
The future archive will be liberated from these restrictions, as digital storage is effectively unlimited, scalable and relatively low cost in comparison.
When you then add in full-text and faceted search technology, AI and machine learning, the ability to collate, instantly recall and find the most minute piece of data in a mind-blowing universe of items across every conceivable format will revolutionise the way we think about the archive of the future.
But, before we run too fast, let’s consider where we are today.
We know there are many drivers for digital archiving in universities, but what are the barriers preventing organisations from engaging with a web and social media archiving solution?
We take a closer look:
1. Lack of an archiving solution
A lack of available solutions has naturally made website and social media archiving difficult for universities.
This has even resulted in unfortunate situations in which librarians have had to archive their tweets by copying and pasting them into their library management systems, and then referencing them back in their online categories. After all, what else could you do when Twitter first emerged and there wasn’t really a solution to know about and to implement?
This lack of available digital archiving solutions has been influenced by a community in which there have been many discussions about problems but a lack of tangible solutions - and it is only recently that state-of-the-art, secure and cloud-native web and social media archiving solutions have arrived in the market to support organisations.
2. Lack of awareness of digital archiving solutions
There are emerging web and social media archiving solutions, but there remains a lack of knowledge about them or how to proceed with a digital archiving solution. This is mainly because, when we’re talking about something as new in archival reference terms as web archiving and social media, people don’t think there is a solution.
This was apparent at recent events such as the ARA Conference, where many archivists were unaware that it was possible to archive Instagram, Facebook or Twitter, and to capture the metadata around each of those platforms and each of those interactions.
There is a need to educate the market about available digital archiving solutions for universities - because whether it’s a website, a blog, a survey, content behind a firewall, social media such as Instagram, or even a Snapchat post that lasts a matter of minutes, it can be captured and preserved.
3. Meeting big data requirements
Universities are complex, information-driven organisations that house large volumes of research data, so a common concern in the higher education market is whether a digital archiving solution can meet their big data requirements.
Associated factors include whether available solutions can cope with large websites, the regularity of crawls and the cost of storage, and whether this data can then be made available for practical use. Large image libraries are also viewed as difficult to manage and as presenting a big data requirement of their own.
We have found cloud-native digital archiving to be the best solution for meeting organisations’ big data needs. It is a scalable, cost-effective solution able to meet organisational requirements no matter the size of a website or social media account.
4. Lack of budget and resources
The initial reaction amongst organisations in the higher education market is that web and social media archiving is hard, expensive and going to need external funding and an expert. Many organisations hold this assumption, but cost will ultimately depend on the size of the dataset (e.g. a single website or multiple websites) and the frequency of archiving.
As an example, the cost for financial clients is likely to be higher because they have to archive every day. If archiving takes place once a month or once a year, depending on a university’s requirements, the price would be lower.
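To make the relationship between dataset size, crawl frequency and volume concrete, here is a rough back-of-the-envelope sketch in Python. It is illustrative only: the function name, the 2 GB site size and the assumption that every crawl stores a full copy of the site are ours for the sake of the example, and none of it reflects MirrorWeb pricing or how any particular archiving service stores data.

```python
# Illustrative sketch only: names and figures are assumptions, not real pricing.
# Assumes every crawl stores a complete copy of each site (no deduplication).

CRAWLS_PER_YEAR = {"daily": 365, "monthly": 12, "yearly": 1}


def annual_capture_volume_gb(site_size_gb: float, schedule: str, site_count: int = 1) -> float:
    """Return the total data captured in one year for a given crawl schedule."""
    return site_size_gb * site_count * CRAWLS_PER_YEAR[schedule]


if __name__ == "__main__":
    # Compare the same hypothetical 2 GB website under three crawl schedules.
    for schedule in CRAWLS_PER_YEAR:
        volume = annual_capture_volume_gb(site_size_gb=2.0, schedule=schedule)
        print(f"{schedule:>8}: {volume:,.0f} GB captured per year")
```

Under these assumptions, a daily schedule captures 365 times as much data per year as a yearly one for the same website, which is why archiving frequency tends to matter as much as site size when estimating cost.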
Scalable, cloud-native archiving solutions mean organisations are able to decrease the overheads that would otherwise be needed to maintain infrastructure, including outgoings on space for local servers, power and hardware upgrade cycles, and to reduce costs by meeting storage requirements on a project-by-project basis.
A lack of budget, however, can also mean a lack of capacity to identify and engage with digital archiving services - which is one of the reasons why MirrorWeb now provide higher education organisations with the opportunity to access a free 30GB ‘research archive’ to ‘get under the bonnet’ and inspect the digital archiving solution.
5. Convincing senior management
It can be difficult to raise awareness amongst people in senior management positions about the need for social media and website archiving - but work is being done in the community to support practitioners in convincing senior management to take digital preservation action.
For example, MirrorWeb are contributing to the DPC/UNESCO ‘Executive Guide on Digital Preservation’. This is due for release early in 2019 and will include tailorable messages and evidence to enable digital preservationists across all sectors and organisation types to create a document for the attention of senior management within their organisation.
6. Support moving to a new provider
Whether organisations are managing digital archiving in-house or using an external service provider, they want to know that there is a clear strategy, what the associated costs are and how best to proceed should they need to move to a different digital archiving provider.
MirrorWeb has enjoyed great success with this for The National Archives, when we helped them move to a new web and social media archive provider in two weeks. That covered over 5,000 websites from 1996 to the present, as well as tweets and videos from government social media accounts, with the data footprint of the archive being over 120 TB in 2018.
John Sheridan, Digital Director at the UK National Archives, discusses how MirrorWeb moved their web and social media archive to our solution.
So, whether you need to move a large or a small amount of archive data, the principles of moving the data remain the same, and no matter your requirements, we will be able to support your move to a new provider.
7. Sensitive data
Questions around sensitive data cover issues such as the legal restrictions on social media content, including copyright and GDPR, and the ethical concerns of capturing so much data on students. This brings into focus the complications of personal data regulations and platform restrictions, which can make it difficult for institutions to produce a useful digital preservation policy for their researchers and research staff.
These form the basis of ongoing discussions within universities and other organisations, with calls for more open data at one end of the spectrum and calls to protect personal data at the other. Although there is no clear direction at present, we can say with certainty that MirrorWeb’s solution is GDPR compliant.
Get your free 30GB HE Essentials 'research archive'
MirrorWeb and Arkivum’s hybrid, end-to-end digital archiving solution is the most comprehensive data lifecycle management solution for website and social media archiving in universities.
David Clee, CEO of MirrorWeb, gives a quick demo on how to use the MirrorWeb portal to view archives of your website.
The portal provided by MirrorWeb is user-friendly and light-touch, needing minimal user input to set up, crawl and replay web and social media archives in high fidelity, and it is cost-effective thanks to cloud technology.
The WARC (Web ARChive) files created are passed seamlessly and automatically via an API to Arkivum’s Perpetua system, preserving and future-proofing the web and social media content for all time.
Click the banner below to access your free 30GB ‘research archive’ and receive further details on the special ‘HE Essentials’ package to start your university’s extended and comprehensive archive service from as little as £1,800 per year for a combined web and social solution.