What rapid data growth means for universities in 2020
May 06, 2020 • 7 min read
The amount of data in the world is set to rise tenfold - from 16 ZB to over 160 ZB - between 2016 and 2025, according to a report from IDC. (To amass 1 ZB of data, you would have to fill over 34 billion smartphones to full capacity.)
It’s a mind-blowing amount of data. So, ask yourself:
- Have you ever thought about how much data is created by your university in a single day?
- How much content is shared and distributed online by students, lecturers and other staff?
- How much content is being channeled via social feeds, internal communication platforms and your website?
Digital content is a precious commodity. But it can all easily be lost forever without the right planning and foresight.
Ultimately, for digital content to drive long-term success, we need to take proactive steps to capture, protect and future-proof web and social media data.
How are universities responding to this?
Web archiving is becoming more important for universities and higher education providers (HEPs) in general. Increasingly, these organisations are using specialised archiving technology (or 'crawls') to scour their digital estate and create ISO-compliant records.
This makes sense: universities are well acquainted with the need to archive, and not only for compliance reasons. For years, these institutions have been preserving information of importance, keeping records that are incredibly valuable to students, faculty, researchers and wider society.
To help universities during these uncertain times, MirrorWeb has created a heritage and compliance service that makes digital archiving solutions easier to access and supports long-term preservation.
However, universities' ongoing shift towards digitisation means many organisations now have key information that can only be found online.
As more data is created, and we're well on track for 2025's heady predictions, it's worth understanding what this means for universities:
1. Greater digital fragility risk
Like a lot of organisations, many universities believe their digital assets (websites, online content, research, student data, etc.) are sufficiently protected by their security standards.
While this does protect against immediate threats, universities are increasingly exposing themselves to the long-term risk of 'digital fragility'. This is the concept of continual data loss from an infrastructure that fails to keep up to date with technological needs.
Here are a few common causes of digital fragility:
- Obsolete technologies and formats - From floppy disks to phased-out file formats, it often takes only a few years for hardware or software to become obsolete and unsupported. This can make it difficult or even impossible to retrieve the content stored there.
- Use of third-party platforms - Organisations now publish a vast amount of digital content via web and social media platforms. These platforms may not be under an organisation's direct control, and they use complex, interactive, non-standardised formats.
- Reliance on content management and backups - Most organisations make provisions for short and medium-term secure data storage, but do not consider how to protect and future-proof this content so it can still be accessed and used in the long term.
The consequences of not capturing digital content today are hard to overstate, given the impact this will have on future generations' ability to access these archives. If you're not capturing web and social content now, doing it retrospectively in 12 months' time is virtually impossible.
2. New regulatory standards
University websites now represent a key record of what is happening at any given time and act as a key repository for official documents. For example, many university publications have been replaced by web publications, while information such as course materials, research outputs and blogs is now found on websites.
Preserving digital assets like these isn't just best practice, but a regulatory requirement. As well as having to adhere to GDPR and accessibility standards, universities have to meet requirements set by the Competition and Markets Authority (CMA) over the information they publish for students.
This online information is vital to capture for legal reasons, and you can learn more about it in the guidance titled Advice on Consumer Protection Law for UK Higher Education Providers. Complying with such regulations is an important part of protecting universities against the risks of non-compliance (which can result in hefty fines and, even worse, reputational damage).
3. Heightened need to protect research data
Researcher databases need preserving for future generations. Research projects often involve the creation of websites, which help ensure compliance with open data initiatives, funder requirements and the Research Excellence Framework (REF).
Such websites are highly valuable assets, which is why many universities create archives of them to support REF. In addition to satisfying compliance requirements, and ensuring there is a fully interactive record made to stand the test of time, this also provides evidence of university research outputs used or praised by external parties.
Most universities also encourage their researchers to deposit large web and social datasets within specialist data centres. This makes the data more discoverable to the research community, who might reuse it.
4. More opportunities to strengthen legacies
We've had a long time to get used to moving from the physical archive to the digital archive, ensuring that vital records are always accessible. Unfortunately, due to the rate at which digital content is being created and the threat of digital fragility (as discussed above), time isn't a luxury universities have when it comes to digital preservation.
Even if we're not entirely sure about the potential insight that will be generated from archived digital data in the future, it's better to capture it. This means you not only own the data but, as we learn, evolve and develop, can also use this historical information to your advantage, knowing that it will remain accessible for generations to come.
Just as universities champion legacies with statues, plaques and cavernous libraries, they should do the same for the vast digital estates being created.
This isn't just about best-practice record-keeping, but about helping to protect a legacy that will only become more valued in time. Creating fit-for-purpose archives of these digital estates will increasingly become an aim for many forward-thinking HEPs.
5. Data isn't just an asset, it's an investment
Website and social media archiving helps to preserve an organisation’s investments in digital communications - for example, professionally produced videos and blog content - that would otherwise be at risk of being lost.
We've already discussed how universities will be forced to think about their digital estates differently. However, this is another way that HEPs will increasingly come to regard their data: as an investment, as well as an asset.
With websites, digital content and social media now requiring greater investment, the output of these projects is becoming more valuable.
This is content and data that might once have provided only short to mid-term value, but which now has long-term value and brings real benefits to organisations. This means universities, like all businesses, are starting to think more about how they protect this value in the long term.