On the 20th anniversary of the open LTO tape standard, more companies are turning to the reinvented digital tape-in-the-cloud technology.
By Katia Moskvitch
Remember the days of the Walkman, with your favorite tape stuck inside? The era of audio cassettes may be long gone, but tape technology quietly lives on in our digital world.
Alongside tomorrow’s quantum computers and today’s artificial intelligence, IBM continues to reinvent tape, and it supplies good ol’ cartridges to a growing number of tech giants, Microsoft among them. Microsoft now has tape libraries deployed in at least 18 data centers around the world.
Soon, companies may start adding really futuristic tape tech; IBM, for one, is now developing the first-ever quantum computing-safe tape drive.
Once quantum computers surpass traditional computers, possibly within the next decade, they will likely be able to break the asymmetric (public/private key) encryption that is widely used today. The new drive, though, relies on quantum-safe algorithms from the CRYSTALS (CRYptographic SuiTe for Algebraic LatticeS) suite, so your data should stay safe.
But that’s still to come.
Today, the reinvention of tape is all about blurring, more than ever, the border between the physical cassette and the abstract digital space. The approach to making tape is still similar to what it was nearly a century ago, when German-Austrian engineer Fritz Pfleumer invented magnetic tape to record sound in 1928. One major difference is that back then tape was analogue. The first digital tape drive came out in 1951, introduced with the UNIVAC computer. IBM followed, releasing its own tape drive in 1952.
What we have today is still that same digital tape tech — but on steroids. It’s much faster, offers more storage space than ever before — and most of all, the 21st century tape has gone into the cloud.
For decades, tape was the best option for backup and recovery of data, but the market started shrinking in the early 2000s because of stiff competition from HDD technology combined with data deduplication techniques. Hard disk drives aren’t new either, although they are not as old as tape: IBM made the first HDD, called the IBM Model 350 Disk File, in 1956. It was huge, with fifty 24-inch disks inside a cabinet the size of a cupboard, and could store 5MB of data.
Just like tape technology, HDD tech has kept evolving. It reached a storage capacity of 1 terabyte (TB, or 1,000GB) in 2007, then hit 16TB in 2019 for the largest commercially available HDDs, all while shrinking to just a few inches in size. A 20TB HDD is likely to be launched in 2020/21; Western Digital, one of the leading HDD manufacturers, demoed its 20TB datacenter hard drive in June 2019.
Tape to the Rescue
The world has kept producing more and more data, but for years it was possible to keep storing it all on a constant storage budget. That can’t continue indefinitely: there’s a fundamental physical limit to how small bits can be squeezed.
“Data is growing even faster than in the past, because of the new kinds of AI and analytics, of all the applications that are driven by data,” says IBM physicist Mark Lantz. I meet him at a small lab at IBM Research in Zurich. The air is filled with the low-frequency hum of turning reels. From top to bottom, the room is packed with multiple generations of tape drives, with hundreds of cassettes either inside them or neatly stacked on shelves.
The bits written on tape are about 100 times larger than those on HDD, so there is still plenty of room to shrink them, meaning that it’s possible to keep scaling tape at the same rate for at least two more decades. Thanks to the backwards compatibility of Linear Tape-Open (LTO) drives, a standard celebrating its 20th anniversary this year, and of enterprise tape drives, it’s easy to migrate old tapes to new ones.
The latest IBM digital tapes have a capacity of 20TB, just like hard disk drives. The highest areal density of magnetic tape demonstrated to date is about 201 billion bits per square inch, so a tape cartridge the size of a typical hard drive could store about 330 terabytes of information. “We’re at the renaissance of tape technology,” says Lantz.
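The areal-density figure can be sanity-checked with a little arithmetic. The sketch below is only a back-of-envelope estimate: the half-inch tape width is an assumption (the article gives no width), the function names are made up for illustration, and the calculation ignores servo tracks, formatting and error-correction overhead, so a real cartridge would need more tape than this.

```python
# Back-of-envelope check of the areal-density claim in the article.
AREAL_DENSITY_BITS_PER_SQ_IN = 201e9  # from the article: ~201 billion bits per square inch
CAPACITY_BYTES = 330e12               # from the article: ~330 TB (decimal terabytes)
TAPE_WIDTH_IN = 0.5                   # ASSUMPTION: half-inch-wide tape, fully used for data
INCH_TO_M = 0.0254


def tape_area_sq_in(capacity_bytes, density_bits_per_sq_in):
    """Recording area (square inches) needed to hold capacity_bytes."""
    return capacity_bytes * 8 / density_bits_per_sq_in


def tape_length_m(capacity_bytes, density_bits_per_sq_in, width_in):
    """Tape length (metres) needed if the full width stores data."""
    area = tape_area_sq_in(capacity_bytes, density_bits_per_sq_in)
    return area / width_in * INCH_TO_M


area = tape_area_sq_in(CAPACITY_BYTES, AREAL_DENSITY_BITS_PER_SQ_IN)
length = tape_length_m(CAPACITY_BYTES, AREAL_DENSITY_BITS_PER_SQ_IN, TAPE_WIDTH_IN)
print(f"area needed:   {area:,.0f} square inches")
print(f"length needed: {length:,.0f} m of half-inch tape")
```

Under these assumptions the answer comes out at several hundred metres of tape, which is a plausible amount to wind into a single cartridge.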
One key advantage of tape is that for large amounts of data, tape systems are much more cost-effective to purchase and operate. They are about eight times cheaper, as one tape drive can be used with many cartridges. Operational costs are low because a tape system consumes much less power, and when tapes aren’t in use, they require no power at all.
Tape can also play a vital role in a company’s modern data protection plan. This ‘vintage’ technology is a very effective tool against cyberthreats. Unlike other storage media, a tape cartridge can easily be pulled offline and simply stored on a shelf, creating a physical barrier or “air gap” between hackers and your data. The air gap is a security measure critical to stopping sophisticated ransomware and malware that could otherwise corrupt the data.
The first tech giant to start using digital tape tech was Google. While the company kept its tape storage under wraps at first, the news got out when, during a software update for Gmail, about 1 percent of all Gmail accounts were deleted unintentionally. Google has a redundant system of data centers around the world with multiple copies of data, but software gets replicated across all of the data centers, so the same error happened everywhere. The data was lost in all the data centers, but because Google also had a copy on tape, it was possible to rebuild the lost accounts.
But is it really the same type of tech as the cassette many of us had in our Walkman in the 1980s and ’90s? Fundamentally, the technology is not very different. It all comes down to an electromagnet: magnetic material with a coil wrapped around it. It generates a magnetic field to write data onto the surface of the tape, and a sensor then reads it back again.
Over time, this technology has been shrinking, with the bits of information written on the magnetic surface becoming smaller and smaller. At the same time, the data rate has been getting faster: today, a tape drive transfers 400 megabytes per second. That is much faster than early tape, and modern tape systems are also highly automated. Tape drives are kept in tape libraries where robots are in charge: a robotic system takes a cartridge and loads it into the drive. Automation increases reliability, with no more sloppy humans dropping reels while mounting them. Tape is still there, but people have been taken out of the loop.
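To put the 400 megabytes per second in perspective, here is a rough calculation of how long it would take to stream an entire 20TB cartridge from end to end. It is a sketch only: it assumes decimal units, a sustained rate with no pauses or repositioning, and the function name is invented for this example.

```python
# How long does it take to read a full cartridge at the quoted data rate?
CARTRIDGE_BYTES = 20e12         # 20 TB, the cartridge capacity quoted in the article
DRIVE_RATE_BYTES_PER_S = 400e6  # 400 MB/s, the drive data rate quoted in the article


def stream_time_seconds(capacity_bytes, rate_bytes_per_s):
    """Seconds needed to stream capacity_bytes at a constant rate (ASSUMES no stops)."""
    return capacity_bytes / rate_bytes_per_s


seconds = stream_time_seconds(CARTRIDGE_BYTES, DRIVE_RATE_BYTES_PER_S)
print(f"{seconds:,.0f} s, or about {seconds / 3600:.1f} hours")
```

Even at this speed, reading a whole cartridge takes the better part of a day, which is one reason tape suits backup and archive workloads rather than data that is read constantly.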
Soaring to the cloud — and beyond
And then there’s the impact of the cloud. In the past, most of the data in the cloud was on hard disk. But hard disks are expensive and power-hungry, as they are constantly spinning. As they spin, they generate heat and need to be cooled, driving power costs up. Tape, by contrast, just sits in a slot and doesn’t consume any power, meaning the OPEX and cost of ownership are much lower. Because of these advantages, hyperscale cloud providers are now starting to introduce tape into their infrastructure to have a low-cost storage solution, says Robert Haas, the head of the Cloud and AI Systems Research department at IBM Research in Zurich.
There is a trade-off though: with an HDD, one can access data in a few tens of milliseconds. In a tape system, the robot has to get the cartridge, load it into a drive and fast-forward it to the right place, and that whole process takes tens of seconds at least.
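The trade-off can be made concrete with a simple model: retrieval time is the wait for the first byte plus the transfer time. The numbers below are illustrative only. The 30-second tape latency and the 20-millisecond HDD latency are picked from the ranges given in the text, the 400 MB/s tape rate is the figure quoted earlier, and the 250 MB/s HDD rate is purely an assumption, as are the function names.

```python
# Simple model: retrieval time = time to first byte + transfer time.
TAPE_LATENCY_S = 30.0  # ASSUMPTION: "tens of seconds" taken as 30 s
TAPE_RATE = 400e6      # bytes/s, tape drive rate quoted in the article
HDD_LATENCY_S = 0.02   # ASSUMPTION: "a few tens of milliseconds" taken as 20 ms
HDD_RATE = 250e6       # bytes/s, ASSUMPTION (not from the article)


def retrieval_time(size_bytes, latency_s, rate_bytes_per_s):
    """Seconds to fetch one object of size_bytes from a single device."""
    return latency_s + size_bytes / rate_bytes_per_s


def break_even_bytes(lat_a, rate_a, lat_b, rate_b):
    """Object size at which the higher-latency, faster device A catches up with B."""
    return (lat_a - lat_b) / (1 / rate_b - 1 / rate_a)


for size in (1e6, 1e9, 100e9):  # 1 MB, 1 GB, 100 GB
    tape = retrieval_time(size, TAPE_LATENCY_S, TAPE_RATE)
    hdd = retrieval_time(size, HDD_LATENCY_S, HDD_RATE)
    print(f"{size / 1e9:8.3f} GB  tape {tape:8.2f} s   hdd {hdd:8.2f} s")

even = break_even_bytes(TAPE_LATENCY_S, TAPE_RATE, HDD_LATENCY_S, HDD_RATE)
print(f"break-even object size: {even / 1e9:.1f} GB")
```

With these assumed figures, small objects come back far sooner from disk, while for very large objects the mount-and-seek delay of tape is amortized over the transfer. Different assumptions would move the break-even point.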
For many applications, particularly backup and archival needs, that is not an issue. Data for such applications is ‘cold’ data, meaning it hasn’t been accessed in months and just sits in the cloud or in data centers. “On social media, when somebody first posts something, it’s beautiful. It’s known as hot data. But a week later, nobody looks at it and it turns into cold data. As data ages, it becomes less frequently accessed,” says Lantz.
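The hot/cold distinction is often turned into a tiering rule based on how long ago something was last accessed. The following is a minimal sketch of such a rule, not a description of how any particular cloud provider does it: the 30-day threshold, the object names and the function names are all hypothetical.

```python
from datetime import datetime, timedelta

# ASSUMPTION: data untouched for 30 days counts as "cold"; real policies vary.
COLD_AFTER = timedelta(days=30)


def tier_for(last_access, now, cold_after=COLD_AFTER):
    """Return 'cold' if the object has not been accessed for cold_after, else 'hot'."""
    return "cold" if now - last_access >= cold_after else "hot"


def plan_migration(last_access_by_name, now):
    """Names of objects that would be moved to tape, in alphabetical order."""
    return sorted(
        name
        for name, last_access in last_access_by_name.items()
        if tier_for(last_access, now) == "cold"
    )


# Hypothetical objects and their last access times.
now = datetime(2020, 6, 1)
objects = {
    "fresh_post": datetime(2020, 5, 30),
    "last_years_puppy_video": datetime(2019, 6, 1),
}
print(plan_migration(objects, now))
```

A fixed age cutoff is the simplest possible policy; it says nothing about data that suddenly becomes popular again, which is where smarter access prediction would come in.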
And in the future, this access latency (the time it takes to get to the cold data) may be resolved anyway with the help of artificial intelligence. At IBM, researchers are working on combining tape and AI to predict accesses to data. This way, data would be brought back from tape even before a user wants to read that touching social media post from last month or watch a year-old cute puppy video.
Today, social media giants have huge data centers with hard disk drives used for backup of primary data. To make this more cost-effective, the power is turned off for 94 percent of the disks at any given time, and to access a piece of data, it’s necessary to wait until the disks where it is stored are powered up. As a result, it can take several hours before a piece of data can be accessed, while with tape it only takes tens of seconds. “Because of that, there is a lot of interest in extending the adoption of tape for backups and, more generally, for all archival data, creating an active archive,” says Lantz. “There’s a large demand for that in the cloud from hyperscalers.”
So if you do come across an old Michael Jackson tape from 1980, don’t chuck it in the bin. Who knows what technology we will have in a decade or two that may suddenly be able to play it?
Translated from: https://medium.com/@IBMResearch/why-more-tech-giants-opt-for-ibms-tape-data-storage-in-the-cloud-92b698990217