
Machine learning for video transcoding

At YouTube we care about the quality of the pixels we deliver to our users. With many millions of devices uploading to our servers every day, the content variability is so huge that delivering an acceptable audio and video quality in all playbacks is a considerable challenge. Nevertheless, our goal has been to continuously improve quality by reducing the amount of compression artifacts that our users see on each playback. While we could do this by increasing the bitrate for every file we create, that would quite easily exceed the capacity of many of the network connections available to you. Another approach is to optimize the parameters of our video processing algorithms to meet bitrate budgets and minimum quality standards. While Google’s compute and storage resources are huge, they are finite and so we must temper our algorithms to also fit within compute requirements. The hard problem then is to adapt our pipeline to create the best quality output for each clip you upload to us, within constraints of quality, bitrate and compute cycles.


This is a well known triad in the world of video compression and transcoding. The problem is usually solved by finding a sweet spot of transcoding parameters that seem to work well on average for a large number of clips. That sweet spot is sometimes found by trying every possible set of parameters until one is found that satisfies all the constraints. Recently, others have been using this “exhaustive search” idea to tune parameters on a per clip basis.


What we’d like to show you in this blog post is a new technology we have developed that adapts our parameter set for each clip automatically using Machine Learning. We’ve been using this over the last year for improving the quality of movies you see on YouTube and Google Play.


The good and bad about parallel processing



We ingest more than 400 hours of video per minute. Each file must be transcoded from the uploaded video format into a number of other video formats with different codecs so we can support playback on any device you might have. The only way we can keep up with that rate of ingest and quickly show you your transcoded video in YouTube is to break each file into pieces called “chunks,” and process these in parallel. Every chunk is processed independently and simultaneously by CPUs in our Google cloud infrastructure. The complexity involved in chunking and recombining the transcoded segments is significant. Quite aside from the mechanics of assembling the processed chunks, maintaining the quality of the video in each chunk is a challenge. This is because, to have as speedy a pipeline as possible, our chunks don’t overlap and are also very small: just a few seconds long. So the good thing about parallel processing is increased speed and reduced latency. The bad thing is that, without information about the video in the neighboring chunks, it is difficult to control chunk quality so that there is no visible difference between the chunks when we tape them back together. Small chunks don’t give the encoder much time to settle into a stable state, so each encoder treats each chunk slightly differently.
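
To make the chunking idea concrete, here is a minimal sketch in Python of splitting a clip into short, non-overlapping chunks and processing them in parallel. It is not our production pipeline: `split_into_chunks`, `transcode_chunk`, the 5 second chunk length and the 30 fps frame count are all hypothetical stand-ins chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(duration_s, chunk_s):
    """Return non-overlapping (start, end) spans that cover the whole clip."""
    spans = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        spans.append((start, end))
        start = end
    return spans


def transcode_chunk(span):
    """Hypothetical stand-in for an encoder call on one chunk.

    Each call sees only its own span, which mirrors the point made above:
    the encoder knows nothing about the neighboring chunks.
    """
    start, end = span
    return {"start": start, "end": end, "frames": round((end - start) * 30)}


def transcode_parallel(duration_s, chunk_s=5.0, workers=4):
    """Process every chunk independently, then return results in clip order."""
    spans = split_into_chunks(duration_s, chunk_s)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map preserves input order, so reassembly is a simple concat.
        return list(pool.map(transcode_chunk, spans))


if __name__ == "__main__":
    for piece in transcode_parallel(12.0):
        print(piece)
```

The sketch keeps the chunks fully independent on purpose, which is what makes the parallelism cheap and also what makes consistent quality hard.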

Smart parallel processing



You could say that we are shooting ourselves in the foot before starting the race. Clearly, if we communicate information about chunk complexity between the chunks, each encoder can adapt to what’s happening in the chunks after or before it. But inter-process communication increases overall system complexity and requires some extra iterations in processing each chunk.


Actually, OK, truth is we’re stubborn here in Engineering and we wondered how far we could push this idea of “don’t let the chunks talk to each other.”


The plot below shows an example of the PSNR in dB per frame over two chunks from a 720p video clip, using H.264 as the codec. A higher value of PSNR means better picture quality and a lower value means poorer quality. You can see that one problem is the quality at the start of a chunk is very different from that at the end of the chunk. Aside from the average quality level being worse than we would like, this variability in quality causes an annoying pulsing artifact.
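
For readers who have not met PSNR before, the sketch below shows the standard definition for 8-bit samples, plus a simple way to put a number on the quality swing inside one chunk. The helper name `quality_swing` and the idea of summarizing the swing as max minus min are our illustrative assumptions, not a metric from the plot.

```python
import math


def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


def quality_swing(psnr_per_frame):
    """Hypothetical summary: dB gap between the best and worst frame in a chunk."""
    return max(psnr_per_frame) - min(psnr_per_frame)
```

A large swing within a chunk, repeated chunk after chunk, is what shows up on screen as pulsing.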


Because of small chunk sizes, we would expect that each chunk behaves like the previous and next one, at least statistically. So we might expect the encoding process to converge to roughly the same result across consecutive chunks. While this is true much of the time, it is not true in this case. One immediate solution is to change the chunk boundaries so that they align with high activity video behavior like fast motion, or a scene cut. Then we would expect that each chunk is relatively homogeneous, so the encoding result should be more uniform. It turns out that this does improve the situation, but not as much as we’d like, and the instability is still often there.


The key is to allow the encoder to process each chunk multiple times, learning on each iteration how to adjust its parameters in anticipation of what happens across the entire chunk instead of just a small part of it. This results in the start and end of each chunk having similar quality, and because the chunks are short, it is now more likely that the differences across chunk boundaries are also reduced. But even then, we noticed that it can take quite a number of iterations for this to happen. We observed that the number of iterations is affected a great deal by the quantization-related parameter (CRF) of the encoder on that first iteration. Even better, there is often a “best” CRF that allows us to hit our target bitrate at a desired quality with just one iteration. But this “best” setting is actually different for every clip. That’s the tricky bit. If only we could work out what that setting was for each clip, then we’d have a simple way of generating good looking clips without chunking artifacts.
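
The toy example below illustrates why the first CRF matters so much. It is only a sketch under loud assumptions: `toy_encode` is a made-up exponential stand-in for a real encoder, and the update rule with its deliberately wrong `assumed_slope` is a hypothetical correction step, not the adaptation our encoder actually performs.

```python
import math


def toy_encode(crf):
    """Made-up stand-in for an encoder: bitrate in Mbps for a given CRF."""
    return 50.0 * math.exp(-0.12 * crf)


def passes_to_hit_target(initial_crf, target_mbps, tolerance=0.02,
                         assumed_slope=0.2, max_passes=20):
    """Re-encode until the bitrate is within `tolerance` of the target.

    Returns (number of encode passes, final CRF). The correction uses an
    assumed slope that does not match the toy encoder, so each pass only
    removes part of the error, as a real rate controller might.
    """
    crf = initial_crf
    for passes in range(1, max_passes + 1):
        bitrate = toy_encode(crf)
        if abs(bitrate / target_mbps - 1.0) <= tolerance:
            return passes, crf
        # Too many bits -> raise CRF; too few -> lower it.
        crf += math.log(bitrate / target_mbps) / assumed_slope
    return max_passes, crf


if __name__ == "__main__":
    print(passes_to_hit_target(19.2, 5.0))  # starts close to the "best" CRF
    print(passes_to_hit_target(30.0, 5.0))  # starts far away
```

In this toy setup a well-chosen starting CRF finishes in a single pass, while a poor one needs several, which is the behavior described above.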


The plot on the right shows the result of many experiments with our encoder at varying CRF (constant rate factor, the encoder’s constant-quality setting) values, over the same 1080p clip. After each experiment we measured the bitrate of the output file, and each point shows the (CRF, bitrate) pair for that experiment. There is a clear relationship between these two values. In fact it is very well modeled by an exponential fit with three parameters, and the plot shows just how well that modeled line fits the observed data points. If we knew the parameters of the line for our clip, then we’d see that to create a 5 Mbps version of this clip (for example) we’d need a CRF of about 20.
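
Here is a small sketch of how “reading off” the CRF from such a curve could work. The exact functional form is not given in this post, so the shape `bitrate = a * exp(-b * crf) + c` is an assumption, as are the names `a`, `b`, `c` and the sample parameter values. The 0 to 51 clamp assumes the CRF range of 8-bit x264.

```python
import math


def bitrate_mbps(crf, a, b, c):
    """Assumed three-parameter exponential model of bitrate versus CRF."""
    return a * math.exp(-b * crf) + c


def crf_for_bitrate(target_mbps, a, b, c, crf_min=0.0, crf_max=51.0):
    """Invert the assumed model to find the CRF that yields the target bitrate.

    Requires a > 0 and b > 0. The result is clamped to [crf_min, crf_max].
    """
    if target_mbps <= c:
        raise ValueError("target is at or below the model floor c")
    crf = -math.log((target_mbps - c) / a) / b
    return min(max(crf, crf_min), crf_max)


if __name__ == "__main__":
    # Hypothetical parameters for one clip, for illustration only.
    params = dict(a=50.0, b=0.12, c=0.5)
    print(crf_for_bitrate(5.0, **params))
```

Once the three parameters are known for a clip, finding the CRF is a single closed-form evaluation, with no trial encodes required.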

Pinky and the Brain



What we needed was a way to predict our three curve fitting parameters from low complexity measurements about the video clip. This is a classic problem in machine learning, statistics and signal processing. The gory mathematical details of our solution are in technical papers that we published recently [1]. You can see there how our thoughts evolved. Anyway, the idea is rather simple: predict the three parameters given things we know about the input video clip, and read off the CRF we need. This prediction is where the “Google Brain” comes in.


The “things we know about the input video clip” are called video “features.” In our case there is a vector of features containing measurements like input bit rate, motion vector bits in the input file, resolution of the video and frame rate. These measurements can also be made from a very fast low quality transcode of the input clip to make them more informative. However, the exact relationship between the features and the curve parameters for each clip is rather more complicated than an equation we could write down. So instead of trying to discover that explicitly ourselves, we turned to Machine Learning with Google Brain. We first took about 10,000 video clips and exhaustively tested every quality setting on each, measuring the resulting bitrate from each setting. This gave us 10,000 curves which in turn gave us 4 x 10,000 parameters measured from those curves.


The next step was to extract features from our video clips. Having generated the training data and the feature set, our Machine Learning system learned a “Brain” configuration that could predict the parameters from the features. Actually we used both a simple “regression” technique and the Brain. Both outperformed our existing strategy. Although the process of training the Brain is relatively computationally heavy, the resulting system was actually quite simple and required only a few operations on our features. That meant that the compute load in production was small.
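
As a rough illustration of the simple “regression” route, the sketch below fits an ordinary least-squares model that maps a feature vector to one curve parameter. One such model would be trained per parameter. This is an assumption about what a basic regression could look like, not a description of our system, and the feature values in the usage example are invented.

```python
def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear(features, targets):
    """Least-squares weights via the normal equations; the bias is last."""
    rows = [list(f) + [1.0] for f in features]
    d = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(d)]
    return solve(xtx, xty)


def predict(weights, feature_vec):
    """Apply fitted weights to one feature vector."""
    return sum(w * f for w, f in zip(weights, list(feature_vec) + [1.0]))


if __name__ == "__main__":
    # Invented training data: two features per clip, one target parameter.
    feats = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0)]
    target = [2.0 * f1 + 3.0 * f2 + 1.0 for f1, f2 in feats]
    w = fit_linear(feats, target)
    print(predict(w, (6.0, 4.0)))
```

Note how cheap prediction is once training is done: a handful of multiplies and adds per clip, which matches the small production compute load mentioned above.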


Does it work?

The plot on the right shows the performance of the various systems on 10,000 video clips. Each point (x,y) represents the percentage of clips (y-axis) in which the resulting bitrate after compression is within x% of the target bitrate. The blue line shows the best case scenario where we use exhaustive search to get the perfect CRF for each clip. Any system that gets close to that is a good one. As you can see at the 20% rate, our old system (green line) would hit the target bitrate 15% of the time. Now with our fancy Brain system we can hit it 65% of the time if we use features from your upload only (red line), and better than 80% of the time (dashed line) using some features from a very fast low quality transcode.
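
Each point on such a plot can be computed with a few lines of code. The sketch below shows one way to do it, assuming we have the achieved bitrate for every clip and a common target. The function name and the sample numbers are hypothetical.

```python
def percent_within(achieved_mbps, target_mbps, tolerance_pct):
    """Percentage of clips whose bitrate lands within tolerance_pct of target."""
    if not achieved_mbps:
        raise ValueError("need at least one clip")
    allowed = target_mbps * tolerance_pct / 100.0
    hits = sum(1 for b in achieved_mbps if abs(b - target_mbps) <= allowed)
    return 100.0 * hits / len(achieved_mbps)


if __name__ == "__main__":
    # Invented results for five clips encoded toward a 5 Mbps target.
    achieved = [4.0, 4.9, 5.5, 6.5, 5.0]
    for x in (10, 20):
        print(x, percent_within(achieved, 5.0, x))
```

Sweeping the tolerance from small to large and plotting the result traces out one line of the plot for one system.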
[Figure: percentage of clips (y-axis) whose bitrate is within x% of the target bitrate (x-axis), for each system]


But does this actually look good? You may have noticed that we concentrated on our ability to hit a particular bitrate rather than specifically addressing picture quality. Our analysis of the problem showed that this was the root cause. Pictures are the proof of the pudding and you can see some frames from a 720p video clip below (shot from a racing car). The top row shows two frames at the start and end of a typical chunk and you can see that the quality in the first frame is way worse than the last. The bottom row shows the frames in the same chunk using our new automated clip adaptive system. In both cases the measured bitrate is the same at 2.8 Mbps. As you can see, the first frame is much improved and as a bonus the last frame looks better as well. So the temporal fluctuation in quality is gone and we also managed to improve the clip quality overall.




This concept has been used in production in our video infrastructure division for about a year. We are delighted to report it has helped us deliver very good quality streams for movies like "Titanic" and most recently "Spectre." We don’t expect anyone to notice, because they don’t know what it would look like otherwise.

But there is always more we can do to improve on video quality. We’re working on it. Stay tuned.

Anil Kokaram, Engineering Manager, AV Algorithms Team, recently watched "Tony Cozier speaking about the West Indies Cricket Heritage Centre," Yao-Chung Lin, Software Engineer, Transcoder Team, recently watched "UNDER ARMOUR | RULE YOURSELF | MICHAEL PHELPS," Michele Covell, Research Scientist, recently watched "Last Week Tonight with John Oliver: Scientific Studies (HBO)" and Sam John, Software Engineer, Transcoder Team, recently watched "Atlantis Found: The Clue in the Clay | History."



[1] Optimizing transcoder quality targets using a neural network with an embedded bitrate model, Michele Covell, Martin Arjovsky, Yao-Chung Lin and Anil Kokaram, Proceedings of the Conference on Visual Information Processing and Communications 2016, San Francisco.
Multipass Encoding for reducing pulsing artefacts in cloud based video transcoding, Yao-Chung Lin, Anil Kokaram and Hugh Denman, IEEE International Conference on Image Processing, pp 907-911, Quebec 2015.
