6054 Commits

Author SHA1 Message Date
Matt Oliver ef79796c8d project: Update for 1.15.1 merge. 2025-06-14 21:52:12 +10:00
Matt Oliver 8186b704b7 Merge commit '39e8b9dcd4696d9ac3ebd4722e012488382f1adb' 2025-06-14 21:02:34 +10:00
Matt Oliver bcecc98a79 project: Update for 1.15.0 merge. 2025-06-14 19:48:26 +10:00
Matt Oliver 70414c8112 Merge commit '9f9b7e9ba2eb9d01640a9e69a3d655866265cf7f' 2025-06-14 19:22:24 +10:00
Jerome Jiang 39e8b9dcd4 Bump VPX_EXT_RATECTRL_ABI_VERSION
Fix the order of the newly added codec control

Bug: webm:384672478
Change-Id: I045c58865399ea9d74c91c1a5521215a0d2032f7
2025-01-10 14:30:59 -05:00
Jerome Jiang 2f1ad02bd5 Add changelog for v1.15.1 release
Bug: webm:384672478
Change-Id: I3346b06eb05e306eb23f7281b65ff7e9c84e7e6b
2025-01-09 14:52:08 -05:00
Jerome Jiang 69da847f66 Bump up version major
Before v1.15.0: c=10, a=1, r=0

Rule #3: source code has changed, increment r:
r=1

Rule #4: interfaces were removed in vpx_tpl.h, set r=0, increment c:
c=11, r=0

Rule #5: no interfaces have been added

Rule #6: interfaces were removed in vpx_tpl.h, set a=0:
a=0

After release: c=11, a=0, r=0

major = c-a = 11
minor = a = 0
patch = r = 0

Bug: webm:384672478
Change-Id: I2e70e7e35c64ece32eaf1dc5625640965483f9b9
2025-01-07 14:17:40 -05:00
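The version arithmetic in the commit above follows libtool's current/revision/age rules. A minimal sketch in C replays those steps; the struct and function names are hypothetical and are not part of libvpx.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper type; names are illustrative, not part of libvpx. */
typedef struct {
  int current;  /* c */
  int revision; /* r */
  int age;      /* a */
} LibtoolVersion;

/* Applies libtool rules #3 to #6 for one release. */
static LibtoolVersion bump_version(LibtoolVersion v, bool source_changed,
                                   bool interfaces_changed,
                                   bool interfaces_added,
                                   bool interfaces_removed) {
  assert(v.current >= v.age);
  if (source_changed) v.revision += 1; /* rule #3 */
  if (interfaces_changed) {            /* rule #4 */
    v.current += 1;
    v.revision = 0;
  }
  if (interfaces_added) v.age += 1;  /* rule #5 */
  if (interfaces_removed) v.age = 0; /* rule #6 */
  return v;
}

/* major = c - a, as stated in the commit message. */
static int version_major(LibtoolVersion v) { return v.current - v.age; }
```

Starting from c=10, a=1, r=0 with an interface removal and no additions, this sketch arrives at c=11, a=0, r=0, matching the values listed in the message.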
James Zern 82a0c8a2db configure: add support for darwin24 (macOS 15.x)
Bug: webm:379534940
Change-Id: I8777b6bb8653a31080801e35916df9aa39a4c999
(cherry picked from commit 6e23d972a7a717f2ba3970c82b6b96d350b5bcde)
2024-11-19 22:03:26 +00:00
Matt Oliver bbedc9be7c project: Update template file for WinRT. 2024-11-09 19:38:13 +11:00
Jerome Jiang 9f9b7e9ba2 Changelog: add neon optimization speed up stats
Bug: webm:372498543
Change-Id: I297be5efb602b0181c2b25ff8b50060c10263130
2024-10-23 14:24:57 -04:00
Jerome Jiang 0ba09cc79f Update CHANGELOG and version
Bug: webm:372498543
Change-Id: Ieddfa0b18f8c5e53ab65e04b52b5a601c672ba62
2024-10-22 15:36:11 -04:00
James Zern 3939c5ebb0 vpx_highbd_convolve8_avg_sve2: fix C fallback typo
vpx_highbd_convolve8_c -> vpx_highbd_convolve8_avg_c

Change-Id: I8bc73c59d3e654739ee5c42a295f4ecdee6d7631
(cherry picked from commit 3500e57e52b6af057ed54223e15e560a95df8479)
2024-10-10 13:32:53 -04:00
Jerome Jiang 816a90fe76 Update AUTHORS and .mailmap
Bug: webm:372498543
Change-Id: I5041fd1558c0a36dce10395ec6e836f3d55384dc
2024-10-09 15:18:53 -04:00
Marco Paniconi 192b4a4ce7 rtc-vp9: Always disable svc_use_low_part
Possible fix for issue below. It was only disabled
for screen in a previous change, but we force it off
always to check if it clears the issue.

The speed feature disabled is only used for 3 spatial
layers and at least 2 temporal. The impact on speed is
expected to be small, ~2%, so ok to disable for now and
see if it clears the issue.

Bug: 366146260
Change-Id: If7af006425e1e0ef297b9d6466507ea4c90ddb6f
(cherry picked from commit 09b3d5fc5aa48752f95f4c0c37b0bd4ff55c0ba1)
2024-10-09 15:07:31 -04:00
Marco Paniconi cdd4e35015 vp8: Fix integer overflow in encode_frame_to_data_rate
Integer overflow in encode_frame_to_data_rate()
for the update:
lc->total_target_vs_actual += bits_off_for_this_layer

Fix is to use int64_t for total_target_vs_actual.

Bug: chromium:368114043
Change-Id: I9a01e1a69e26ae748e8ae23d9e1287431510388d
2024-09-20 09:50:41 -07:00
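As a rough illustration of the fix described above, the sketch below widens the accumulator to int64_t so repeated additions cannot wrap at INT_MAX. Only the two names from the commit message are real; the struct layout and the helper function are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the layer context. Only the field named in the
 * commit message is modeled; everything else is assumed. */
typedef struct {
  int64_t total_target_vs_actual; /* an int here could overflow */
} LayerContextSketch;

/* Adds one frame's bit surplus or deficit to the 64-bit running total. */
static void accumulate_bits_off(LayerContextSketch *lc,
                                int bits_off_for_this_layer) {
  assert(lc != NULL);
  lc->total_target_vs_actual += bits_off_for_this_layer;
}
```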
Wan-Teh Chang aa73610d03 Fix a typo: avg_frame_index => avg_frame_qindex
Change-Id: I8fd9f6f01ae712a9bf3dc9e34fe5f7115a305109
2024-09-19 00:09:36 -07:00
Marco Paniconi 417204d7fd rtc-vp9: Fix to integer overflow in vp9-svc
Divide by 3 instead of multiplying by 3, in the comparison of
lrc->avg_frame_bandwidth vs lrc->last_avg_frame_bandwidth,
in the two functions that reset rc.
Small loss in precision, so acceptable.

Similar change to:
https://chromium-review.googlesource.com/c/webm/libvpx/+/5698570

Bug: chromium:367892770
Change-Id: Ia9ef09a9f6beba930fedd496407cfa7057e39336
2024-09-18 14:33:49 -07:00
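The divide-instead-of-multiply idea can be sketched as follows. The function name and the exact threshold are assumptions; the real comparisons in libvpx are not reproduced here.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Overflow-prone form (for reference only):
 *   avg_frame_bandwidth > 3 * last_avg_frame_bandwidth
 * The product wraps once last_avg_frame_bandwidth exceeds INT_MAX / 3.
 * Dividing the other side avoids the overflow. Integer division truncates,
 * so the result can differ slightly from the exact comparison. */
static bool bandwidth_jumped(int avg_frame_bandwidth,
                             int last_avg_frame_bandwidth) {
  assert(avg_frame_bandwidth >= 0 && last_avg_frame_bandwidth >= 0);
  return avg_frame_bandwidth / 3 > last_avg_frame_bandwidth;
}
```

The precision loss shows up with small inputs: for 10 and 3 the exact test (10 > 9) is true, while the divided form (3 > 3) is false.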
James Zern ac68e7f999 aarch64_cpudetect: detect SVE/SVE2 on Windows
PF_ARM_SVE_INSTRUCTIONS_AVAILABLE and PF_ARM_SVE2_INSTRUCTIONS_AVAILABLE
are available in WinSDK 10.0.26100 and recent versions of mingw-w64.

Based on a patch by Martin Storsjö on ffmpeg-devel:
https://ffmpeg.org/pipermail/ffmpeg-devel/2024-September/333611.html

Change-Id: I34b2341a559f95aa400e84d709f3eb36da5dbb7b
2024-09-18 19:43:10 +00:00
James Zern 729b78a127 aarch64_cpudetect: detect I8MM on Windows via SVE-I8MM
There's no direct processor feature constant for I8MM alone, but
there is a flag for SVE-I8MM (added in WinSDK 10.0.26100 and
recent versions of mingw-w64). If SVE-I8MM is available, we can
assume that I8MM is available.

While HW supporting these features isn't yet commonly running
Windows, this at least allows detecting and running the I8MM codepaths
in Windows builds in Wine (possibly running in QEMU).

Based on patch from Martin Storsjö on ffmpeg-devel:
https://ffmpeg.org/pipermail/ffmpeg-devel/2024-September/333609.html

Change-Id: I77117bee8516924fddcdecccae8bab3cf5beed96
2024-09-18 19:43:10 +00:00
James Zern 6dfdc4ee10 tiny_ssim: fix argc check
The program requires a minimum of 2 parameters. Previously the tool
would crash if only one input file was given.

Bug: webm:365481206
Change-Id: I875d81b2db4fcc4338061c03b23bb51b0aad58e4
2024-09-18 19:06:17 +00:00
Marco Paniconi 696c488d35 rtc-vp9: Disable svc_use_low_part for screen
Possible fix for issue below. The speed feature disabled
is only used for 3 spatial layers and at least 2 temporal.
The impact on speed is expected to be small, ~2%, so ok
to disable for now and see if it clears the issue.

Bug: 366146260

Change-Id: I94ab991d583cc2ce758db337abbbb463a65f0767
2024-09-17 23:07:24 +00:00
Jerome Jiang c6de95ce0e Initialize gf_picture in vp9 tpl
Bug: b/365068397
Change-Id: Id267532928353148f916b73feb19de515db14cb9
2024-09-06 13:02:12 -04:00
James Zern 3ba1fada8b vpx_image.h: add lifetime note for img_data
The wrapped storage must exist for the duration of the vpx_image_t
allocation.

Bug: aomedia:363806063
Change-Id: Ic6b79a56b6c07776222d1767490d873d7408ced0
2024-09-03 19:19:09 -07:00
Marco Paniconi fbf63dff1f vp9: clamp the calculation of sb64_target_rate to INT_MAX
Bug: b/361617762
Change-Id: Ie7d2b0973e6de23d6e992ee058cbb94b826fda65
2024-08-28 14:48:57 +00:00
James Zern 507aea8e29 vp9_speed_features.h: fix partition_search_type comment
FIXED_SIZE_PARTITION -> FIXED_PARTITION

Change-Id: I5e6a561042d7dfa87d6f11b033052d340e433440
2024-08-27 17:38:43 -07:00
James Zern 50aa6cca4d README: add security report note
The default template for https://issues.webmproject.org/ is a public bug
report. Security issues can be reported securely using the 'Security
report' template.

Change-Id: Ic7144a6c7a144772b78852d1415a51a570c79d50
2024-08-26 15:30:01 -07:00
Wan-Teh Chang f00fa3ce74 Add macro name as comment for header guard #endif
Change-Id: I948f21f414fc269ad03673636506fa83acf5f5f6
2024-08-23 00:35:45 +00:00
Wan-Teh Chang 35b908f808 Add #ifndef header guard to vpx_version.h
Change-Id: Ief028037a3a56b1f18998298ad594a86cf906bd3
2024-08-22 14:40:16 -07:00
James Zern 2c778f4da6 remove vp9_{highbd_,}resize_frame*()
and examples/resize_util.c. These functions were added in:
  3cd37dfeb Adds a non-normative resize library to vp9 encoder
but never used meaningfully in the library.

This mirrors the change in libaom:
  d10029bb4b Restore function prototype of av1_resize_frame420
except that vp9_resize_frame420() was never exported in the shared
library, so can be deleted along with the rest.

The reasoning for removing examples/resize_util.c is the same: it is not
useful and examples should use the public functions of the libvpx
library.

Change-Id: I386080d3f1a3ef81dfc87fcdf5bbdf459d996f03
2024-08-21 19:22:48 -07:00
James Zern 312a9004c1 remove vpx_ssim_parms_16x16()
The last reference to this function was removed in:
5511968f2 Removed several unused functions.

Change-Id: I644482b4f0c9c4765035adcdc21ec495e3e3a6e6
2024-08-20 17:47:07 -07:00
Yunqing Wang a5ea71f091 Key frame temporal filtering
Added key frame temporal filtering. Enabled it for VOD encoding
with encoder speed < 2.
Minor improvement in prediction.
Added the restriction of using no more than "arnr_max_frames"
frames for temporal filtering.
Key frame temporal filtering is turned off by default for now. To
enable it, set "--enable-keyframe-filtering=1"

Borg result with "--enable-keyframe-filtering=1"
         avg_psnr:  ovr_psnr:   ssim:    vmaf:
hdres2:   -0.762     -0.863    -0.903   -0.680
midres2:  -0.813     -0.753    -0.757   -0.743
lowres2:  -0.492     -0.598    -0.737   -0.881
The impact on the encoder time is minimal.

Change-Id: If6abea3e21efcb96f1978cd9dfaa742c40dc2a56
2024-08-19 17:59:58 +00:00
Jerome Jiang 5d20cc3081 IWYU: include vp9_ext_ratectrl.h for tpl
Change-Id: Ia00dc7f79a69eb73c85fb409418861bef459e863
2024-08-19 11:05:38 -04:00
Jerome Jiang ee2552d903 vp9 ext rc: TPL & final encoding use same QP
Removed codec control VP9E_ENABLE_EXTERNAL_RC_TPL since it is
no longer needed.

Change-Id: I151254ff3f0496c017ddf73c2caf94783ef38f31
2024-08-16 17:16:29 -04:00
Jerome Jiang a69eeb0af2 ext rc: Override encode QP in TPL pass for VBR
Change-Id: I8f32b5847b57313d00401f5596ed62ac7c4817f0
2024-08-16 16:50:21 -04:00
Jerome Jiang d9d6c5e2c9 Remove ext_rc_recode
This flag is always set to 0

Change-Id: I228b3befae478517e7b31228d4a6553af4fd7a27
2024-08-16 13:54:02 -04:00
James Zern 95568674c2 remove redundant && __GNUC__ preproc check
`#if defined(__GNUC__)` is enough if a specific version isn't being
looked for.

Bug: aomedia:356832974
Change-Id: I3fcbecf9d547c6a2d89d7b5456e83ee08ddc6f5e
2024-08-16 16:43:45 +00:00
Yunqing Wang fcd1f39e56 Improve temporal filter prediction and process
Applied 12-tap filter to temporal filter prediction for better
result. Improved the calculation of frames to be used in temporal
filtering.

The overall PSNR gain was -0.511% (lowres), -0.338% (midres), and
-0.288% (hdres).
Encoder time was increased by ~2%, which would be largely reduced
by the following SIMD optimization.

Change-Id: If3ece30f1614beadc99ebf6b4dc3f2d988d3bdb9
2024-08-09 23:05:32 +00:00
Jerome Jiang 13be4a7190 Remove a stale TODO in ext RC
Change-Id: Ie871a476a7a0b04cf88db17da8402dad1c3247f7
2024-08-09 18:02:00 +00:00
Wan-Teh Chang b222d72285 Add the saturate_cast_double_to_int() function
Move the saturate_cast_double_to_int() function in
vp8/encoder/firstpass.c to vpx_dsp/vpx_dsp_common.h so that it can be
used in other files.

Change-Id: I748fea969520542dca68d7a46500d3272f22e16f
2024-08-08 11:42:03 -07:00
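A saturating double-to-int conversion of the kind named above might look like the sketch below. The body is an assumption written for illustration and is not copied from vpx_dsp/vpx_dsp_common.h; the `_sketch` suffix marks it as hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Converts a double to int, pinning out-of-range values to INT_MAX or
 * INT_MIN instead of invoking undefined behavior. NaN is not handled. */
static int saturate_cast_double_to_int_sketch(double d) {
  assert(d == d); /* reject NaN */
  if (d >= (double)INT_MAX) return INT_MAX;
  if (d <= (double)INT_MIN) return INT_MIN;
  return (int)d;
}
```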
Jerome Jiang c18b9f7c68 Add min/max q index to ext rc config
Change-Id: I5d152f3b0868e78c6b33fe651c6a40597b42feef
2024-08-08 10:24:29 -04:00
James Zern 634e1f8fb1 vp9_calc_iframe_target_size_one_pass_cbr: clamp final target
to INT_MAX. This matches calc_iframe_target_size() in VP8
(http://crbug.com/1473473). If rc->avg_frame_bandwidth is large even
small kf_boost values will overflow an int.

Change-Id: Iaca5b47fe97793ae70930b3b2c2f42725d2c96fb
2024-08-08 02:37:47 +00:00
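One way to clamp a key frame target to INT_MAX is to do the arithmetic in 64 bits and narrow only at the end. The formula below is an illustrative assumption, not the actual body of vp9_calc_iframe_target_size_one_pass_cbr(), and the function name is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Computes avg_frame_bandwidth * (16 + kf_boost) / 16 in 64 bits, then
 * clamps the result to INT_MAX before returning it as an int. */
static int iframe_target_sketch(int avg_frame_bandwidth, int kf_boost) {
  assert(avg_frame_bandwidth >= 0 && kf_boost >= 0);
  const int64_t target =
      ((int64_t)avg_frame_bandwidth * (16 + (int64_t)kf_boost)) >> 4;
  return target > INT_MAX ? INT_MAX : (int)target;
}
```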
James Zern bb95d3652b update libwebm to libwebm-1.0.0.31-10-g3b63004
This fixes a build error seen in gcc 15:
3b63004 mkvparser/mkvparser.cc: add missing <cstdint> include

Bug: aomedia:357622679
Change-Id: I6c4a1795d189f9993d4f2c5c9f0375912bc58f0c
2024-08-06 11:31:25 -07:00
Wan-Teh Chang 428f3104fa Include "gtest/gtest.h" using the shorter path
Rely on the -I or -isystem compiler option to find "gtest/gtest.h". This
makes it easier to build our tests against a copy of gtest outside the
libvpx source tree.

Bug: webm:42330726
Change-Id: I3b189c6345e13b36b236d1eedc6ee091bfa71f48
2024-08-02 22:42:20 +00:00
Jerome Jiang 1865f20e9a Extend border for vp8 loopfilter
Bug: webm:356482713
Change-Id: I149d077a57d55c46fe1924cff4c5cfcf5c7609b0
2024-08-02 14:59:58 -04:00
Wan-Teh Chang 9f06827eeb Run clang-format on three files
Change-Id: I055186d915d4660e848f6d856d7895953aaf76ba
2024-08-02 07:03:40 -07:00
James Zern 0c4af6b4c1 vpx_fdct16x16_avx2: add missing cast
Fixes:
vpx_dsp/x86/fwd_txfm_avx2.c:378:50: error: incompatible pointer types
  passing 'int16_t *' (aka 'short *') to parameter of type
  'tran_low_t *' (aka 'int *') [-Werror,-Wincompatible-pointer-types]

Change-Id: I9f50547c1fc885c24b4b91e4c7d6857d397cceed
2024-08-02 00:14:58 +00:00
James Zern b5451de5c5 vp9_extrc_update_encodeframe_result: normalize decl & def
Fixes compiler warning in visual studio after:
2ab292e9e Remove unused parameters from ext rc callback

vp9\encoder\vp9_ext_ratectrl.c(186): warning C4028: formal parameter 3
different from declaration

Change-Id: I4cfddb3f55fb7191ebaf578851ab3bc2c55106e3
2024-08-02 00:14:58 +00:00
James Zern 4295bf4f0f Update third_party/libwebm to commit f4b07ec
Change-Id: I18ff0e388d3c8b683385d98d76bff3e238488a94
2024-08-01 13:38:00 -07:00
Jerome Jiang 2ab292e9e1 Remove unused parameters from ext rc callback
Bug: b/356424505
Change-Id: I1c684e7f4cc9bb7b916354d391abd1ae168af39f
2024-07-31 22:03:08 +00:00
James Zern 3cc287bbd7 vpx_scale,scale1d_c: add assert(dest_scale != 0)
This fixes a 'division by zero' static analysis report (seen with
clang-14).

Bug: b:328632178
Change-Id: I4c051631ff1a948e8f83a831286e01fc50ff1c1d
2024-07-31 18:25:19 +00:00
James Zern 8db1b663e2 vp9_subexp,remap_prob: add an assert
Fixes a 'Result of operation is garbage or undefined' static analysis
report (seen with clang-14) related to left shifting a negative value.

Bug: b:328632178
Change-Id: I18f0100eca0deac1cac9be0c7e848685d2911fb3
2024-07-30 14:54:01 -07:00
James Zern f987e3514c doxygen: quiet warnings in decoder-only config
Fixes:
warning: explicit link request to 'VP9E_SET_EXTERNAL_RATE_CONTROL' could
not be resolved

Change-Id: If7a0d97412cc8fad3457031fbf29cb447635f4a0
2024-07-30 17:55:49 +00:00
James Zern 85d386599d systemdependent.c: fix warning w/CONFIG_MULTITHREAD=0
fixes:
vp8/common/generic/systemdependent.c: In function
   'vp8_machine_specific_config':
vp8/common/generic/systemdependent.c:63:46: warning: unused parameter
   'ctx' [-Wunused-parameter]
    63 | void vp8_machine_specific_config(VP8_COMMON *ctx) {

Change-Id: I0eeaa0c27ccfa901cc62150eed590f5056eb9238
2024-07-29 13:23:58 -07:00
James Zern cdf8da4c03 vp8: fix OOB access in x->MVcount
Motion vectors are now clamped in
vp8_find_best_sub_pixel_step_iteratively, vp8_find_best_sub_pixel_step,
vp8_find_best_half_pixel_step, vp8_full_search_sad,
vp8_refining_search_sadx4 and vp8_refining_search_sad_c (the rtcd for
other optimizations are redirects to vp8_refining_search_sadx4).

The difference of valid motion vectors may still go beyond the range of
the MVcount array, however, so additional checks are added to
rd_update_mvcount() and update_mvcount().

Note the test source and settings (speed 1 and GOOD quality mode) come
from the issue report; additional coverage is added for realtime. The
realtime path does not trigger the error without the fix, but as it's
similar to the rd path, the same clamp is done to be safe.

Fixes:
vp8/encoder/rdopt.c:1579:5: runtime error: index 17467 out of bounds for
  type 'unsigned int[2047]'

Bug: oss-fuzz:69906
Change-Id: Ia8bd087cfe4475ab09ba711ed806fbcbaa72e552
2024-07-25 15:08:02 -07:00
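The additional range check described above can be sketched as a guarded histogram update. The array size matches the 'unsigned int[2047]' in the sanitizer report; the macro and function names are illustrative stand-ins, and the index mapping is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MV_MAX_SKETCH 1023
#define MV_VALS_SKETCH (2 * MV_MAX_SKETCH + 1) /* 2047 bins */

/* Increments the bin for one motion vector difference. Differences that
 * would index outside the array are skipped. Returns true if counted. */
static bool count_mv_component(unsigned int counts[MV_VALS_SKETCH],
                               int mv_diff) {
  assert(counts != NULL);
  if (mv_diff < -MV_MAX_SKETCH || mv_diff > MV_MAX_SKETCH) return false;
  counts[MV_MAX_SKETCH + mv_diff]++;
  return true;
}
```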
James Zern f9120b789d vp8,calc_iframe_target_size: clamp kf_boost
cpi->output_framerate may be as large as 10M. Previously this would
cause kf_boost to be ~20M which would overflow an int when multiplied by
values in kf_boost_qadjustment[].

Fixes:
vp8/encoder/ratectrl.c:340:25: runtime error: signed integer overflow:
  19999984 * 220 cannot be represented in type 'int'

Bug: oss-fuzz:69100
Change-Id: I2d77c9d2912412f6265f6a8dc0e6b361b63b8242
2024-07-25 19:43:53 +00:00
Jingning Han d63ecb4117 Reset the ref_table array for the key frame GOP
Change-Id: Idda6ad9352d4c74dcbe8f2b6e1615d10e958e4c8
2024-07-24 16:02:39 -07:00
Jingning Han f809c987b5 Remove repeated ref_frame assignments
Change-Id: I0daa5a40489ce14582cb6a1c2816df354f1134f9
2024-07-24 16:01:24 -07:00
Bohan f96deb0bb4 Add tpl propagation with updated ref_frame idx
Change-Id: I6fcef44a90fc434e18447964aa1b4585c7f62310
2024-07-24 18:57:29 +00:00
Wan-Teh Chang 3fb0e5d75d Remove unneeded cpi->output_framerate assignment
The assignment "cpi->output_framerate = cpi->framerate;" after the
vp8_new_framerate() call is not needed, because vp8_new_framerate() sets
cpi->framerate and cpi->output_framerate to the same value.

Change-Id: I4de97b43957142d658e0c08ecfc6628844ce453a
2024-07-23 15:22:55 -07:00
Angie Chiang 057e53d759 Small refactoring in vp9_firstpass.c
Change-Id: If5e76b05f584650ff675363e6eb347bedae7728c
2024-07-19 21:38:08 +00:00
James Zern 9a1e8ae7aa README: add link to issue tracker
Change-Id: Ic8bc0167e5d1975e006135e20afacf27ee6badcf
2024-07-18 23:46:57 +00:00
James Zern efe615f804 add repro for crbug.com/352414650
+ fix an additional double -> int overflow warning (chrome's fuzzers do
  not have the float-cast-overflow sanitizer enabled)

Bug: chromium:352414650
Change-Id: I634bb421a74236eac434df138ed71dadf197596a
2024-07-18 13:36:10 -07:00
Marco Paniconi 3219f76cea Remove printf warning statements in set_size_literal()
Bug: b/347890801
Change-Id: I78c8dd0907d54f6cd1d3972ea6c3897f4b0c5adc
2024-07-15 11:33:26 -07:00
Wan-Teh Chang 72018e8c74 Some cleanup in vbr_rate_correction()
The only real change is in the initialization of frame_window. The (int)
cast is moved to the result of VPXMIN(), so that
cpi->twopass.total_stats.count - cpi->common.current_video_frame is
calculated in double.

Change-Id: Ia80f24614af7184b37cfdd99d8a8b1639460f273
2024-07-13 00:16:11 +00:00
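The effect of moving the cast can be sketched as below: the subtraction and the minimum are evaluated in double, and only the bounded result is narrowed to int. The window limit of 16 and all names are assumptions made for illustration.

```c
#include <assert.h>

#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))

/* total_count is a double, like twopass.total_stats.count. Casting after
 * the min means a very large remaining count never reaches the int cast. */
static int frame_window_sketch(double total_count,
                               unsigned int current_video_frame) {
  assert(total_count >= (double)current_video_frame);
  return (int)MIN_SKETCH(16.0, total_count - current_video_frame);
}
```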
James Zern 77974ec041 vp9_svc_adjust_avg_frame_qindex: fix int overflow
rc->avg_frame_bandwidth is capped at INT_MAX. Rather than multiply the
value by 3, divide projected_frame_size by 3 to avoid the overflow.
Without rounding this differs slightly from the original, but loss of
precision is acceptable in this case.

Bug: chromium:348440590
Change-Id: Id5960825c79d7c764d257e9b4bd0a1de751878d8
2024-07-11 17:34:57 -07:00
Wan-Teh Chang a40848c80f Do not include vpx_version.h
Replace the VERSION_STRING_NOSP macro by the public API function
vpx_codec_version_str().

Treat vpx_version.h as an absolutely internal header of the libvpx
library.

Change-Id: I86ba8548a62adae91ae7f5caad98169707f3fc64
2024-07-09 16:57:20 -07:00
Angie Chiang 1640ed4089 Turn off frame_stats == NULL error.
This change happens in define_gf_group().
Since this part is not critical for ext_ratectrl,
turn off the error reporting for now.

Change-Id: Ie74aa06a116edb8c5d9e7b29cadbd366232fbc1d
2024-07-09 13:35:00 -07:00
Wan-Teh Chang 066ea57e3d Fix unused function warnings in real-time only
The compare_fp_stats() and compare_fp_stats_md5() functions are not used
when CONFIG_REALTIME_ONLY is equal to 1. Define these functions only if
CONFIG_REALTIME_ONLY is 0 to avoid the -Wunused-function warnings.

Change-Id: Iaae208f67708cfaeee5304b0320ebce63c863f96
2024-07-08 14:31:23 -07:00
Jingning Han 7cc7bbba1f Allow TPL group to reference more frames
Allow the TPL group to use up to 3 reference frames from the
previous GOP. This slightly changes the coding stats in the range
of <0.1%.

STATS_CHANGED

Change-Id: Ieb4e948a783bf8ef9ca78717d56ff750f3f795a4
2024-07-08 17:02:15 +00:00
Wan-Teh Chang 4ac9c4ba32 Fix int cast errors in vp8 on max target bitrate
Fix double-to-int cast overflows in vp8 code caused by setting the
target bitrate to the maximum value (2000000).

Tested: Build libvpx with UndefinedBehaviorSanitizer and then run
./vpxenc husky.yuv -o AV1_husky_2000000_10000000_10000000.webm --good \
  --cpu-used=2 -v -t 0 -w 352 -h 288 --fps=10000000/10000000 \
  --target-bitrate=2000000 --limit=150 --test-decode=fatal --passes=2 \
  --lag-in-frames=25 --min-q=0 --max-q=63 --arnr-maxframes=7 \
  --arnr-strength=5 --kf-max-dist=9999 --undershoot-pct=100 \
  --overshoot-pct=100 --bias-pct=50 --codec=vp8

Note: This is essentially the VP8 version of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/191361.

Bug: 349440066
Change-Id: Ia43e1aad8fcab60ace49da960579081c2c3a5445
2024-07-03 17:09:52 +00:00
Wan-Teh Chang 27c39522f5 vpxenc.c: Fix UBSan integer errors in test_decode
Fix the following UBSan integer errors in test_decode():
vpxenc.c:1589:57: runtime error: implicit conversion from type 'int' of
value -16 (32-bit, signed) to type 'unsigned int' changed the value to
4294967280 (32-bit, unsigned)
vpxenc.c:1590:58: runtime error: implicit conversion from type 'int' of
value -16 (32-bit, signed) to type 'unsigned int' changed the value to
4294967280 (32-bit, unsigned)

Tested: Build libvpx with -fsanitize=integer and then run
./vpxenc husky.yuv -o AV1_husky_2000000_10000000_10000000.webm --good \
  --cpu-used=2 -v -t 0 -w 352 -h 288 --fps=10000000/10000000 \
  --target-bitrate=2000000 --limit=150 --test-decode=fatal --passes=2 \
  --lag-in-frames=25 --min-q=0 --max-q=63 --arnr-maxframes=7 \
  --arnr-strength=5 --kf-max-dist=9999 --undershoot-pct=100 \
  --overshoot-pct=100 --bias-pct=50 --codec=vp8

Bug: 349440066
Change-Id: Ice2f0e7176ffec664856559e2c02bd51113c4d74
2024-07-03 16:26:46 +00:00
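The reported conversion of -16 to 4294967280 is what happens when an int mask such as ~15 is combined with an unsigned value. Whether vpxenc.c used exactly this pattern is an assumption; the sketch below only shows how an unsigned mask avoids the implicit conversion, using a hypothetical helper name.

```c
#include <assert.h>
#include <limits.h>

/* Rounds v up to a multiple of 16. The mask ~15u is already unsigned, so no
 * negative int is implicitly converted to unsigned int. */
static unsigned int align16_sketch(unsigned int v) {
  assert(v <= UINT_MAX - 15u); /* keep v + 15 from wrapping */
  return (v + 15u) & ~15u;
}
```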
Wan-Teh Chang a396ac214d Fix unsigned int overflow in init_rate_histogram()
Tested: Build libvpx with -fsanitize=integer and then run
./vpxenc husky.yuv -o AV1_husky_2000000_10000000_10000000.webm --good \
  --cpu-used=2 -v -t 0 -w 352 -h 288 --fps=10000000/10000000 \
  --target-bitrate=2000000 --limit=150 --test-decode=fatal --passes=2 \
  --lag-in-frames=25 --min-q=0 --max-q=63 --min-gf-interval=4 \
  --max-gf-interval=22 --arnr-maxframes=7 --arnr-strength=5 \
  --kf-max-dist=9999 --aq-mode=0 --undershoot-pct=100 \
  --overshoot-pct=100 --bias-pct=50

This unsigned integer overflow seems to be caused by
g_timebase.num=1000000.

Note: This is a port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/191401.

Bug: 349440066
Change-Id: I924fa9c653400764dd7320938b88b4ea40f38172
2024-07-02 15:25:11 -07:00
Wan-Teh Chang af599a0c5f Fix further overflow issue in VBR.
This patch fixes some additional cases where under extreme conditions
some of the VBR adjustment variables can wrap.

As this happens on a per frame level the extra saturation checks should
not be an issue for performance.

Note: This CL is a port of the following libaom CLs:
https://aomedia-review.googlesource.com/c/aom/+/190521
https://aomedia-review.googlesource.com/c/aom/+/190888

Change-Id: I87c4ecca10f39767002f7d90d0f43b19c7150832
2024-06-28 21:29:04 -07:00
Wan-Teh Chang ac117ca7f9 Remove static from vars in parse_stream_params()
Those variables in parse_stream_params() don't need to be function-scope
static variables.

Change-Id: I5e0b0f78deb0aa8b4f95dcd2352d89342b9d528a
2024-06-27 17:30:01 -07:00
Angie Chiang 2693255a25 Let vp9_ext_ratectrl get key frame decision
BUG = b/347936295

Change-Id: Ic152d3f6873ebe02977c1ede6a5663bd5f9be363
2024-06-21 11:21:18 -07:00
Marco Paniconi 253d6365e3 rtc-vp9: Allow scene detection for all speeds
Current code was disallowing scene detection for
speeds >= 8, to avoid any encode_time increase
(see comment in the code).

But we can expect the cost to be small even at speed 8,9,
and that concern on encode_time was from some time ago
before 8 and 9 were further optimized. And this is
needed for content with scene changes (see issue attached).
So allow scene detection now for all RTC speed settings (speed >= 5).

Bug: b/346846607
Change-Id: I678dbb88ff1399ed89b2bf9770ae9427e3044fc4
2024-06-18 16:58:00 +00:00
James Zern f07ca82f7a set_analyzer_env.sh: remove -fno-strict-aliasing
The last reference to the flag in configure was removed in:
fad70a358 Remove -fno-strict-aliasing flag

The library should be expected to function without this flag; it's built
and tested elsewhere without it.

Bug: webm:570, webm:603
Change-Id: Icf85fd9bd5c9cb0c81d6eecf10fba07807f48b4a
2024-06-14 12:16:33 -07:00
James Zern d6ae3ea465 rtcd.pl: add license header to generated files
Bug: aomedia:3525
Change-Id: I614056558fb5439b448342e0c01e53bd8da85585
2024-06-13 11:54:43 -07:00
Angie Chiang 68deb7ee20 Add missing header in vp9_firstpass.c
Change-Id: I675fa2b74b567e47f2a8fe2a7e4b4d3e77880d13
2024-06-12 14:29:08 -07:00
Angie Chiang ff67a4f209 Fix typo of received again
Change-Id: I6df009ec0423c2ef244399107c968ae1255337e5
2024-06-12 14:17:00 -07:00
Angie Chiang 277a5cdaa4 Remove redundant setting of max_layer_depth.
Change-Id: Ide2b6852339471b8e82109c846ba24fe7dc94aaa
2024-06-12 12:30:41 -07:00
Angie Chiang 2ca6e875c3 Typo recieved -> received
Change-Id: I140b5c2a5cefc346b3961dad09fd145d85d44d17
2024-06-12 10:28:12 -07:00
James Zern fb01e53c98 configure: add -c to ASFLAGS for android + AS=clang
The GNU Assembler was removed in r24. clang's internal assembler works,
but `-c` is necessary to avoid linking.

Bug: webm:1856
Change-Id: I61f80cf78657d3b71d5e73c5b2510575533ca5ea
2024-06-11 22:55:22 +00:00
James Zern b0c9d0c6fe configure: remove unused NM & RANLIB variables
+ update list in README

Change-Id: I363e9bc36b2e160de43d0fbcba4700297a582549
2024-06-11 22:55:22 +00:00
Angie Chiang ed95b102c4 Move ext_rc_define_gf_group_structure
Move the function into define_gf_group().
define_gf_group() has a lot of settings that might cause
performance drop if skipped.

Imitate define_gf_group_structure()'s behavior, which adds
an extra overlay frame at the end of gf_group whenever
alt_ref is used.

After this change, we can feed the baseline decision through
webmrc and get the same result as baseline.

This CL is tested with city_cif.yuv using ffmpeg

BUG = b/345528565

Change-Id: Ib61f0a0a72251f8662fb4072e0cfd7f456a243b3
2024-06-11 20:19:04 +00:00
James Zern 271b3f0bf0 tiny_ssim: mv read error checks closer to assignment
Quiets some spurious -Wmaybe-uninitialized warnings with gcc 14.1.0.

In function 'calc_plane_error16',
    inlined from 'main' at ../tools/tiny_ssim.c:464:5:
../tools/tiny_ssim.c:37:12: warning: 'v[0]' may be used uninitialized
  [-Wmaybe-uninitialized]
   37 |   if (orig == NULL || recon == NULL) {
      |            ^
In function 'calc_plane_error16',
    inlined from 'main' at ../tools/tiny_ssim.c:462:5:
../tools/tiny_ssim.c:37:12: warning: 'u[0]' may be used uninitialized
  [-Wmaybe-uninitialized]
   37 |   if (orig == NULL || recon == NULL) {
      |            ^
In function 'calc_plane_error',
    inlined from 'main' at ../tools/tiny_ssim.c:461:5:
../tools/tiny_ssim.c:61:12: warning: 'y[0]' may be used uninitialized
  [-Wmaybe-uninitialized]
   61 |   if (orig == NULL || recon == NULL) {

To reduce confusion, read_input_file() is changed to return an int as
previously it would only return (size_t)-1/0/1 (and now returns 0/1).

Change-Id: I2344048ecc2bd233891ffcef08002ee98d6d262a
2024-06-10 16:05:06 -07:00
Matt Oliver c1cc9ebd58 project: Update for 1.14.1 merge. 2024-06-08 22:45:23 +10:00
Matt Oliver 68a5066df4 Merge commit '12f3a2ac603e8f10742105519e0cd03c3b8f71dd' 2024-06-08 20:02:59 +10:00
James Zern a2508b5711 configure: disable runtime cpu detect w/armv7*-darwin
The default behavior changed in:
148d1085f Refactor and extend run-time CPU feature detection on Arm

This fixes build errors with these targets as there is no runtime cpu
detection defined for them.

Change-Id: Ie6b0bae1fc3e244d7dfcc823f60c3e466ccade79
2024-06-07 10:59:16 -07:00
Wan-Teh Chang ec129c190a Document the internal maximum of rc_target_bitrate
Both VP8 and VP9 internally cap the target bitrate to the smaller of the
uncompressed bitrate and 1000000 kilobits per second.

Change-Id: I4008ce09b5e709e75111800341d015e41eb1da42
2024-06-05 17:13:48 -07:00
Wan-Teh Chang b401a1ff2e Remove unnecessary double cast for cpi->framerate
cpi->framerate is already of the double type.

Change-Id: Ia9211b699e25b1c603585a40370a1ed66e7cbf03
2024-06-05 22:57:48 +00:00
Marco Paniconi faf12bdb83 vp9: round for framerate and _min/max_gf_interval()
Fixes for the comments in:
https://chromium-review.googlesource.com/c/webm/libvpx/+/5598161

Change-Id: Ib7db69649c848098cd3f6e4a88233d333e84f628
2024-06-05 14:21:58 -07:00
James Zern 713e0faca0 vp9: round avg_frame_bandwidth result
in vp9_rc_update_framerate() and in functions in vp9_svc_layercontext.c.

This matches the code in VP8 and AV1 as discussed in
https://chromium-review.googlesource.com/c/webm/libvpx/+/5566050/2/vp8/encoder/onyx_if.c

Change-Id: I084f8002f8f6c8efffc511566910b3f3df47ba4e
2024-06-05 00:14:34 +00:00
Marco Paniconi 60807f0aba Use round for RC calculations in cyclic_refresh
Same as the fix in libaom:
https://aomedia-review.googlesource.com/c/aom/+/190881

Bug: aomedia:3579

Change-Id: Idb4026943a970189e6cd47a29e54e16623595e31
2024-06-04 14:50:58 -07:00
Angie Chiang 9d734db169 Rename gop_size by show_frame_count
Change-Id: Id95fbeaa0ceeb10c077bfd628f45fe880b42b3de
2024-06-03 07:35:49 -07:00
Wan-Teh Chang fd84dccd51 Fix high target data rate overflow.
This change fixes issues that can occur if the user specifies a very
high target data rate or rate per frame.

Fixes some issues with overflow of int variables used to hold bitrate
values (rate per second, rate per frame etc).

Note: This CL is a port of the following libaom CLs:
https://aomedia-review.googlesource.com/c/aom/+/190381
https://aomedia-review.googlesource.com/c/aom/+/190462

All the changes were ported to VP9. For VP8, only the new type of
cpi->bytes (equivalent to ppi->total_bytes in libaom) was ported.

Change-Id: I438dd46efd5a134389b893ffae1f8a2381207906
2024-05-31 16:00:25 -07:00
Jingning Han ffe9c9a457 Handle ARF and GF gop cases
Allow the inference scheme to cover GOPs with and without ARFs.

Change-Id: I68518791e96d7d5b92355c34360bbb74f2ecc436
2024-05-31 09:25:51 -07:00
Jingning Han ddf3c281e6 Remove a redundant condition in firstpass.c
Remove a redundant condition to trigger the ext_rc gop structure
function call.

Change-Id: Ia3f135c67982b5539b9a2e8a74ba13edd9b5e46f
2024-05-30 18:56:47 +00:00
Jerome Jiang b5ba2274a0 Merge tag 'v1.14.1' into main-merge-1.14.1
2024-05-21 v1.14.1 "Venetian Duck"

  This release includes enhancements and bug fixes.

  - Upgrading:
    This release is ABI compatible with the previous release.

  - Enhancement:
    Improved the detection of compiler support for AArch64 extensions,
    particularly SVE.

    Added vpx_codec_get_global_headers() support for VP9.

  - Bug fixes:

    Added buffer bounds checks to vpx_writer and vpx_write_bit_buffer.
    Fix to GetSegmentationData() crash in aq_mode=0 for RTC rate control.
    Fix to alloc for row_base_thresh_freq_fac.
    Free row mt memory before freeing cpi->tile_data.
    Fix to buffer alloc for vp9_bitstream_worker_data.
    Fix to VP8 race issue for multi-thread with psnr_calc.
    Fix to uv width/height in vp9_scale_and_extend_frame_ssse3.
    Fix to integer division by zero and overflow in calc_pframe_target_size().
    Fix to integer overflow in vpx_img_alloc() & vpx_img_wrap()(CVE-2024-5197).
    Fix to UBSan error in vp9_rc_update_framerate().
    Fix to UBSan errors in vp8_new_framerate().
    Fix to integer overflow in vp8 encodeframe.c.
    Handle EINTR from sem_wait().

Change-Id: Ic5e274fdc35c9141591a65e825bf012d2cca3caa
2024-05-30 11:35:52 -04:00
Jerome Jiang 12f3a2ac60 Update CHANGELOG
Bug: webm:1854
Change-Id: I3242d7fd58838aa8c4103ae07a67deb9dcc7dd37
2024-05-29 16:00:23 -04:00
Jerome Jiang be3ea68f9e Update CHANGELOG for fixes to ubsan errors
Bug: webm:1854
Change-Id: I81050a6a69721062078e818ca3ce23994749f711
2024-05-29 12:10:08 -04:00
Wan-Teh Chang 1dbb3b28e8 Fix some UBSan errors in vp8_new_framerate()
Fix some UBSan errors in the calculations of cpi->av_per_frame_bandwidth
and cpi->min_frame_bandwidth in vp8_new_framerate() and in the
calculation of cpi->per_frame_bandwidth in encode_frame_to_data_rate().

A port of the VP9 changes in
https://chromium-review.googlesource.com/c/webm/libvpx/+/4944271 and
https://chromium-review.googlesource.com/c/webm/libvpx/+/5565157 to VP8.
Similar to the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/190462.

Bug: aomedia:3509
Change-Id: I77b0e0b2f9fe667428daa9c4ceec0a35aafbfa81
(cherry picked from commit 25540b3c12)
2024-05-28 17:08:30 +00:00
Wan-Teh Chang c60622ebac Fix a UBSan error in vp9_rc_update_framerate()
Fix a UBSan error in the calculation of rc->min_frame_bandwidth in
vp9_rc_update_framerate().

A follow-up to
https://chromium-review.googlesource.com/c/webm/libvpx/+/4944271.
Similar to the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/190462.

Bug: aomedia:3509
Change-Id: I36168a6d00cd81e60ae19a7d74c21f2e6c2f0caf
(cherry picked from commit 1f65facb63)
2024-05-24 18:13:42 +00:00
Wan-Teh Chang 25540b3c12 Fix some UBSan errors in vp8_new_framerate()
Fix some UBSan errors in the calculations of cpi->av_per_frame_bandwidth
and cpi->min_frame_bandwidth in vp8_new_framerate() and in the
calculation of cpi->per_frame_bandwidth in encode_frame_to_data_rate().

A port of the VP9 changes in
https://chromium-review.googlesource.com/c/webm/libvpx/+/4944271 and
https://chromium-review.googlesource.com/c/webm/libvpx/+/5565157 to VP8.
Similar to the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/190462.

Bug: aomedia:3509
Change-Id: I77b0e0b2f9fe667428daa9c4ceec0a35aafbfa81
2024-05-23 18:05:54 -07:00
Wan-Teh Chang 495c4b596c Add a #endif comment for CONFIG_VP9_HIGHBITDEPTH
Change-Id: Idc388e722e2579ce6935b52b1786038bdf2d5d47
2024-05-23 23:50:28 +00:00
James Zern f3e064e1d8 {aarch*,arm}_cpudetect: align define with comment
ANDROID_USE_CPU_FEATURES_LIB -> VPX_USE_ANDROID_CPU_FEATURES

Change-Id: I2d425cf3cd28219e570efb0c442b33f1a64447ae
2024-05-23 23:48:22 +00:00
Wan-Teh Chang 1f65facb63 Fix a UBSan error in vp9_rc_update_framerate()
Fix a UBSan error in the calculation of rc->min_frame_bandwidth in
vp9_rc_update_framerate().

A follow-up to
https://chromium-review.googlesource.com/c/webm/libvpx/+/4944271.
Similar to the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/190462.

Bug: aomedia:3509
Change-Id: I36168a6d00cd81e60ae19a7d74c21f2e6c2f0caf
2024-05-23 14:38:50 -07:00
Wan-Teh Chang db4d6a5f54 Fix a typo in the CpuSpeedTest.TestTuneScreen test
Change the second rc_2pass_vbr_minsection_pct to
rc_2pass_vbr_maxsection_pct.

This copy-and-paste error was introduced in
https://chromium-review.googlesource.com/c/webm/libvpx/+/332653.

Change-Id: If2c61cd2ce0a6808643b8e80a27f054f7339e0fd
2024-05-22 16:04:40 -07:00
Angie Chiang 6c079c8beb Account for gop_decision->use_alt_ref
The change is in ext_rc_define_gf_group_structure()

Bug: b/339314081

Change-Id: I03576a0407105ced3e7cff4c33986e9a9a83b77f
2024-05-22 10:46:39 -07:00
Jerome Jiang 18da85657c Update AUTHOR, version and CHANGELOG
Bug: webm:1854
Change-Id: I0801e9b685d395c7556e2269601f4c01ab310661
2024-05-22 11:35:53 -04:00
Wan-Teh Chang 61c4d556bd Fix a bug in alloc_size for high bit depths
I introduced this bug in commit 2e32276:
https://chromium-review.googlesource.com/c/webm/libvpx/+/5446333

I changed the line

  stride_in_bytes = (fmt & VPX_IMG_FMT_HIGHBITDEPTH) ? s * 2 : s;

to three lines:

  s = (fmt & VPX_IMG_FMT_HIGHBITDEPTH) ? s * 2 : s;
  if (s > INT_MAX) goto fail;
  stride_in_bytes = (int)s;

But I didn't realize that `s` is used later in the calculation of
alloc_size.

As a quick fix, undo the effect of s * 2 for high bit depths after `s`
has been assigned to stride_in_bytes.

Bug: chromium:332382766
Change-Id: I53fbf405555645ab1d7254d31aadabe4f426be8c
(cherry picked from commit 74c70af016)
2024-05-21 18:43:46 +00:00
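The alloc_size bug described in commit 61c4d556bd above comes from reusing one variable for two different units. A minimal C sketch of one way to avoid that, assuming hypothetical names (row_layout, compute_row_layout) that are not taken from libvpx: the stride in samples is never modified, and the byte stride lives in its own variable, so a later size calculation cannot pick up a doubled value by accident. How alloc_size is derived here is a simplification, not the img_alloc_helper() logic.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration; not the libvpx img_alloc_helper() code. */
typedef struct {
  int stride_in_bytes; /* bytes per row, small enough for an int field */
  uint64_t alloc_size; /* bytes needed for `rows` rows of one plane */
} row_layout;

/* Returns 0 on success, -1 if the byte stride does not fit in an int. */
static int compute_row_layout(uint64_t stride_in_samples, unsigned int rows,
                              int high_bit_depth, row_layout *out) {
  uint64_t s_bytes;
  assert(out != NULL);
  /* Bounding the input first makes the multiplication below overflow-free. */
  if (stride_in_samples > (uint64_t)INT_MAX) return -1;
  /* The sample count stays untouched; only s_bytes carries the doubling. */
  s_bytes = high_bit_depth ? stride_in_samples * 2 : stride_in_samples;
  if (s_bytes > (uint64_t)INT_MAX) return -1;
  out->stride_in_bytes = (int)s_bytes;
  /* s_bytes <= INT_MAX and rows <= UINT_MAX, so the product fits in 64 bits. */
  out->alloc_size = s_bytes * rows;
  return 0;
}
```

Calling compute_row_layout(64, 10, 1, &layout) yields a 128-byte stride and a 1280-byte plane, while the caller's sample count of 64 remains available for any other calculation.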
Wan-Teh Chang 5193ce7167 Apply stride_align to byte count, not pixel count
A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/188962.

stride_align is documented to be the "alignment, in bytes, of each row
in the image (stride)."

Change-Id: I2184b50dc3607611f47719319fa5adb3adcef2fd
(cherry picked from commit 7d37ffacc6)
2024-05-21 18:43:07 +00:00
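The difference between aligning a byte count and aligning a pixel count (commit 5193ce7167 above) shows up as soon as a sample is wider than one byte. A small C sketch under stated assumptions: align_up and aligned_stride_in_bytes are hypothetical helper names, and the alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds value up to a multiple of align; align must be a power of two. */
static uint64_t align_up(uint64_t value, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & ~(align - 1);
}

/* Hypothetical helper: converts the row width to bytes first, then aligns,
 * so stride_align really is an alignment in bytes. */
static uint64_t aligned_stride_in_bytes(uint64_t width_in_samples,
                                        uint64_t bytes_per_sample,
                                        uint64_t stride_align) {
  return align_up(width_in_samples * bytes_per_sample, stride_align);
}
```

For a row of 17 two-byte samples and a 16-byte alignment this gives 48 bytes. Aligning the sample count first (17 rounded up to 32 samples, then doubled) would give 64 bytes, which is still aligned but larger than the documented meaning of stride_align requires.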
Wan-Teh Chang 5a83437ffc Avoid wasted calc of stride_in_bytes if !img_data
Change-Id: If1ddde5e894a06359f15486a2cee054a2f0cb1a2
(cherry picked from commit 8b2f8baee5)
2024-05-21 18:42:17 +00:00
Wan-Teh Chang 9d7054c0cb Avoid integer overflows in arithmetic operations
A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/188823.

Impose maximum values on the input parameters so that we can perform
arithmetic operations without worrying about overflows.

Also change the VpxImageTest.VpxImgAllocHugeWidth test to write to the
first and last samples in the first row of the Y plane, so that the test
will crash if there is unsigned integer overflow in the calculation of
stride_in_bytes.

Bug: chromium:332382766
Change-Id: I54cec6c9e26377abaa8a991042ba277ff70afdf3
(cherry picked from commit 06af417e79)
2024-05-21 18:30:51 +00:00
Wan-Teh Chang c5640e3300 Fix integer overflows in calc of stride_in_bytes
A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/188761.

Fix unsigned integer overflows in the calculation of stride_in_bytes in
img_alloc_helper() when d_w is huge.

Change the type of stride_in_bytes from unsigned int to int because it
will be assigned to img->stride[VPX_PLANE_Y], which is of the int type.

Test:
. ../libvpx/tools/set_analyzer_env.sh integer
../libvpx/configure --enable-debug --disable-optimizations
make -j
./test_libvpx --gtest_filter=VpxImageTest.VpxImgAllocHugeWidth

Bug: chromium:332382766
Change-Id: I3b39d78f61c7255e10cbf72ba2f4975425a05a82
(cherry picked from commit 2e32276277)
2024-05-21 18:30:29 +00:00
Wan-Teh Chang f60da3e3ea Add test/vpx_image_test.cc
Ported from test/aom_image_test.cc in libaom commit 04d6253.

Change-Id: I56478d0a5603cfb5b65e644add0918387ff69a00
(cherry picked from commit 3dbab0e664)
2024-05-21 18:29:47 +00:00
Wan-Teh Chang b9c4f1951f Make img_alloc_helper() fail on VPX_IMG_FMT_NONE
If fmt is VPX_IMG_FMT_NONE, currently img_alloc_helper() allocates a
single plane because VPX_IMG_FMT_NONE (0) is not a planar format (the
VPX_IMG_FMT_PLANAR bit is not set in VPX_IMG_FMT_NONE).

Although this seems correct, the problem is that most of the code in
libvpx assumes planar formats and is likely to dereference a null
pointer when it uses img->planes[1]. Also, VPX_IMG_FMT_NONE isn't really
a valid image format. So it is safer to make img_alloc_helper() fail if
fmt is VPX_IMG_FMT_NONE.

Change-Id: I05b47f4b5eceb631a02384b2cce1c2f6fdca8673
(cherry picked from commit d3a946de8c)
2024-05-21 18:28:09 +00:00
Marco Paniconi 5b4cfe88e4 vp9-rtc: Fix integer overflow in key frame target size
The integer overflow happens
in vp9_calc_iframe_target_size_one_pass_cbr(), when
calculating the target size for L1T3 encoding.

The input target bitrate (kbps) is very large, so it gets set
to INT_MAX (before being multiplied by 1000 to convert to bps),
and avg_frame_bandwidth is then set to (INT_MAX / lc->framerate),
which when multiplied by (16 + kf_boost) can exceed INT_MAX.
The fix is to cast the operands to int64_t and the final result to int.

Bug: chromium:340918567
Change-Id: Ic00094b22c1f12ca988c0cb1fcaed473e1f8ed2b
2024-05-16 11:51:47 -07:00
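The fix described in commit 5b4cfe88e4 above is the usual widen-then-narrow pattern. A minimal C sketch, assuming a hypothetical function name (scaled_target_size); the multiplication by (16 + kf_boost) comes from the message above, while the right shift by 4 and the clamp to INT_MAX before the narrowing cast are assumptions added for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical sketch; not the vp9_calc_iframe_target_size_one_pass_cbr()
 * code. Scales a per-frame bandwidth by (16 + kf_boost) / 16. */
static int scaled_target_size(int avg_frame_bandwidth, int kf_boost) {
  assert(avg_frame_bandwidth >= 0 && kf_boost >= 0);
  /* Widen both operands so neither the sum nor the product can overflow. */
  const int64_t target =
      ((16 + (int64_t)kf_boost) * (int64_t)avg_frame_bandwidth) >> 4;
  /* Assumption: clamp before narrowing so the cast is always in range. */
  return target > INT_MAX ? INT_MAX : (int)target;
}
```

With avg_frame_bandwidth = INT_MAX / 30 and kf_boost = 32, the intermediate product is about 3.4e9, which does not fit in a 32-bit int, yet the final result 214748364 does.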
Deepa K G ea0cd1a38d Fix error handling in vp9_pack_bitstream()
In a multi-threaded scenario, when the allocated bitstream
buffer is insufficient, the main thread
called 'longjmp' without waiting for the workers
to complete. With this patch, the main thread calls
'longjmp' only after joining the other worker threads.

This resolves the assertion failure as reported in
Bug: webm:1847

Bug: webm:1844

Change-Id: I399c76087b65e7b8d9a9fa4f12d784408243d648
(cherry picked from commit 611d9ba0a5)
2024-05-14 15:04:00 -04:00
Wan-Teh Chang 58955cf5f5 Perform bounds checks in vpx_write_bit_buffer
Add the `size` and `error` members to the vpx_write_bit_buffer struct.
Add the vpx_wb_init() and vpx_wb_has_error() functions.

Instances of the vpx_write_bit_buffer struct are only allocated in the
vp9_pack_bitstream() function. So vp9_pack_bitstream() is the only
function outside vpx_dsp/bitwriter_buffer.* that needs updating.

This CL completes the work of adding output buffer bounds checks to
vp9/encoder/vp9_bitstream.c.

Bug: webm:1844
Change-Id: I6b362be572852ee51d96023b35bfb334faada7e1
(cherry picked from commit d790001fd5)
2024-05-14 13:04:03 -04:00
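Commit 58955cf5f5 above adds `size` and `error` members plus init and has-error functions to the bit buffer writer. The general shape of such a bounds-checked writer can be sketched in C as follows. Every name here (sketch_bit_writer, sketch_wb_init, sketch_wb_has_error, sketch_wb_write_bit) is hypothetical, and the field layout is an assumption rather than the libvpx struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bit writer with size and error members. */
typedef struct {
  uint8_t *buf;
  size_t size;       /* capacity of buf in bytes */
  size_t bit_offset; /* position of the next bit to write */
  int error;         /* sticky: set once a write would overrun buf */
} sketch_bit_writer;

static void sketch_wb_init(sketch_bit_writer *wb, uint8_t *buf, size_t size) {
  assert(wb != NULL);
  wb->buf = buf;
  wb->size = size;
  wb->bit_offset = 0;
  wb->error = 0;
}

static int sketch_wb_has_error(const sketch_bit_writer *wb) {
  return wb->error;
}

/* Writes one bit, most significant bit first. A write past the end of the
 * buffer is dropped and recorded in wb->error instead of touching memory. */
static void sketch_wb_write_bit(sketch_bit_writer *wb, int bit) {
  const size_t byte = wb->bit_offset / 8;
  const unsigned int shift = 7u - (unsigned int)(wb->bit_offset % 8);
  if (wb->error) return;
  if (byte >= wb->size) {
    wb->error = 1;
    return;
  }
  if (shift == 7u) wb->buf[byte] = 0; /* first bit of a byte: clear it */
  if (bit) wb->buf[byte] |= (uint8_t)(1u << shift);
  wb->bit_offset++;
}
```

A sticky error flag lets the caller write a whole header without checking every call, then test once at the end and report the failure.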
Wan-Teh Chang 3bfd83a70c Perform bounds checks in vpx_writer
In the vpx_writer struct, change the buffer_end field to the size field.
Change vpx_stop_encode() to return true on success, false on failure
(output buffer full).

In write_compressed_header(), remove the assertion
assert(header_bc.pos <= 0xffff). The caller (vp9_pack_bitstream()) will
check that condition.

In vp9_pack_bitstream(), the variable "first_part_size" is renamed
"compressed_hdr_size".

Bug: webm:1844
Change-Id: I4ed6ab905a707ad44d875e53036d5a42523a65d0
(cherry picked from commit 73703c188b)
2024-05-14 13:04:03 -04:00
James Zern 34d3114348 vp9_pack_bitstream: remove a dead store
Fixes a static analysis warning:
Value stored to 'data_size' is never read

Bug: webm:1844
Change-Id: Ia27181b1051bb2c3a6bc4a4c2549df8b0525e889
(cherry picked from commit 9f73377821)
2024-05-14 13:04:03 -04:00
Wan-Teh Chang b1cb83ca01 Add the buffer_end field to the vpx_writer struct
The buffer_end field will allow bounds checking when vpx_writer writes
to the output buffer. This CL sets up the plumbing to pass the output
buffer size from vp9_pack_bitstream() to vpx_start_encode(), which
initializes the vpx_writer struct. vpx_writer doesn't use the output
buffer size in bounds checks yet, but the code in vp9_bitstream.c does.

Bug: webm:1844
Change-Id: I995e469ab453c02d740f54b46e0b08c7f2eb1a2e
(cherry picked from commit e387187438)
2024-05-14 10:20:16 -04:00
Wan-Teh Chang ac433759d1 Pass output buffer size to vp9_pack_bitstream()
Set up the plumbing to pass the size of the output buffer `dest` to
vp9_pack_bitstream(). The output buffer is the cx_data buffer in the
encoder_encode() function in vp9/vp9_cx_iface.c, and its size is
cx_data_sz.

In this CL vp9_pack_bitstream() ignores the `dest_size` parameter.

Bug: webm:1844
Change-Id: I53c80280143d409cf16f87c4d6deec3d9338aea3
(cherry picked from commit d48577579b)
2024-05-14 10:19:55 -04:00
Deepa K G 611d9ba0a5 Fix error handling in vp9_pack_bitstream()
In a multi-threaded scenario, when the allocated bitstream
buffer is insufficient, the main thread
called 'longjmp' without waiting for the workers
to complete. With this patch, the main thread calls
'longjmp' only after joining the other worker threads.

This resolves the assertion failure as reported in
Bug: webm:1847

Bug: webm:1844

Change-Id: I399c76087b65e7b8d9a9fa4f12d784408243d648
2024-05-14 01:14:15 +05:30
Wan-Teh Chang b1cf64c40b vpx_decoder.h: Change "size member" to "sz member"
That member of vpx_codec_stream_info_t is named "sz", not "size".

Change-Id: I6cc878709d9dae37b9911cf746ba248a06ec1b1a
2024-05-13 17:03:10 +00:00
Wan-Teh Chang 498097b15b vpx_dec_fuzzer.cc: Initialize stream_info.sz
stream_info.sz should be initialized to sizeof(stream_info).

Bug: oss-fuzz:68912
Change-Id: I0cc0fcdfc93b7188a834ee1896f0bb4cf8c32fa9
2024-05-13 16:59:39 +00:00
Angie Chiang 5913401ebb Add vp9_ratectrl.h header to vp9_firstpass.c
KF_STD/GF_ARF_STD are used in vp9_firstpass.c
and defined in vp9_ratectrl.h

Change-Id: I5a6e42faa23e5f50630926e336daef37055fd195
2024-05-13 01:07:24 +00:00
Wan-Teh Chang db25581967 Assert a vpx_img_set_rect call always succeeds
The vpx_img_set_rect() call at the end of img_alloc_helper() always
succeeds, so assert its return value is equal to 0.

A port of the changes to aom/src/aom_image.c in the libaom CLs
https://aomedia-review.googlesource.com/c/aom/+/90307 and
https://aomedia-review.googlesource.com/c/aom/+/190011.

Bug: webm:1850
Change-Id: I559820db245a596b4aed2042bfa7ebe7dd2d69b7
2024-05-10 23:58:23 +00:00
James Zern 1a3cd4922b vpx_dec_fuzzer: add vpx_codec_peek_stream_info coverage
Change-Id: I511539292cb8c2098c81f5fe3d711b9739482ffa
2024-05-09 17:19:33 -07:00
Jerome Jiang e934e35515 vp9 rc: also run tpl for GOPs without ARF
Tested with ffmpeg integration end to end test.

Bug: b/338393251
Change-Id: I4048036d35f8ab64c07305b838d091f765f64a8d
2024-05-09 15:19:35 -04:00
Hirokazu Honda bff1fe63ea vp9 rc: Fix GetSegmentationData() crash in aq_mode=0
cpi_->cyclic_refresh is nullptr if aq_mode is 0, in other words, the
rate controller runs in non adaptive quantization mode. This CL fixes
the crash in GetSegmentationData() in non aq mode.

Bug: b/259487065
Test: video encoding on ChromeOS

Change-Id: I503b30d15c697c8dd1da203b3c7361b91c428e87
(cherry picked from commit 1d007eafa3)
2024-05-08 19:07:00 -04:00
Marco Paniconi 08a79efb18 vp9: Fix to alloc for row_base_thresh_freq_fac
The issue happens in real-time nonrd pickmode, because of the
speed feature sf->adaptive_rd_thresh_row_mt, which is
enabled for speed >= 8, and for speed >= 7 with SVC only.

The issue occurs when the resolution (sb_rows) changes and
row_base_thresh_freq_fac needs to be re-allocated.

The fix is to add sb_rows to TileDataEnc and check whether
row_base_thresh_freq_fac needs a re-alloc.

Bug: b:331108922
Change-Id: I1a1ca94c14f343200c180725e4cb8d91d3c55b83
(cherry picked from commit 3f8f19372b)
2024-05-08 19:07:00 -04:00
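The approach in commit 08a79efb18 above, recording the row count a buffer was allocated for and re-allocating when it changes, can be sketched in C as below. All names (sketch_tile_data, row_thresh, sketch_ensure_row_thresh, kEntriesPerRow) are hypothetical, and the buffer shape is an assumption rather than the TileDataEnc layout.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-tile state; only the idea of remembering sb_rows next to
 * the buffer is taken from the commit message. */
typedef struct {
  int *row_thresh; /* kEntriesPerRow entries per superblock row */
  int sb_rows;     /* row count that row_thresh was allocated for */
} sketch_tile_data;

enum { kEntriesPerRow = 4 };

/* Makes sure row_thresh is sized for sb_rows. Keeps the existing buffer when
 * the row count is unchanged. Returns 0 on success, -1 if allocation fails. */
static int sketch_ensure_row_thresh(sketch_tile_data *tile, int sb_rows) {
  int *p;
  assert(tile != NULL && sb_rows > 0);
  if (tile->row_thresh != NULL && tile->sb_rows == sb_rows) return 0;
  /* The resolution changed (or nothing is allocated yet): start over. */
  free(tile->row_thresh);
  tile->row_thresh = NULL;
  tile->sb_rows = 0;
  p = calloc((size_t)sb_rows * kEntriesPerRow, sizeof(*p));
  if (p == NULL) return -1;
  tile->row_thresh = p;
  tile->sb_rows = sb_rows;
  return 0;
}
```

Without the stored sb_rows, a check that only tests whether the pointer is non-NULL would keep a buffer sized for the old resolution and index past its end after the frame grows.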
Wan-Teh Chang 669654dda0 Free row mt memory before freeing cpi->tile_data
In vp9_init_tile_data(), call vp9_row_mt_mem_dealloc(cpi) to free the
row mt memory in cpi->tile_data before freeing cpi->tile_data.

Bug: b:331086799, b:331108729
Change-Id: Idc79984ce7e0110e6858139b2ed286492a2e8622
(cherry picked from commit 34277e53ad)
2024-05-08 19:07:00 -04:00
James Zern 41fd571e7a encode_api_test.cc: assert encoder is initialized
Before proceeding with Encode(). This avoids some static analysis
warnings about uninitialized `cfg_` members.

Change-Id: Ib67b278d6706ab1034219e8c1ad9ba0c5b574ba8
(cherry picked from commit 108f5128e2)
2024-05-08 19:07:00 -04:00
James Zern 8433fe6393 vpx_ext_ratectrl.h,cosmetics: Correspondent -> Corresponds
+ add some doxygen autolinks

Change-Id: Ifceb7d9e89d31037d0b690b1d661cebcd6fa67b8
2024-05-08 14:07:50 -07:00
Wan-Teh Chang 1e5823f682 Handle EINTR from sem_wait()
sem_wait() may be interrupted by a signal and fail with EINTR:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/sem_wait.html

Retry the sem_wait() call if it fails with EINTR.

This finishes the fix started in
https://chromium-review.googlesource.com/c/webm/libvpx/+/5299569. As a
speculative fix, that CL fixed only the sem_wait(&cpi->h_event_end_lpf)
calls responsible for bug chromium:324459561. ClusterFuzz verified the
fix, so this CL extends it to the other sem_wait() calls.

Note that sem_wait() calls like the following do not need this fix,
because the while (1) loop retries the sem_wait() call if it fails:

  while (1) {
    if (vpx_atomic_load_acquire(&cpi->b_multi_threaded) == 0) break;

    if (sem_wait(&cpi->h_event_start_lpf) == 0) {
      ...
    }
  }

Bug: chromium:324459561
Change-Id: I0f0612616eee37fb3da68049e49b3e86927b5e24
(cherry picked from commit d4959f9825)
2024-05-07 18:29:13 +00:00
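The retry described in commit 1e5823f682 above can be sketched as a small C wrapper around the POSIX sem_wait() call. The wrapper name (sem_wait_retry_eintr) is hypothetical; the commit message does not say whether libvpx uses a helper or retries inline.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical wrapper: waits on sem and retries whenever a signal handler
 * interrupts the call. Returns 0 on success; on any failure other than
 * EINTR it returns -1 with errno left as sem_wait() set it. */
static int sem_wait_retry_eintr(sem_t *sem) {
  int ret;
  assert(sem != NULL);
  do {
    ret = sem_wait(sem);
  } while (ret != 0 && errno == EINTR);
  return ret;
}
```

Callers that already sit inside a loop which re-issues sem_wait() on any failure, like the while (1) loop quoted in the message, get the same effect without a wrapper.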
James Zern 108f5128e2 encode_api_test.cc: assert encoder is initialized
Before proceeding with Encode(). This avoids some static analysis
warnings about uninitialized `cfg_` members.

Change-Id: Ib67b278d6706ab1034219e8c1ad9ba0c5b574ba8
2024-05-03 22:00:56 +00:00
Yunqing Wang 314ee14b64 Fix a rare memory overflow bug
In very rare cases (e.g. encoding with very high bit rate), the
allocated token memory isn't enough, which causes a buffer overflow
and then an encoder failure. This is fixed by using the aligned
number of blocks while allocating this buffer.

BUG=b/328803779

Change-Id: I5437cce13398206bf9982d57f35d6f9da17b187f
2024-05-03 21:01:23 +00:00
James Zern 9f73377821 vp9_pack_bitstream: remove a dead store
Fixes a static analysis warning:
Value stored to 'data_size' is never read

Bug: webm:1844
Change-Id: Ia27181b1051bb2c3a6bc4a4c2549df8b0525e889
2024-05-03 20:59:31 +00:00
James Zern a0c4e53665 configure: Do more elaborate test of whether SVE can be compiled
This is a port of the change in libaom:
https://aomedia-review.googlesource.com/c/aom/+/189761
5ccdc66ab6 cpu.cmake: Do more elaborate test of whether SVE can be compiled

For Windows targets, Clang will successfully compile simpler
SVE functions, but if the function requires backing up and restoring
SVE registers (as part of the AAPCS calling convention), Clang
will fail to generate unwind data for this function, resulting
in an error.

This issue is tracked upstream in Clang in
https://github.com/llvm/llvm-project/issues/80009.

Check whether the compiler can compile such a function, and
disable SVE if it is unable to handle that case.

Change-Id: I8550248abd6a7876bd8ecf6ba66bc70518133566
(cherry picked from commit 35f0262c5e)
2024-05-03 17:46:53 +00:00
James Zern e44918bd4e VP9: add vpx_codec_get_global_headers() support
This returns the contents of CodecPrivate described in:
https://www.webmproject.org/docs/container/#vp9-codec-feature-metadata-codecprivate

The value for 4:2:0 is 1 (colocated) to match the default given for the
codec parameter string:
https://www.webmproject.org/vp9/mp4/#codecs-parameter-string

Bug: b:332052663
Change-Id: Ie50dd8d76e2d7389ac01bf4dbec801f9c8ea0e21
(cherry picked from commit 63b9c2c0e2)
2024-05-02 15:26:47 -07:00
James Zern 35f0262c5e configure: Do more elaborate test of whether SVE can be compiled
This is a port of the change in libaom:
https://aomedia-review.googlesource.com/c/aom/+/189761
5ccdc66ab6 cpu.cmake: Do more elaborate test of whether SVE can be compiled

For Windows targets, Clang will successfully compile simpler
SVE functions, but if the function requires backing up and restoring
SVE registers (as part of the AAPCS calling convention), Clang
will fail to generate unwind data for this function, resulting
in an error.

This issue is tracked upstream in Clang in
https://github.com/llvm/llvm-project/issues/80009.

Check whether the compiler can compile such a function, and
disable SVE if it is unable to handle that case.

Change-Id: I8550248abd6a7876bd8ecf6ba66bc70518133566
2024-05-02 15:21:07 -07:00
James Zern 3e713e39ae vp9_ethread_test: move 'best' mode to a Large test
This mode is used infrequently and is quite slow. This shifts the tests
to nightly to speed up the presubmit.

Change-Id: I3020887e0ca0150d7cbea9cc726649c11f94d56c
2024-05-02 22:20:00 +00:00
Angie Chiang 6db3f6e576 Add several utility functions to set gf_group
Use the utility functions and set gf_group_size in
ext_rc_define_gf_group_structure()

Avoid using gop_decision->update_type to keep the logic simple
for now.

Also simplify the interface.

Change-Id: I78fd5892e6f9731d50d6e5da97598b46c70a1dde
2024-05-02 21:41:30 +00:00
Wan-Teh Chang f65aff7b99 Remove vpx_ports/msvc.h
The vpx_ports/msvc.h header provides snprintf() and round() for MSVC
older than Visual Studio 2015 and Visual Studio 2013, respectively.

Since configure now requires vs14 (Visual Studio 2015) or later, it is
safe to remove vpx_ports/msvc.h.

Change-Id: I2fe4c41eaa126f4cf17639c11895f1e464294c76
2024-05-02 20:01:11 +00:00
James Zern 8372a5cfe1 vpx_ext_ratectrl.h: make rate_ctrl_log_path const
Change-Id: I499d77b25ca3dcdbd3c72fb319f9023e9a2823b0
2024-05-02 09:59:34 -07:00
Jerome Jiang 847b3548b4 Better format comments for vpx_ext_ratectrl.h
For vpx_rc_type_t: comment for each enum is moved to where it is
defined.

Change-Id: Ic1e2097ed381e7d71746792e0d517106db882685
2024-05-02 10:01:00 -04:00
Jerome Jiang 1c77f7fc0e Fix comments in vpx_ext_ratectrl.h
Added file level descriptor

Added comments for vpx_rc_ref_frame_t

Change-Id: Ifb000650821eab719b6e0fd003a00027ea132b9f
2024-05-02 10:01:00 -04:00
Wan-Teh Chang c0db981eaa Include <stdio.h> or <cstdio> for *printf()
Change-Id: Ifc0537fe5ae1223418fb68da5583cc72ae2c32a8
2024-05-02 02:26:16 +00:00
James Zern e9be4f607b encode_api_test.cc: apply iwyu
add missing <cstdio> and <cstdlib> and delete some unused headers.

Change-Id: I6c66368f557e6df896bffb2aa90228811f14f027
2024-05-02 02:25:43 +00:00
James Zern 7a0089dc08 vpx_ext_ratectrl.h: fix doxygen comments
fixes a few warnings about undocumented members update_type,
update_ref_index and ref_frame_list.

Change-Id: I668c61f6a511ba9e6c0907f6dafb0be614678e60
2024-05-01 13:24:59 -07:00
Angie Chiang f93e6aa333 Print gop_index in ENCODE_FRAME_RESULT
Change-Id: Icb522110dd2a7f87212ec0e7fc2638245008365f
2024-05-01 18:26:02 +00:00
James Zern b61b272208 vp9_rdopt.c: make init_frame_mv static
fixes a -Wmissing-prototypes warning

Change-Id: Ie380f9e4211ffab461f15dfe84184b8769d4f7bd
2024-04-26 12:54:08 -07:00
James Zern 63b9c2c0e2 VP9: add vpx_codec_get_global_headers() support
This returns the contents of CodecPrivate described in:
https://www.webmproject.org/docs/container/#vp9-codec-feature-metadata-codecprivate

The value for 4:2:0 is 1 (colocated) to match the default given for the
codec parameter string:
https://www.webmproject.org/vp9/mp4/#codecs-parameter-string

Bug: b:332052663
Change-Id: Ie50dd8d76e2d7389ac01bf4dbec801f9c8ea0e21
2024-04-25 15:20:18 -07:00
Angie Chiang 3015c41f06 Add VPX_RC_NONE
Change-Id: I8ca4caa7ffc4e9f8590ad8d02de0348b88c45254
2024-04-19 22:38:33 +00:00
James Zern 6f5839f986 vp9_encoder.c: fix printf format string
Replace %ld with %zu for `size_t`. Added in:
fd28f6f3c Add rate_ctrl_log_path

Fixes:
vp9\encoder\vp9_encoder.c(5748,15): warning C4477: 'fprintf' : format
  string '%ld' requires an argument of type 'long', but variadic
  argument 2 has type 'size_t'

Change-Id: I36fa9c7a9e14d4a2d9ef51a7f5c55de71bb34518
2024-04-19 10:55:20 -07:00
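The warning fixed in commit 6f5839f986 above arises because %ld expects a long, and long is 32 bits on 64-bit Windows while size_t is 64 bits. A short C sketch of the portable form, using a hypothetical helper name (format_size) and snprintf instead of fprintf so the result can be inspected:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats a byte count into out and returns the value
 * snprintf() returned (the length the full text needs). */
static int format_size(char *out, size_t out_size, size_t value) {
  assert(out != NULL);
  /* The z length modifier makes %u match size_t on every platform. */
  return snprintf(out, out_size, "size=%zu", value);
}
```

The z length modifier is part of C99, so no cast is needed at the call site.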
James Zern 2b88a07bc9 vpx_image_test.cc: add missing stdint include
fixes clang-tidy warning:
no header providing "uint16_t" is directly included

Change-Id: Ic71045ce6f88659ecd22243d473a3b6dc8c827dd
2024-04-18 12:50:01 -07:00
Angie Chiang fd28f6f3cc Add rate_ctrl_log_path
Change-Id: I4dc25c9ce4103cf3de44cff4d63e8ff8c82f35c0
2024-04-17 19:43:37 -07:00
Jerome Jiang 85dafa9c61 Initialize frame_mv in rd pick inter
Bug: b/334626386
Change-Id: Ie480a08f09c1b212b4163a5f6eb191c35510236f
2024-04-16 14:51:27 -04:00
Wan-Teh Chang 976134c50d Add 10 and 12b ranges to vpx_color_range_t comment
Add note about undefined behavior in vpx_codec_encode() description.

A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/158001
by Yannis Guyon <yguyon@google.com>.

Bug: webm:1850
Change-Id: Ia90f0bfd8265e35e9f33c17400c1c065d7915b77
2024-04-13 03:20:57 +00:00
Wan-Teh Chang 89efe85cd4 Clarify comment about buf_align in vpx_img_wrap.
If img_data is not NULL, img_alloc_helper ignores buf_align, so
vpx_img_wrap can set buf_align to any placeholder value.

A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/90362.

Bug: webm:1850
Change-Id: I42bc45aecf822a9314caf23058fe123d0574dc20
2024-04-13 03:19:57 +00:00
Wan-Teh Chang 3f4055b05b Introduce local vars uv_x,uv_y in vpx_img_set_rect
Port the changes to aom/src/aom_image.c in the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/56643. The changes
related to `border` are not ported.

Bug: webm:1850
Change-Id: Ie81fffe0c84e912da880ffca245ae27cd71cf348
2024-04-13 03:19:00 +00:00
Wan-Teh Chang 74c70af016 Fix a bug in alloc_size for high bit depths
I introduced this bug in commit 2e32276:
https://chromium-review.googlesource.com/c/webm/libvpx/+/5446333

I changed the line

  stride_in_bytes = (fmt & VPX_IMG_FMT_HIGHBITDEPTH) ? s * 2 : s;

to three lines:

  s = (fmt & VPX_IMG_FMT_HIGHBITDEPTH) ? s * 2 : s;
  if (s > INT_MAX) goto fail;
  stride_in_bytes = (int)s;

But I didn't realize that `s` is used later in the calculation of
alloc_size.

As a quick fix, undo the effect of s * 2 for high bit depths after `s`
has been assigned to stride_in_bytes.

Bug: chromium:332382766
Change-Id: I53fbf405555645ab1d7254d31aadabe4f426be8c
2024-04-12 15:48:04 -07:00
Wan-Teh Chang 7d37ffacc6 Apply stride_align to byte count, not pixel count
A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/188962.

stride_align is documented to be the "alignment, in bytes, of each row
in the image (stride)."

Change-Id: I2184b50dc3607611f47719319fa5adb3adcef2fd
2024-04-11 16:46:13 -07:00
Wan-Teh Chang 8b2f8baee5 Avoid wasted calc of stride_in_bytes if !img_data
Change-Id: If1ddde5e894a06359f15486a2cee054a2f0cb1a2
2024-04-11 15:59:44 -07:00
Wan-Teh Chang 06af417e79 Avoid integer overflows in arithmetic operations
A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/188823.

Impose maximum values on the input parameters so that we can perform
arithmetic operations without worrying about overflows.

Also change the VpxImageTest.VpxImgAllocHugeWidth test to write to the
first and last samples in the first row of the Y plane, so that the test
will crash if there is unsigned integer overflow in the calculation of
stride_in_bytes.

Bug: chromium:332382766
Change-Id: I54cec6c9e26377abaa8a991042ba277ff70afdf3
2024-04-11 10:29:38 -07:00
Wan-Teh Chang 2e32276277 Fix integer overflows in calc of stride_in_bytes
A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/188761.

Fix unsigned integer overflows in the calculation of stride_in_bytes in
img_alloc_helper() when d_w is huge.

Change the type of stride_in_bytes from unsigned int to int because it
will be assigned to img->stride[VPX_PLANE_Y], which is of the int type.

Test:
. ../libvpx/tools/set_analyzer_env.sh integer
../libvpx/configure --enable-debug --disable-optimizations
make -j
./test_libvpx --gtest_filter=VpxImageTest.VpxImgAllocHugeWidth

Bug: chromium:332382766
Change-Id: I3b39d78f61c7255e10cbf72ba2f4975425a05a82
2024-04-10 20:47:45 -07:00
Wan-Teh Chang 3dbab0e664 Add test/vpx_image_test.cc
Ported from test/aom_image_test.cc in libaom commit 04d6253.

Change-Id: I56478d0a5603cfb5b65e644add0918387ff69a00
2024-04-10 18:15:00 -07:00
Matt Oliver f4d13145a2 project: Update for 1.14.0 merge. 2024-04-06 22:50:00 +11:00
Matt Oliver 9e260c493d Merge commit '602e2e8979d111b02c959470da5322797dd96a19' 2024-04-06 22:13:57 +11:00
Wan-Teh Chang 8762f5efb2 Define the MAX_NUM_THREADS macro in vp9_ethread.h
The MAX_NUM_THREADS macro is unrelated to the VPxWorkerInterface, so it
doesn't need to be defined in vpx_util/vpx_thread.h.

The VP8 code doesn't seem to depend on MAX_NUM_THREADS, so VP8 can use
64 directly in the range check of its g_threads option. Move the
definition of the MAX_NUM_THREADS macro to vp9/encoder/vp9_ethread.h and
use it in VP9 code only.

Change-Id: Ibf788ca2496c743a2ac0498fefaab8a3c181228d
2024-04-04 20:27:32 +00:00
Chun-Min Chang 0752960c6a Add missing header for EBUSY on mingw
The `error: use of undeclared identifier 'EBUSY'` in
vpx_util/vpx_pthread.h was found in Mozilla's bug 1886318 [1]. This
patch addresses the issue by adding the `<errno.h>` header to introduce
the `EBUSY` identifier, resolving the problem.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1886318#c1

Change-Id: Ic417dafebf5ab160060dd29f692fa9c40d8db05a
2024-04-04 10:10:37 -07:00
Wan-Teh Chang 6445da1b40 Fix GCC -Wmissing-braces warnings
warning: missing braces around initializer [-Wmissing-braces]

Bug: webm:1846
Change-Id: I007a68d09f48d4199ecd948136e69f9cf5f219f5
2024-04-04 00:03:35 +00:00
Casey Smalley 2bafeadd3e Add missing configuration includes
The Google C++ Style Guide says that you should "include what you
use" with respect to symbols. This CL adds vpx_config.h includes to unit
tests that rely on config flags but previously got the header only indirectly.

Change-Id: Ia70a512cebe6c104d2d64afbed3cde8a405c68df
2024-04-03 20:20:44 +00:00
Casey Smalley 588beb020b Unit test config changes for Chromium
This CL will help run libvpx tests under Chromium against its partition
allocator. The allocator does not support single allocations above
3.998GiB. Because of this, tests related to large video sizes that
Chromium is configured for are expected to fail.

Chromium also only supports the CONFIG_REALTIME_ONLY option, so
some changes are scoped behind this flag.

Change-Id: I80e8743c0619ce502688109ce0be01cb252d5f92
2024-04-03 20:20:44 +00:00
Wan-Teh Chang 05a4c855be Compare ctx->pending_cx_data with NULL
ctx->pending_cx_data is a pointer. It looks nicer to compare
ctx->pending_cx_data with NULL than with 0.

Change-Id: I18815907b3d75551abfc603cb3c5c0297dceed23
2024-04-03 02:53:45 +00:00
Hirokazu Honda 1d007eafa3 vp9 rc: Fix GetSegmentationData() crash in aq_mode=0
cpi_->cyclic_refresh is nullptr if aq_mode is 0, in other words, the
rate controller runs in non adaptive quantization mode. This CL fixes
the crash in GetSegmentationData() in non aq mode.

Bug: b/259487065
Test: video encoding on ChromeOS

Change-Id: I503b30d15c697c8dd1da203b3c7361b91c428e87
2024-04-02 23:00:50 +00:00
Wan-Teh Chang 976cedd643 Set priv->cx_data_sz to 0 if cx_data alloc fails
Change-Id: I6553cd7b09270b4d60ccd7199d499e03c22b3936
2024-04-02 22:53:17 +00:00
Wan-Teh Chang 419f36e8ed encoder_encode: Assert pending_cx_data_sz is valid
In encoder_encode(), assert ctx->pending_cx_data_sz is not too big
before the memmove() call.

Change-Id: Icd1e95f6d751b0bf67386d0d99218b256bc91ebd
2024-04-02 21:52:03 +00:00
Wan-Teh Chang dc74cf923b Dont use VPX_CODEC_CORRUPT_FRAME in set_frame_size
VPX_CODEC_CORRUPT_FRAME is a decoder error. It is strange for
vpx_codec_encode() to fail with this error. In set_frame_size(), change
VPX_CODEC_CORRUPT_FRAME to VPX_CODEC_ERROR.

The use of VPX_CODEC_CORRUPT_FRAME was originally added in
commit 1ed56a46b3.

Change-Id: Iee92ed4cfca5061289b278ece2ba475cf98fec06
2024-04-02 17:56:48 +00:00
Gerda Zsejke More bf932674a8 Add SVE2 implementation of vpx_highbd_convolve8_avg
Add SVE2 implementation of vpx_highbd_convolve8_avg function and the
corresponding tests as well.

Change-Id: I2ff707da55d11b1d5376eb0a7ec85c343a2709c2
2024-04-02 13:32:51 +02:00
Gerda Zsejke More 9274c2bbf0 Merge horiz. and vert. passes in HBD SVE2 2D 4tap convolution
The current SVE2 approach to 2D convolution is:
1) Filter horizontally, storing to an intermediate buffer.
2) Filter vertically and store the final output.

This patch merges the two phases for high bitdepth 2D convolution for
filter sizes smaller than or equal to 4 to avoid storing to and
re-loading from the intermediate buffer.

This approach is not beneficial when applying an 8tap filter in the
convolution.

Change-Id: Ie090eb79f1cbf182300d9343ae63069396ef3956
2024-04-02 13:29:48 +02:00
Jingning Han 43d12d5079 Update yv12_mb initialization
BUG=webm:1846

Change-Id: If8475d46397f04ef769f3e4647de5c2d4b6760a4
2024-03-30 00:02:02 +00:00
Jerome Jiang 5396643be6 Add invalid value to gop decision enums
These invalid value definitions are necessary to initialize
the gop decision in external RC so libvpx can tell which is populated
and which is not.

Bug: b/329483680
Change-Id: I06bbb41fa59d0fb95296aebd0d05a703ec953b81
2024-03-29 21:20:42 +00:00
Wan-Teh Chang 5f5dfb3303 Assert the return value of read_tx_mode() is < 5
Coverity somehow thinks the return value of read_tx_mode() is between 0
and 7 (inclusive).

Hopefully this will fix Coverity CID 1584457: Out-of-bounds access in
read_coef_probs().

Change-Id: I49fbddf6fd6861bc9def9dfa91eaaaa4aefe5710
2024-03-29 11:22:41 -07:00
Jingning Han ccefddef33 Initialize yv12_mb array
This array will be partially configured and used in later rate
distortion optimization search.

BUG=webm:1846

Change-Id: I83daba341c56767187031edb1c10d4528a4257a3
2024-03-29 09:10:23 -07:00
Wan-Teh Chang d790001fd5 Perform bounds checks in vpx_write_bit_buffer
Add the `size` and `error` members to the vpx_write_bit_buffer struct.
Add the vpx_wb_init() and vpx_wb_has_error() functions.

Instances of the vpx_write_bit_buffer struct are only allocated in the
vp9_pack_bitstream() function. So vp9_pack_bitstream() is the only
function outside vpx_dsp/bitwriter_buffer.* that needs updating.

This CL completes the work of adding output buffer bounds checks to
vp9/encoder/vp9_bitstream.c.

Bug: webm:1844
Change-Id: I6b362be572852ee51d96023b35bfb334faada7e1
2024-03-28 14:05:47 -07:00
Jerome Jiang d5501945fc vp9 rc: override GF_GROUP decisions using ext RC
Bug: b/329483680
Change-Id: I2e02673f1bca56bfa24545b4e25d5e3fd3b0e863
2024-03-28 17:33:54 +00:00
Marco Paniconi 3f8f19372b vp9: Fix to alloc for row_base_thresh_freq_fac
Issue happens for real-time nonrd pickmode.
Due to speed feature: sf->adaptive_rd_thresh_row_mt,
enabled for speed >= 8, and for speed >= 7 svc only.

Issue occurs where resolution (sb_rows) changes and
row_base_thresh_freq_fac needs to be re-allocated.

Fix is to add sb_rows to TileDataEnc and check for
re-alloc of row_base_thresh_freq_fac.

Bug: b:331108922
Change-Id: I1a1ca94c14f343200c180725e4cb8d91d3c55b83
2024-03-28 16:47:45 +00:00
Wan-Teh Chang 73703c188b Perform bounds checks in vpx_writer
In the vpx_writer struct, change the buffer_end field to the size field.
Change vpx_stop_encode() to return true on success, false on failure
(output buffer full).

In write_compressed_header(), remove the assertion
assert(header_bc.pos <= 0xffff). The caller (vp9_pack_bitstream()) will
check that condition.

In vp9_pack_bitstream(), the variable "first_part_size" is renamed
"compressed_hdr_size".

Bug: webm:1844
Change-Id: I4ed6ab905a707ad44d875e53036d5a42523a65d0
2024-03-27 09:07:03 -07:00
Wan-Teh Chang 5bea4606dd Fix a typo in comment: "it" -> "is"
Change-Id: I5d36c5198c67cbb2f424901ec045d0620fea2f04
2024-03-27 14:48:55 +00:00
Wan-Teh Chang 34277e53ad Free row mt memory before freeing cpi->tile_data
In vp9_init_tile_data(), call vp9_row_mt_mem_dealloc(cpi) to free the
row mt memory in cpi->tile_data before freeing cpi->tile_data.

Bug: b:331086799, b:331108729
Change-Id: Idc79984ce7e0110e6858139b2ed286492a2e8622
2024-03-26 23:38:09 +00:00
Marco Paniconi c84d23c6cd vp9: fix to integer overflow test
failure for the 16k test: issue introduced
in: c29e637283

Bug: b/329088759, b/329674887, b/329179808

Change-Id: I88e8a36b7f13223997c3006c84aec9cfa48c0bcf
(cherry picked from commit 19832b1702)
2024-03-26 19:22:17 +00:00
Marco Paniconi b6847dcf72 Fix to buffer alloc for vp9_bitstream_worker_data
The code was using bitstream_worker_data when it had
not been allocated with a large enough size. This is because
the existing condition was to only re-alloc the
bitstream_worker_data when current dest_size was larger
than the current frame_size. But under resolution change
where frame_size is increased, beyond the current dest_size,
we need to allow re-alloc to the new size.

The existing condition to re-alloc when dest_size is
larger than frame_size (which is not required) is kept
for now.

Also increase the dest_size to account for image format.

Added tests, for both ROW_MT=0 and 1, that reproduce
the failures in the bugs below.

Note: this issue only affects the REALTIME encoding path.

Bug: b/329088759, b/329674887, b/329179808

Change-Id: Icd65dbc5317120304d803f648d4bd9405710db6f
(cherry picked from commit c29e637283)
2024-03-26 19:21:31 +00:00
Wan-Teh Chang 08781b2e51 Add high bit depths, 4:2:2, 4:4:4 to VP9Encoder
A port of the changes to vp9_encoder_fuzz_test.cc in
https://chromium-review.googlesource.com/c/chromium/src/+/5292940.

Change-Id: Ie143ffd9cffbd6a8639812c72e85c9a017aa554e
(cherry picked from commit 8c36d36bcc)
2024-03-26 19:20:29 +00:00
Gerda Zsejke More d2ba3a22b4 Add 2D-specific highbd SVE2 horizontal convolution function
2D 8-tap convolution filtering is performed in two passes -
horizontal and vertical. The horizontal pass must produce enough
input data for the subsequent vertical pass - 3 rows above and 4 rows
below, in addition to the actual block height.

At present, all highbd SVE horizontal convolution algorithms process
4 rows at a time, but this means we end up doing at least 1 row too
much work in the 2D first pass case where we need h + 7, not h + 8
rows of output.

This patch adds an additional SVE2 path that processes h + 7 rows of
data exactly, saving the work of the unnecessary extra row.

Change-Id: I2f5d39ad737dbd7eccb08dd2b51586c6710119b8
2024-03-26 19:00:26 +00:00
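The row arithmetic in d2ba3a22b4 can be checked with a small sketch. The helper names are hypothetical; the only facts taken from the commit message are that the 2D first pass needs h + 7 rows and that the existing paths process 4 rows at a time.

```c
#include <assert.h>

/* Rows the horizontal pass must produce for an 8-tap vertical pass:
 * 3 rows above, 4 rows below, plus the block height. */
static int demo_rows_exact(int h) {
  assert(h > 0);
  return h + 7;
}

/* Rows actually produced when work is done in groups of 4 rows:
 * h + 7 rounded up to the next multiple of 4. */
static int demo_rows_padded_to_4(int h) {
  assert(h > 0);
  return ((h + 7 + 3) / 4) * 4;
}
```

For a block height that is a multiple of 4, the padded count comes out as h + 8, which is the one wasted row the commit message refers to.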
Gerda Zsejke More cd9d72c065 Add SVE2 implementation of vpx_highbd_convolve8
Add SVE2 implementation of vpx_highbd_convolve8 function. Add the
corresponding tests as well.

Change-Id: I783cc083f1bce5f13ce721bc191b34c48033f5ae
2024-03-26 19:00:26 +00:00
Jerome Jiang 219d7e6a0c Fix several clang-tidy complaints
Change-Id: I78721d6b7ed692ad9363b5cac4e3324a3136d5b6
(cherry picked from commit 4c2435c33e)
2024-03-26 17:42:20 +00:00
Wan-Teh Chang c9bd573531 Replace "cpi->common." with "pc->"
If a local variable "pc" is defined as &cpi->common, replace
"cpi->common." with "pc->".

Also replace a memcpy() call with a struct assignment.

Change-Id: I6f4f12e69d9989beaa6e04c83d93230e7d726278
2024-03-25 14:46:50 -07:00
Wan-Teh Chang 4f579df337 Declare VP9BitstreamWorkerData dest_size as size_t
Declare the dest_size member of the VP9BitstreamWorkerData struct as
size_t instead of int.

Fix the following MSVC warning:
vp9\encoder\vp9_bitstream.c(1031,37): warning C4267: '=':
conversion from 'size_t' to 'int', possible loss of data

Change-Id: Idab5ad5d4bf4d1e4754f011a3073c9a89da29f55
2024-03-23 11:20:22 -07:00
Wan-Teh Chang e387187438 Add the buffer_end field to the vpx_writer struct
The buffer_end field will allow bounds checking when vpx_writer writes
to the output buffer. This CL sets up the plumbing to pass the output
buffer size from vp9_pack_bitstream() to vpx_start_encode(), which
initializes the vpx_writer struct. vpx_writer doesn't use the output
buffer size in bounds checks yet, but the code in vp9_bitstream.c does.

Bug: webm:1844
Change-Id: I995e469ab453c02d740f54b46e0b08c7f2eb1a2e
2024-03-23 02:15:42 +00:00
James Zern 9137f7fa4b rtcd.pl: add empty specialize() check
This was added in libaom in:
5ddac0aac8 RTCD defs: Remove empty specialize statements once and for all.
https://aomedia-review.googlesource.com/c/aom/+/9062

Change-Id: I9c8fb0c8e4bd4dc9373d8533ab083dff816e7cbe
2024-03-22 09:45:42 -07:00
Wan-Teh Chang d48577579b Pass output buffer size to vp9_pack_bitstream()
Set up the plumbing to pass the size of the output buffer `dest` to
vp9_pack_bitstream(). The output buffer is the cx_data buffer in the
encoder_encode() function in vp9/vp9_cx_iface.c, and its size is
cx_data_sz.

In this CL vp9_pack_bitstream() ignores the `dest_size` parameter.

Bug: webm:1844
Change-Id: I53c80280143d409cf16f87c4d6deec3d9338aea3
2024-03-21 18:59:58 -07:00
James Zern cab4f31e1d encodeframe.c: remove some unused includes
clears some clang-tidy warnings

Change-Id: I82c1d212126b9c7b010b6bc8ac32d92453f6d376
2024-03-21 01:59:48 +00:00
James Zern 3c58cb1bc2 VP8_COMMON: remove unused cpu_caps member
This was set, but not read; rtcd covers this.

Change-Id: I1d8b8f8d8ed9e7bc56c3734cb96b79b937b5e20c
2024-03-20 13:14:35 -07:00
Wan-Teh Chang 6e879c6173 Save encode_tiles_buffer_alloc_size() result
Avoid calling encode_tiles_buffer_alloc_size() twice by saving its
return value in a local variable.

Change-Id: I3050f9cf7c3520f7edc80abf66620ba233fadad8
2024-03-20 20:03:57 +00:00
James Zern afc8b452b8 aarch64_cpudetect: add missing include
clears clang-tidy warning for HAS_*

Change-Id: I1b21326480b7c5c3be18c055f848071df0076915
2024-03-20 18:24:37 +00:00
James Zern 55d4b736b2 vpx_scaled_convolve8_neon: add missing include
clears clang-tidy warnings for types and constants in vpx_filter.h

Change-Id: I1f3f843b9ab6fd0ad038e33a048e8708cbd2a950
2024-03-20 18:24:37 +00:00
Jerome Jiang 6358ef6261 vp9 rc: Add ref frame list for each frame in GOP
Bug: b/329483680
Change-Id: I24573c25e70b41b7af243c473f28fa1b290cc373
2024-03-20 12:17:52 -04:00
Jerome Jiang 18059190d7 Remove vpx_rc_gop_info_t
Not being used anywhere any more. This was used to pass
GOP info to ML models.

Bug: b/617172914
Change-Id: Ibcfa00bc8215b73f43d9a11edbe00b6a2d7fb137
2024-03-20 16:17:23 +00:00
Jerome Jiang 6641e9e03c Add update type and ref update idx to gop decision
Bug: b/329483680
Change-Id: Ifc82ad79415400bbec4efe6ab9b78496d5f73ee7
2024-03-20 16:17:23 +00:00
Jerome Jiang 458e1c6875 Remove TPL IO functions
These are not used in libvpx.

Bug: webm:1837
Change-Id: Ic234b2dcd47d6614030b8a066c921e4285af5e99
2024-03-19 19:05:24 +00:00
Marco Paniconi 19832b1702 vp9: fix to integer overflow test
failure for the 16k test: issue introduced
in: c29e637283

Bug: b/329088759, b/329674887, b/329179808

Change-Id: I88e8a36b7f13223997c3006c84aec9cfa48c0bcf
2024-03-17 10:07:34 -07:00
Marco Paniconi c29e637283 Fix to buffer alloc for vp9_bitstream_worker_data
The code was using bitstream_worker_data when it had
not been allocated with a large enough size. This is because
the existing condition was to only re-alloc the
bitstream_worker_data when current dest_size was larger
than the current frame_size. But under resolution change
where frame_size is increased, beyond the current dest_size,
we need to allow re-alloc to the new size.

The existing condition to re-alloc when dest_size is
larger than frame_size (which is not required) is kept
for now.

Also increase the dest_size to account for image format.

Added tests, for both ROW_MT=0 and 1, that reproduce
the failures in the bugs below.

Note: this issue only affects the REALTIME encoding path.

Bug: b/329088759, b/329674887, b/329179808

Change-Id: Icd65dbc5317120304d803f648d4bd9405710db6f
2024-03-15 21:43:28 +00:00
Wan-Teh Chang 7fb8ceccf9 Restrict ranges of duration,deadline to UINT32_MAX
Bug: webm:1828
Change-Id: I3b1d208cf300d7c4c5584681183d45b1e97c7380
2024-03-15 01:11:50 +00:00
Wan-Teh Chang bc5a22eb60 Replace timestamp_ratio by oxcf->g_timebase_in_ts
Fix a TODO comment in encoder_init().

Change-Id: Id737142202c807a3f538fdf50612e77ca790990c
2024-03-14 14:32:44 -07:00
Wan-Teh Chang 6c0bf97a98 Detect integer overflows related to pts & duration
A port of the following two libaom CLs:
https://aomedia-review.googlesource.com/c/aom/+/187902
https://aomedia-review.googlesource.com/c/aom/+/188161

Bug: webm:1828
Change-Id: Id25039b000c3d04e7a4c8d71579a6932e9fd65ef
2024-03-14 11:14:25 -07:00
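The commit message for 6c0bf97a98 does not show how the overflows are detected. One common way to guard timestamp arithmetic is to test each step against the type's limit before performing it. The sketch below is an assumption-laden illustration: the function name, its parameters, and the `ticks_per_unit` scaling are hypothetical and are not taken from libvpx or libaom.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check, not libvpx's actual code: returns 1 and stores
 * (pts + duration) * ticks_per_unit in *end_ticks when the result fits in
 * int64_t, or returns 0 if any step would overflow. */
static int demo_end_time_in_ticks(int64_t pts, uint64_t duration,
                                  int64_t ticks_per_unit,
                                  int64_t *end_ticks) {
  int64_t end;
  assert(end_ticks != NULL);
  if (pts < 0 || ticks_per_unit <= 0) return 0;
  /* pts + duration must not exceed INT64_MAX. */
  if (duration > (uint64_t)(INT64_MAX - pts)) return 0;
  end = pts + (int64_t)duration;
  /* end * ticks_per_unit must not exceed INT64_MAX. */
  if (end > INT64_MAX / ticks_per_unit) return 0;
  *end_ticks = end * ticks_per_unit;
  return 1;
}
```

Checking before the operation avoids relying on signed overflow, which is undefined behavior in C.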
Wan-Teh Chang 8c36d36bcc Add high bit depths, 4:2:2, 4:4:4 to VP9Encoder
A port of the changes to vp9_encoder_fuzz_test.cc in
https://chromium-review.googlesource.com/c/chromium/src/+/5292940.

Change-Id: Ie143ffd9cffbd6a8639812c72e85c9a017aa554e
2024-03-13 13:45:41 -07:00
Jonathan Wright ad1d0ece31 Disable SVE2 if compiler doesn't support arm_neon_sve_bridge.h
SVE and SVE2 code paths in libvpx require intrinsics from
arm_neon_sve_bridge.h. SVE is disabled if the compiler does not
support this header. This patch conditionally disables SVE2 in the
same way.

Also gate the check for arm_neon_sve_bridge.h on whether SVE is
enabled in the first place. The check isn't necessary if the user has
explicitly disabled SVE. (Explicitly disabling SVE already disables
SVE2 since the former is a pre-requisite for the latter.)

Change-Id: Ibb21f09e8b2470d1ce5d98b71b101f5b7f7dbcdc
2024-03-13 18:38:10 +00:00
James Zern c1494fa57e neon: fix -Woverflow warnings
with signed char values, 128 -> -128

Change-Id: Iec2257729d7878459794d6a3d6bc3f745d39e97c
2024-03-13 18:36:14 +00:00
Wan-Teh Chang daa33cca37 Remove return statement after vpx_internal_error()
In encoder_encode(), remove the return statement after a
vpx_internal_error() call because setjmp() has been called at that
point.

Change-Id: Ib8ebbfbacb21097ce7f1b4e3bf53004bbe88a42b
2024-03-13 00:57:33 +00:00
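The reasoning in daa33cca37 relies on the error call not returning once setjmp() is armed. A minimal sketch of that pattern follows. All names here are hypothetical stand-ins, not libvpx's types or functions; the sketch only assumes that the error helper calls longjmp() when a jump target has been set.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

struct demo_error_info {
  int error_code;
  int setjmp_active; /* nonzero while jmp is a valid longjmp target */
  jmp_buf jmp;
};

/* Hypothetical stand-in for an internal error helper: records the code and,
 * if a jump target is armed, does not return to its caller. */
static void demo_internal_error(struct demo_error_info *info, int code) {
  info->error_code = code;
  if (info->setjmp_active) longjmp(info->jmp, code);
}

static int demo_encode(struct demo_error_info *info, int frame_size) {
  assert(info != NULL);
  info->error_code = 0;
  if (setjmp(info->jmp)) {
    /* Reached via longjmp(): reset the flag before returning. */
    info->setjmp_active = 0;
    return info->error_code;
  }
  info->setjmp_active = 1;
  if (frame_size <= 0) {
    demo_internal_error(info, 1);
    /* Not reached: a return statement here would be dead code. */
  }
  info->setjmp_active = 0; /* reset the flag on the normal path too */
  return 0;
}
```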
Wan-Teh Chang 0ba7b50338 Ignore the pts parameter when flushing the encoder
Change-Id: I8380a4a7ffcbf7a6f183d02d363473273b47f064
2024-03-12 11:02:30 -07:00
Wan-Teh Chang 7f6ba04e87 Move the local variable sd to the innermost scope
Change-Id: Ie6bfda6247ba408b9dbcf0b94fa95dbca0c57adb
2024-03-11 17:07:39 -07:00
James Zern cf1b7a65ff VP9RateControlRtcConfig: relocate some initializations
use default member initializers for all members for consistency.

Change-Id: I1956163c995d94aadbde38b4edaf21dc722e50c4
2024-03-11 18:50:00 +00:00
James Zern 0af7244971 ratectrl_rtc.h: remove use of vp9_zero()
This is an internal define that shouldn't be exposed in this header.

Change-Id: I43b793ab18c19ffab8bcc71fcd7097216989ca5a
2024-03-11 18:50:00 +00:00
James Zern ca7fd396e7 ratectrl_rtc.h: move some includes to .cc
This is the first step in removing the use of internal headers in a
public header.

Change-Id: Ia71b0b16a01037baa72942fc8ee7aeb4ffc04b86
2024-03-11 18:50:00 +00:00
James Zern 0ba67bb93d *ratectrl_rtc.h: remove unneeded 'public:'
in struct VP8RateControlRtcConfig and struct VP9RateControlRtcConfig;
structs default to public access.

Change-Id: Icdc5b44fb4c7297b0cb3c6cde8bec33ea5cee18c
2024-03-11 18:50:00 +00:00
James Zern cd88d25c53 vp8_ratectrl_rtc.cc: fix include order
vp8/vp8_ratectrl_rtc.h should come first as it's implemented in this
module. Split the rest of the groups on C/C++/vpx bounds.

Change-Id: If6bbbd8f3adf3766fa36fbc53ae06c9f6f76ebe9
2024-03-11 18:48:31 +00:00
Gerda Zsejke More 5391609fbe Add SVE2 implementation of vpx_highbd_convolve8_avg_vert
Add SVE2 implementation of vpx_highbd_convolve8_avg_vert function.
Add the corresponding tests as well.

Change-Id: I20ca19e09a1686bb00c0b51bf756ddab0adbc2c0
2024-03-11 18:43:35 +00:00
Gerda Zsejke More 45ea306dad Add SVE implementation of vpx_highbd_convolve8_avg_horiz
Add SVE implementation of vpx_highbd_convolve8_avg_horiz function.
Add the corresponding tests as well.

Change-Id: If13793fa653834dfdfeddfee60b80129eea85dd7
2024-03-11 18:43:35 +00:00
Gerda Zsejke More 2c3a9b69e7 Add SVE2 implementation of vpx_highbd_convolve8_vert
Add SVE2 implementation of vpx_highbd_convolve8_vert function. Add
the corresponding tests as well.

Change-Id: I289ac79d4493935217feaa4fd2fa0b8ef9a62972
2024-03-11 18:43:35 +00:00
Gerda Zsejke More 282e9aa0eb Add Arm SVE2 build flags and run-time CPU feature detection
Add 'sve2' arch options to the configure, build and unit test files -
adding appropriate conditional options where necessary. Arm SIMD
extensions are treated as supersets in libvpx, so disable SVE2 if
SVE is unavailable.

Change-Id: Icdec2aace357e36fba77c77cd8b70da1e5427fce
2024-03-11 18:43:35 +00:00
Wan-Teh Chang a87978a53d VP8: Always reset the setjmp flag before returning
Always reset the setjmp flag to 0 before returning from the function
where setjmp() was called.

Change-Id: I80bf39ef1769f656f53c6c6657c06e34489750f4
2024-03-11 16:45:59 +00:00
Wan-Teh Chang f51417671e Include system headers first
Change-Id: Ia096dacb3dd102829196e5ebd1bc148cf2ea2f93
2024-03-09 18:27:18 +00:00
James Zern 03c7f6a108 libs.doxy_template: remove DOT_TRANSPARENT
This was deprecated in 1.9.5 [1]. It is now enabled by default. For
earlier versions of doxygen this will set the value to false, but I
don't believe we were relying on this functionality.

[1]: https://www.doxygen.nl/manual/changelog.html#log_1_9_5

Change-Id: I75f576d35ca86636761cf70fda0dd0ad37f71d71
2024-03-09 02:33:16 +00:00
Wan-Teh Chang f46d99bcf7 Clear dangling ptr in vp8_remove_decoder_instances
Change-Id: I80c7d41c4675305efbbfbaddd45b42122979b318
2024-03-09 01:02:58 +00:00
Wan-Teh Chang ec06dcc314 Subtract pts_offset from pts after calling setjmp
This allows us to call vpx_internal_error() if the relative pts would be
negative.

Change-Id: I9ca314c4e32bb2c17bbe20ede6ea854bf9701ade
2024-03-08 14:07:38 -08:00
James Zern 99e887c09e vp8/encoder/encodeframe.c: sort includes
Change-Id: I30a8117754e8168a3f6fe37c4ea459475ad1b9aa
2024-03-07 19:57:44 -08:00
Wan-Teh Chang a6647c9cab Add vp8_ prefix to sem_* macro names
The sem_* macros do not behave exactly like the POSIX sem_* functions.
Add the vp8_ prefix to the sem_* macro names to make it clear that they
are not the POSIX sem_* functions. Another reason for adding the vp8_
prefix is that we need to wrap sem_wait() (to handle EINTR) on the Unix
platforms that have real sem_wait() function.

Handle EINTR in the Unix (non-Apple) definition of vp8_sem_wait().

Change-Id: I3df02a30f851d41691a55cf7a84aa2ff054bba9c
2024-03-08 01:46:59 +00:00
Jonathan Wright 6b6916be0a Refactor standard bitdepth Neon scaled convolve
Tidy up the standard bitdepth Armv8.0 Neon implementation of scaled
convolution.

Change-Id: I9e48e773b4a4b252b9254a22af23c8e834407b8a
2024-03-08 00:39:58 +00:00
Jonathan Wright 9b94b7bd01 Optimize Arm Neon implementation of transpose_u8_8x8()
Operate on 128-bit vectors to reduce the total number of instructions
by two.

Change-Id: I252e67831ccbb51adcfe5caaadb3205d3eb11b79
2024-03-08 00:39:58 +00:00
James Zern fa64af7bbf vp8/encoder/encodeframe.c: add missing include
Based on a clang-tidy warning:
  `no header providing "sem_wait" is directly included`
Though this may not clear it entirely, it's the closest that can be
done given the platform-dependent includes and implementation in
vp8/common/threading.h

Change-Id: I19984f820f3f380e58deef40563a2f0c66187748
2024-03-07 13:53:56 -08:00
James Zern 1f066bf77c build/make/Android.mk: update configure/build comments
set --target to the more modern aarch64-android-gcc and remove an
incorrect comment regarding realtime-only.

Change-Id: I5f6c9de9fcd96a60817e37fc6f6505725ddea6b9
2024-03-07 00:40:17 +00:00
George Steed b0e26cdcfd aarch64_cpudetect.c: Avoid unused variable warning
When dot-product and SVE support are disabled the hwcap variable is
currently unused. Fix this by wrapping it in an #ifdef matching the
conditions where it is needed.

Change-Id: I1c2e302d861c6c726b314e374f07d4fafe17ffc7
2024-03-06 18:52:38 +00:00
Jerome Jiang 148c7f65f0 IWYU: fix clang-tidy complaints
Include vp9_firstpass.h for KF_UPDATE

Change-Id: Ie1805a2201f3c42c7d3a0102e4eaa0378cca315e
2024-03-06 10:11:46 -05:00
Daniel Cheng b207d1c9bd Only #define __builtin_prefetch if it doesn't exist.
libvpx's check for conditionally defining __builtin_prefetch is broken,
since clang-cl defines __builtin_prefetch on Win ARM64: in addition, it
supports up to 3 arguments, with the latter 2 being optional. This
causes build breaks when paired with other libraries, like Abseil, which
do perform the conditional test correctly.

The real fix here is to define something like VPX_PREFETCH rather than
trying to #define an implementation-reserved name, which is undefined
behavior.

Bug: 328105513
Change-Id: Ibe14d9ce34306654bd20e560973f76c3b40036ee
2024-03-04 18:11:26 -08:00
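The alternative that b207d1c9bd suggests, a project-owned macro instead of redefining `__builtin_prefetch`, might look like the sketch below. `DEMO_PREFETCH`, `DEMO_HAVE_BUILTIN_PREFETCH`, and `demo_sum()` are hypothetical names chosen for illustration; libvpx does not necessarily define anything like them.

```c
#include <assert.h>
#include <stddef.h>

/* Detect the builtin without defining any implementation-reserved name. */
#if defined(__has_builtin)
#if __has_builtin(__builtin_prefetch)
#define DEMO_HAVE_BUILTIN_PREFETCH 1
#endif
#endif
/* Older GCC versions lack __has_builtin but still provide the builtin. */
#if !defined(DEMO_HAVE_BUILTIN_PREFETCH) && defined(__GNUC__)
#define DEMO_HAVE_BUILTIN_PREFETCH 1
#endif

#if defined(DEMO_HAVE_BUILTIN_PREFETCH)
#define DEMO_PREFETCH(addr) __builtin_prefetch(addr)
#else
#define DEMO_PREFETCH(addr) ((void)(addr)) /* no-op fallback */
#endif

/* Example use: hint the cache a few elements ahead while summing. */
static long demo_sum(const int *v, size_t n) {
  long total = 0;
  size_t i;
  assert(v != NULL || n == 0);
  for (i = 0; i < n; ++i) {
    if (i + 16 < n) DEMO_PREFETCH(&v[i + 16]);
    total += v[i];
  }
  return total;
}
```

Because the macro lives in the project's own namespace, it cannot collide with a compiler that already provides the builtin or with other libraries that test for it.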
Jerome Jiang a571299b07 vp9 ext rc: Do motion search on key frame in TPL
Bug: b/327254742
Change-Id: I7448c09994441c89c36420e780cd2641c6f1aa5a
2024-03-04 21:54:11 +00:00
Jonathan Wright 9d8d71b41b Refactor Arm Neon transpose_concat_*() to not need lookup table
Refactor the transpose_concat_*() helper function used in the Arm Neon
DotProd and I8MM vertical convolution implementations to not use TBL
instructions. Using vzip* to achieve the same outcome (with the same
number of instructions) avoids needing/loading the lookup indices and
also increases performance on little (in-order) Arm Cortex cores.

Change-Id: Iff62a44f8a9bf0ee239d5bb36be8424cab0dbca5
2024-03-04 20:24:03 +00:00
Jonathan Wright 5a8e2f705e Cosmetic: Remove 'vpx_' prefix from static Neon functions
Tidy up some of the naming in Arm Neon convolution functions.

Change-Id: I9cfd925dbcb754bdf9fe0860a46a1c9dca2c7f9a
2024-03-04 20:24:03 +00:00
Cheng Chen f394f2be74 Delete "public" from struct definitions
Struct members are public by default.

Change-Id: I87dc164d6a63fcc950c6e513901fc2826e53a8ae
2024-02-29 14:16:55 -08:00
Wan-Teh Chang d4959f9825 Handle EINTR from sem_wait()
sem_wait() may be interrupted by a signal and fail with EINTR:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/sem_wait.html

Retry the sem_wait() call if it fails with EINTR.

This finishes the fix started in
https://chromium-review.googlesource.com/c/webm/libvpx/+/5299569. As a
speculative fix, that CL fixed only the sem_wait(&cpi->h_event_end_lpf)
calls responsible for bug chromium:324459561. ClusterFuzz verified the
fix, so this CL extends it to the other sem_wait() calls.

Note that sem_wait() calls like the following do not need this fix,
because the while (1) loop retries the sem_wait() call if it fails:

  while (1) {
    if (vpx_atomic_load_acquire(&cpi->b_multi_threaded) == 0) break;

    if (sem_wait(&cpi->h_event_start_lpf) == 0) {
      ...
    }
  }

Bug: chromium:324459561
Change-Id: I0f0612616eee37fb3da68049e49b3e86927b5e24
2024-02-28 21:06:36 +00:00
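The retry described in d4959f9825 can be sketched as a small wrapper around the POSIX call. The helper name `demo_sem_wait_retry()` is hypothetical; this is an illustration of the technique, not the code that landed in libvpx.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical helper: wait on a POSIX semaphore, retrying whenever the
 * call is interrupted by a signal. Returns 0 on success, or -1 with errno
 * set to something other than EINTR on failure. */
static int demo_sem_wait_retry(sem_t *sem) {
  int ret;
  assert(sem != NULL);
  do {
    ret = sem_wait(sem);
  } while (ret != 0 && errno == EINTR);
  return ret;
}
```

Callers that already sit inside a loop which re-issues sem_wait() on any failure, like the `while (1)` loop quoted in the commit message, get the same effect without the wrapper.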
George Steed 793c0b9196 Only enable AArch64 extensions if the compiler supports them
We already have some logic in the configure.sh file to selectively
disable code dependent on particular architecture extensions, however we
do not yet have anything to check that the compiler being supplied
recognises and can compile code using these extensions.

This commit adds compiler "-march=..." flag tests to the existing
extension-disable loop so that we now correctly disable extensions that
are not supported by the compiler. For AArch64 this loop also needs to
move below the existing compiler/OS handling to ensure that prefixes
like $CROSS are handled correctly before running compiler tests.

Bug: webm:1841
Change-Id: I936b911c4b0ebf03abc34b7532b2bb4568129f57
(cherry picked from commit fa50b26848)
2024-02-28 02:46:29 +00:00
Gerda Zsejke More 3ac1316c46 Require Arm Neon-SVE bridge header for enabling SVE
Disable SVE feature if arm_neon_sve_bridge header is not supported
by the compiler.

Change-Id: I3f78be2dd95b37b8d51b9f1fceca1f9701535eca
(cherry picked from commit 6ea3b51ec2)
2024-02-28 02:45:59 +00:00
Wan-Teh Chang b7b5d0a568 Use the value param in Win32 version of sem_init
Name the three parameters of sem_init() as sem, pshared, value. See
https://pubs.opengroup.org/onlinepubs/9699919799/functions/sem_init.html.

Pass the `value` parameter to CreateSemaphore() as the second
(lInitialCount) parameter:
https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-createsemaphorea

Remove unneeded parentheses around semaphore_wait(*sem).

Change-Id: I1735c94adb511ca539159dfea19421595ec15d24
2024-02-27 13:44:23 -08:00
Wan-Teh Chang 7b9843099c Handle EINTR from sem_wait(&cpi->h_event_end_lpf)
sem_wait() may be interrupted by a signal and fail with EINTR:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/sem_wait.html

Retry the sem_wait(&cpi->h_event_end_lpf) call if it fails with EINTR.

Bug: chromium:324459561
Change-Id: Icc957e8b9f21f25ec3c95e22cab502af417443f2
(cherry picked from commit d63efe0679)
2024-02-27 12:11:58 -08:00
Marco Paniconi 4c80888a71 vp8: Fix to race issue for multi-thread with psnr_calc
Added unit test which triggers the data race in the
bug below, when only C code is forced.

The data race is between the loopfilter and variance
computation from generate_psnr_packet calculation.
Proposed fix is to move the wait for loopfilter thread to
finish up before entering generate_psnr_packet().

Bug: b/266833179.

Change-Id: Id2871c53274be0f404e65601c9a5c98aaead0c72
(cherry picked from commit 756b29a776)
2024-02-27 19:57:23 +00:00
George Steed fa50b26848 Only enable AArch64 extensions if the compiler supports them
We already have some logic in the configure.sh file to selectively
disable code dependent on particular architecture extensions, however we
do not yet have anything to check that the compiler being supplied
recognises and can compile code using these extensions.

This commit adds compiler "-march=..." flag tests to the existing
extension-disable loop so that we now correctly disable extensions that
are not supported by the compiler. For AArch64 this loop also needs to
move below the existing compiler/OS handling to ensure that prefixes
like $CROSS are handled correctly before running compiler tests.

Bug: webm:1841
Change-Id: I936b911c4b0ebf03abc34b7532b2bb4568129f57
2024-02-27 19:53:03 +00:00
Gerda Zsejke More 3646b12927 Specialise highbd_convolve8_horiz_sve for 4-tap filter
Add SVE implementation for vpx_highbd_convolve8_horiz that specialises
for 4-tap filters. This way we avoid a lot of redundant work to
multiply and add zero, given that some of the 8-tap filters are
zero-padded, so they are effectively 4-tap filters.

Change-Id: Ib5e0377f924df1d893e9436f443fcbe7d196ea27
2024-02-27 19:38:09 +00:00
Gerda Zsejke More c78f1ef4a0 Rename dot_neon_sve_bridge header file
Rename dot_neon_sve_bridge.h to vpx_neon_sve_bridge.h in order to
reflect that other instructions can be implemented in the header
file. In a subsequent patch, the usage of vtbl with Neon-SVE bridge
intrinsics will be added.

Change-Id: I8f71aad2b7fb4932c9554badf041a80aca58c7cf
2024-02-27 19:38:09 +00:00
Jonathan Wright 2bc1012c53 Remove redundant code for neon_dotprod 2D convolution
Remove the 4-tap Neon DotProd path for the horizontal pass of 2D
convolution since it has been made redundant by the horizontal-
vertical merged implementation. Also move the 8-tap path closer to
where it is used and call it explicitly rather than the filter-
agnostic wrapper.

Change-Id: I1861dc88a67a759c3e8deb0b471ec447a62063f2
2024-02-26 22:28:05 +00:00
Jonathan Wright baece7460d Merge h. and v. passes in 4-tap SBD Neon DotProd 2D convolution
The current SBD Neon DotProd approach to 2D convolution is:
1) Filter horizontally, storing to an intermediate buffer.
2) Filter vertically and store the final output.

This patch merges the two phases for 4-tap standard bitdepth 2D
convolution to avoid storing to and re-loading from the intermediate
buffer - giving a 10-25% speedup depending on block size. Merging the
passes for 8-tap filters does not have the same benefit, so keep the
existing implementation.

Change-Id: Ic6008836d1a499ee2cd957b9db194fca5671ccb4
2024-02-26 22:28:05 +00:00
Jonathan Wright d191c5f984 Remove redundant code for neon_i8mm 2D convolution
Remove the 4-tap Neon i8mm path for the horizontal pass of 2D
convolution since it has been made redundant by the horizontal-
vertical merged implementation. Also move the 8-tap path closer to
where it is used and call it explicitly rather than the filter-
agnostic wrapper.

Change-Id: Icddecb7e133656c54aa5e79536b49759715b6fcb
2024-02-26 20:59:41 +00:00
Jonathan Wright cef5b0da97 Merge h. and v. passes in 4-tap SBD Neon i8mm 2D convolution
The current SBD Neon i8mm approach to 2D convolution is:
1) Filter horizontally, storing to an intermediate buffer.
2) Filter vertically and store the final output.

This patch merges the two phases for 4-tap standard bitdepth 2D
convolution to avoid storing to and re-loading from the intermediate
buffer - giving a 5-40% speedup depending on block size. Merging the
passes for 8-tap filters does not have the same benefit, so keep the
existing implementation.

Change-Id: Ic8ec2822681176ef879dcaf8424d8d91c5e8d2df
2024-02-26 20:59:41 +00:00
James Zern a3209600f2 codec_factory.h: fix -Wpedantic warnings
With either CONFIG_VP8=0 or CONFIG_VP9=0. Fixes a warning about an extra
';' outside of a function due to VP[89]_INSTANTIATE_TEST_SUITE() being
defined to nothing.

Change-Id: I1878d7596e39c5166efbe96450a733efc08665ea
2024-02-26 20:52:35 +00:00
Jerome Jiang b5578f1283 Add inter/intra_pred_err to VpxTplBlockStats
inter/intra_cost in VP9 TPL is calculated with SATD
which should be close enough to be used as inter/intra_pred_err

Bug: b/326262148
Change-Id: Ic0fd08708fcf3640398fc22a1a6bb6f449b2a9b8
2024-02-26 12:27:38 -05:00
Jerome Jiang ff9591f8df vp9 ext rc: assign srcrf_dist/rate instead
Bug: b/326262148
Change-Id: I3af0b5d28c58447862eb11d5b10afa8a32d82ada
2024-02-26 17:27:11 +00:00
James Zern 5433b943a4 resize_test.cc: fix warning w/CONFIG_VP9=0
fixes -Wunused-but-set-variable

Change-Id: Id9431342745baa1492f5da0e32d09372e10fdcd2
2024-02-24 00:53:28 +00:00
James Zern fca3d1755f fix void param declarations
These should be funcname(void) rather than funcname(). Quiets some
-Wstrict-prototypes warnings.

Change-Id: I68705fe53f4438c9584e7040c39cecec859af27c
2024-02-22 15:13:03 -08:00
James Zern 5e90a97fa2 tokenize.h: remove undefined vp8_tokenize_initialize()
It was removed in:
f039a85fd Make global data const

Change-Id: Ib5aa35500c3ee7caf1ec216e0351c32ef373f5f2
2024-02-22 13:33:26 -08:00
Marco Paniconi 79284f4c84 vp8: add uv_delta_q support to external RC
Bug: b/321137490

Change-Id: Id9ccf8e80d693b296b224846094fc7c0f71c5d0a
2024-02-22 17:31:40 +00:00
James Zern 1659e73b0b vp9_context_tree.h: add name to union
Anonymous unions are not supported in C99, they were added in C11:
https://en.cppreference.com/w/c/language/union

Fixes -Wpedantic warning:
vp9/encoder/vp9_context_tree.h:93:4: warning: ISO C99 doesn’t support
  unnamed structs/unions [-Wpedantic]

Change-Id: Ibd29d6deca35d81ea886e80e9f44575c73ecd96d
2024-02-21 23:15:19 +00:00
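The change in 1659e73b0b amounts to giving the union a member name so the struct is valid C99. The types below are hypothetical and only illustrate the pattern; they are not the declarations from vp9_context_tree.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tree node. An unnamed union member here would require C11;
 * naming it `u` keeps the declaration valid in C99. */
struct demo_node {
  int is_leaf;
  union {
    int leaf_value;
    struct demo_node *children[4];
  } u; /* accessed as node->u.leaf_value or node->u.children[i] */
};

static int demo_leaf_value(const struct demo_node *node) {
  assert(node != NULL && node->is_leaf);
  return node->u.leaf_value;
}
```

The cost is one extra `.u` at every access site, which is why such a change touches the users of the struct as well as its definition.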
James Zern 5d022e45ec vp9_rdopt,skip_iters: normalize use of const
Fixes a -Wpedantic warning:
vp9/encoder/vp9_rdopt.c:1988:20: warning: invalid use of pointers to
  arrays with different qualifiers in ISO C before C2X [-Wpedantic]

Change-Id: I581e21d7e59c0bae0e44056a3b3f049c5a4e7cf2
2024-02-20 13:56:59 -08:00
Gerda Zsejke More 9c0c5144e7 Add SVE implementation of vpx_highbd_convolve8_horiz
Add SVE implementation of vpx_highbd_convolve8_horiz function. Add
the corresponding tests as well.

Change-Id: I0b2815831daf203e167ea5289307087ce53ff9da
2024-02-20 19:14:56 +00:00
Jonathan Wright 7e9da9702c Use Armv8.0 Neon 4-tap vertical convolution for all arch levels
The new Armv8.0 Neon implementation of 4-tap vertical convolution is
faster than Armv8.4 DotProd and Armv8.6 I8MM implementations. This
patch removes the DotProd and I8MM implementations in favour of using
the Armv8.0 version everywhere.

Change-Id: I126470fd4862d8bb116153e90bb2e4f2f2dba1e4
2024-02-20 15:22:35 +00:00
Jonathan Wright 9f7a70bdf2 Further accelerate Armv8.0 Neon 4-tap convolution
Refactor Armv8.0 Neon 4-tap convolution functions to operate on 8-bit
types directly, rather than first widening to 16-bit.

2-tap (bilinear) filter values are always positive, but 4-tap filter
values are negative on the outer edges (taps 0 and 3), with taps 1
and 2 having much greater positive values to compensate. To use
instructions that operate on 8-bit types we also need the types to be
unsigned. In the convolution kernel, subtracting the products of taps
0 and 3 from the products of taps 1 and 2 always works since 2-tap
filters are 0-padded.

Co-authored-by: Hari Limaye <hari.limaye@arm.com>

Change-Id: I87b32e2ef8cbd21eebb8cd2642e8826b704905b1
2024-02-20 15:22:22 +00:00
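The unsigned-arithmetic trick described in 9f7a70bdf2 can be shown in scalar form. This sketch assumes a hypothetical 4-tap filter {-8, 72, 72, -8} that sums to 128 and a 7-bit rounding shift; the tap values and function name are illustrative and are not taken from libvpx's filter tables or Neon code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Magnitudes of the hypothetical taps {-8, 72, 72, -8}. Storing absolute
 * values lets every multiply operate on unsigned 8-bit inputs. */
static const uint8_t demo_abs_taps[4] = { 8, 72, 72, 8 };

/* Filter one output sample from four neighbouring 8-bit samples. */
static uint8_t demo_convolve4(const uint8_t *s) {
  uint16_t pos, neg;
  int diff;
  assert(s != NULL);
  /* Inner taps are positive, outer taps are negative, so accumulate the
   * two groups separately and subtract once at the end. */
  pos = (uint16_t)(s[1] * demo_abs_taps[1] + s[2] * demo_abs_taps[2]);
  neg = (uint16_t)(s[0] * demo_abs_taps[0] + s[3] * demo_abs_taps[3]);
  diff = (int)pos - (int)neg + 64; /* add the rounding offset for >> 7 */
  if (diff < 0) return 0;          /* clamp low before shifting */
  diff >>= 7;
  return (uint8_t)(diff > 255 ? 255 : diff); /* clamp high */
}
```

Both partial sums fit in 16 bits for these taps (at most 255 * 144 and 255 * 16), which is what makes the narrow unsigned multiplies safe in this example.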
Wan-Teh Chang 4340382bb0 Move THREADFN macro definitions to vpx_pthread.h
The THREADFN and THREAD_EXIT_SUCCESS macros are used to define the
thread start routines passed to our implementation of pthread_create(),
so it makes sense to define these macros in vpx_util/vpx_pthread.h. This
also allows the VP8 and VP9 code to share the macro definitions.

Replace the THREAD_FUNCTION macro by THREADFN. They have the same
definition.

Change-Id: I79a7476e43652667af6a8da7ad7ce346b1b6b024
2024-02-16 09:46:39 -08:00
Wan-Teh Chang 4e384da53d Delete a duplicate definition of thread_sleep()
There are two identical definitions of thread_sleep() for Win32. Delete
the first one.

Change-Id: I617e180e3459a24fbafec5b060bbdcd4fcee8128
2024-02-15 20:31:25 -08:00
Wan-Teh Chang 3316c11240 Delete unused macro definitions
Change-Id: Ic12d5b9ec9b18743e8ef67d132ed3bbbc90c7fa6
2024-02-15 17:04:32 -08:00
Wan-Teh Chang d63efe0679 Handle EINTR from sem_wait(&cpi->h_event_end_lpf)
sem_wait() may be interrupted by a signal and fail with EINTR:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/sem_wait.html

Retry the sem_wait(&cpi->h_event_end_lpf) call if it fails with EINTR.

Bug: chromium:324459561
Change-Id: Icc957e8b9f21f25ec3c95e22cab502af417443f2
2024-02-16 00:14:40 +00:00
Jerome Jiang e1da3834ba Add base qp to ext rc config
Change-Id: I0eb7e0dbe3d1784c4408fdddf763d2b64c90fbb5
2024-02-15 13:08:46 -05:00
Peter Kasting e92dd05124 Add VPX_WORKER_STATUS_ to values of global-scope status enum.
This helps prevent name clashes if code e.g. #includes headers from both
libvpx and libaom.

Bug: none
Change-Id: Ifc9e7ac4862dc04a399e7777d2636e1453627970
2024-02-14 20:37:40 -08:00
Peter Kasting 4c0cf7458c Split pthread wrapper to vpx_pthread.h.
Also does a bit of cleanup to the THREAD macros as suggested in review.

Bug: none
Change-Id: I1fbfacf99b2439ac1147e346e53d72d7ee39c298
2024-02-14 18:54:43 +00:00
Jerome Jiang b01d61c9af Remove unused signals for get_encodeframe_decision
Bug: b/323234722
Change-Id: Iab5c27b232552f924b05fdd7fa1cd6792e04faed
2024-02-14 10:46:18 -05:00
Jerome Jiang 591c787436 vp9 ext rc: Remove initializer for gop_decision
Change-Id: Ie4ebcc39ab8c34631395ce81e2916c766c3a7f13
2024-02-13 23:21:18 +00:00
Jonathan Wright 455cb26998 Optimize Arm Neon USDOT narrowing sequences in convolve kernels
Currently we use two rounds of complex right-shift operations to
narrow and pack results from the dot-product convolution kernels.
This patch refactors these sequences to use one "simple" right-shift
and one complex right-shift - reducing the latency by 4 cycles on
modern out-of-order Arm CPUs.

Change-Id: I3fd38560bb14d85826e417f40d35f11165ab80da
2024-02-13 19:58:25 +00:00
Jonathan Wright 939bcd4026 Optimize Arm Neon SDOT narrowing sequences in convolve kernels
Currently we use two rounds of complex right-shift operations to
narrow and pack results from the dot-product convolution kernels.
This patch refactors these sequences to use one "simple" right-shift
and one complex right-shift - reducing the latency by 4 cycles on
modern out-of-order Arm CPUs.

Change-Id: I908147ed65a87157009363782399ff398406cdf9
2024-02-13 19:58:25 +00:00
Jerome Jiang a64bf87fb9 Fix gop decision and gop index in TPL pass
- Initialize gop_decision
- Initialize GF group for a new one
- GF group index for key frame special treatment is not needed any more
  when key frame is decided by the RC

Bug: b/323050877
Change-Id: Iaf36ea4f671b833f3ba4c524b9799a3093412dfa
2024-02-13 18:01:20 +00:00
Peter Kasting 8cf26c1284 Backport thread-related changes from libaom.
This ports changes that touched aom_thread.[c,h] from the time after
libaom copied libvpx' sources, where they hadn't already been present.
The goal here is to unify the two repos' thread implementations in hopes
of ultimately sharing one.

The list of commits is approximately as follows; however, I made a few
other changes as necessary where noted.
https://aomedia-review.googlesource.com/c/aom/+/64044
  Edited other hook func return values similarly.
https://aomedia-review.googlesource.com/c/aom/+/71321
https://aomedia-review.googlesource.com/c/aom/+/71327
https://aomedia-review.googlesource.com/c/aom/+/71436
https://aomedia-review.googlesource.com/c/aom/+/72481
  Also removed conflicting MAX_NUM_THREADS definition of 80. I think
  this was incorrect as the relevant array was indexed by variables that
  were in turn controlled by the global config values that were clamped
  to <=64.
https://aomedia-review.googlesource.com/c/aom/+/102621
  Also removed a pre-XP handling block in
  vp8/common/generic/systemdependent.c.
https://aomedia-review.googlesource.com/c/aom/+/102601
  MAX_DECODE_THREADS was the only relevant piece.
https://aomedia-review.googlesource.com/c/aom/+/102741
https://aomedia-review.googlesource.com/c/aom/+/109025
https://aomedia-review.googlesource.com/c/aom/+/160961
https://aomedia-review.googlesource.com/c/aom/+/169684
  Also removed OS/2 support from elsewhere.
https://aomedia-review.googlesource.com/c/aom/+/169823
https://aomedia-review.googlesource.com/c/aom/+/170022
https://aomedia-review.googlesource.com/c/aom/+/169685
https://aomedia-review.googlesource.com/c/aom/+/173761
https://aomedia-review.googlesource.com/c/aom/+/174842

Bug: none
Change-Id: I91462873a57e9efa120288d1bd8af3a6c09d423d
2024-02-12 18:45:00 -08:00
Jonathan Wright 491c16a9f3 Merge horiz. and vert. passes in HBD Neon 2D avg convolution
The current Neon approach to 2D convolution is:
1) Filter horizontally, storing to an intermediate buffer.
2) Filter vertically, average with the dst block and store the final
   output.

This patch merges the two phases for high bitdepth 2D convolution to
avoid the storing and re-loading from the intermediate buffer. This
provides a small gain (<5%) for large block sizes but the benefit
increases for small block sizes - as the proportion of compute to
memory access decreases. These effects are amplified further when
considering little (in-order) core performance.

Change-Id: I84f1cafcfbbfa48b2cfe4b20881da9c4bc3b56ac
2024-02-12 13:22:19 +00:00
Jonathan Wright 364326c37f Merge horiz. and vert. passes in HBD Neon 2D convolution
The current Neon approach to 2D convolution is:
1) Filter horizontally, storing to an intermediate buffer.
2) Filter vertically and store the final output.

This patch merges the two phases for high bitdepth 2D convolution to
avoid the storing and re-loading from the intermediate buffer. This
provides a small gain (<5%) for large block sizes but the benefit
increases for small block sizes - as the proportion of compute to
memory access decreases. These effects are amplified further when
considering little (in-order) core performance.

Change-Id: I8ec13fb9edd642fdb927bf5394a3c2a349d22a29
2024-02-12 12:58:45 +00:00
Jonathan Wright 58731e2b7a Specialise highbd Neon 2D horiz convolution for 4-tap filters
Add a highbd Neon implementation of the horizontal portion of 2D
convolution specialised for executing with 4-tap filters. This new
path is also used when executing with bilinear (2-tap) filters.

Change-Id: I513e35c4f8857bc89e0def5e9402bc31ddd46440
2024-02-09 17:05:47 +00:00
Jonathan Wright 3127962e71 Specialise highbd Neon vert convolution for 4-tap filters
Add a highbd Neon implementation of vertical convolution specialised
for executing with 4-tap filters. This new path is also used when
executing with bilinear (2-tap) filters.

Change-Id: I30469c7b8e6ccff31d96588a3e4c21b401f1ed09
2024-02-09 15:37:51 +00:00
Jonathan Wright 70b14bf4dc Specialise highbd Neon horiz convolution for 4-tap filters
Add a highbd Neon implementation of horizontal convolution specialised
for executing with 4-tap filters. This new path is also used when
executing with bilinear (2-tap) filters.

Change-Id: Icabeea295af3e0bbeda755168996668cb960b0de
2024-02-09 15:37:51 +00:00
Jonathan Wright b1c9bbeaae Remove unneeded assert in vpx_filter.h
Filter tap reporting was made more granular recently[1] to enable Arm
Neon optimizations that specialise convolution implementations
according to the filter size. This patch removes an assert that
should have been removed during that change - it no longer serves any
purpose to assert that the filter being used is a no-op filter.

This change is a pre-requisite for some highbd Neon convolution
changes that specialise implementations according to filter size.
(Without this change a convolve-copy test would fail should we
interrogate the size of the filter.)

[1] https://chromium-review.googlesource.com/c/webm/libvpx/+/5063929

Change-Id: I2a71680d27134535e6c0663b1668ba1b150b1a6f
2024-02-09 15:29:18 +00:00
Jonathan Wright 00135942da Add 2D-specific highbd Neon horizontal convolution function
2D 8-tap convolution filtering is performed in two passes -
horizontal and vertical. The horizontal pass must produce enough
input data for the subsequent vertical pass - 3 rows above and 4 rows
below, in addition to the actual block height.

At present, all highbd Neon horizontal convolution algorithms process
4 rows at a time, but this means we end up doing at least 1 row too
much work in the 2D first pass case where we need h + 7, not h + 8
rows of output.

This patch adds an additional Neon path that processes h + 7 rows of
data exactly, saving the work of the unnecessary extra row.

Change-Id: Id6658b4e9e774effc760ff131e188b6907a57676
2024-02-09 10:38:12 +00:00
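The row counts in the commit above can be checked with a small sketch. The function and constant names are hypothetical; the only facts taken from the commit message are the 3 rows above, the 4 rows below, and the 4-rows-at-a-time processing.

```c
#include <assert.h>

/* An 8-tap vertical filter reads 3 rows above and 4 rows below each
 * output row, so the first pass must cover the block plus 7 extra rows. */
enum { kRowsAbove = 3, kRowsBelow = 4, kRowsPerIteration = 4 };

/* Rows the horizontal first pass must produce for a block of height h. */
int intermediate_rows_needed(int h) {
  assert(h > 0);
  return h + kRowsAbove + kRowsBelow; /* h + 7 */
}

/* Rows actually produced when the first pass can only work in groups of
 * 4 rows: the needed count is rounded up to a multiple of 4. */
int intermediate_rows_in_groups_of_4(int h) {
  const int needed = intermediate_rows_needed(h);
  return (needed + kRowsPerIteration - 1) / kRowsPerIteration *
         kRowsPerIteration;
}
```

For block heights that are multiples of 4, the rounded count is h + 8, one row more than the h + 7 that the vertical pass consumes.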
Jonathan Wright 08eb51bc1a Call scalar impl. immediately from HBD Neon 2D convolution
Call scalar C implementation of 2D convolution immediately if scaling
is required - instead of entering the Neon functions for the
horizontal and vertical passes and then falling back to the scalar
implementation. This has the benefit of being able to allocate a
smaller intermediate buffer.

Change-Id: Icacdd5f3a1401395951b613da1cd6932955bd0f8
2024-02-09 10:37:19 +00:00
Jonathan Wright ef3fd00c27 Refactor Neon highbd 2D convolution definitions and merge files
There's no reason for these files to be separate, and merging them
will make life easier in subsequent commits adding a horizontal pass
specialised for the first pass of 2D.

Also perform some refactoring for 2D convolution definitions:
- Add a comment deriving the intermediate buffer height.
- Align the intermediate buffers to 32 bytes.

Change-Id: Ib92524396e6f9c58295339de54d08d894ace3bd1
2024-02-08 17:49:45 +00:00
Jonathan Wright 72cc21e3a2 Refactor SBD Armv8.0 Neon horizontal convolve8 paths
Mostly a cosmetic change:
1) Remove forward declarations.
2) Remove excessive prefetches.

Change-Id: I88d42d8f9ee828c6c4095ffaec8e0333d776a4a1
2024-02-08 09:44:33 +00:00
Jonathan Wright 81ce6067cc Refactor SBD Armv8.0 Neon vertical convolve8 paths
Mostly a cosmetic change:
1) Remove forward declarations.
2) Remove excessive prefetches - some of which were wrong, prefetching
   data that had just been loaded.

Change-Id: I17d8accc2abf3a9b2050603f859fce588a1f7178
2024-02-08 09:44:19 +00:00
James Zern e32f9d4139 configure: remove profile from CONFIG_LIST
CONFIG_PROFILE is unused currently. The option can still be selected
because it is in the CMDLINE_SELECT list and interpreted by configure
directly.

Bug: webm:1835
Change-Id: Id9667289113335a10018803f578b255967bd60b1
2024-02-08 02:49:44 +00:00
James Zern 8408251f47 README,cosmetics: break some long lines / fix whitespace
+ normalize configure commands

Change-Id: Id21ebca85e7e8e4df128e986d4f4ec33c7f1483f
2024-02-07 19:24:45 +00:00
Jonathan Wright 96b64eaac5 Refactor Neon highbd_convolve8 kernels
Move narrowing shift and max value clipping into the 4-pixel-output
kernel. As well as cleaning up the code quite a bit, this also
improves performance by 5-10% as it eliminates the implied top /
bottom register shuffling of the previous approach.

Also clean up the formatting and magic numbers in the 8-pixel-output
kernel.

Change-Id: I77a5e9e317ef4097f187330d4b32973022ba573f
2024-02-07 00:35:02 +00:00
Jonathan Wright 18bc7ffe59 Optimize vpx_highbd_convolve8_horiz_avg_neon
Avoid transposes before and after convolution kernels by using extra
loads.

Change-Id: Iddee4752d7f9ed644502176ed863742fa77fe5a6
2024-02-07 00:34:19 +00:00
Jonathan Wright 04c8813a2c Optimize vpx_highbd_convolve8_horiz_neon
Avoid transposes before and after convolution kernels by using extra
loads.

Change-Id: I20c622fcb208e83534d604af50a58ba5ac472264
2024-02-07 00:33:55 +00:00
Wan-Teh Chang a0f3eb8ce4 Delete a useless clamp(q, x, y) call
In https://chromium-review.googlesource.com/c/webm/libvpx/+/71356, the
statement
  clamp(q, active_best_quality, active_worst_quality);
was added to rc_pick_q_and_bounds_two_pass() (recently renamed
vp9_rc_pick_q_and_bounds_two_pass()).

The result of the clamp() call is not used, and clamp() has no side
effects, so the call does nothing.

Fix Coverity CID 1577645 Useless call:
  side_effect_free: Calling
  clamp(q, active_best_quality, active_worst_quality) is only useful for
  its return value, which is ignored.

Change-Id: I014c3e4caf2bc999fe480000acc4e49e7ad15aaf
2024-02-06 23:31:27 +00:00
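A short sketch of why Coverity flags such a call. The names below (clamp_int and the two pick_q_* functions) are hypothetical and only mirror the shape of the situation; the commit itself simply deleted the call.

```c
#include <assert.h>

/* A pure clamp helper: it returns the clamped value and modifies nothing. */
int clamp_int(int value, int low, int high) {
  return value < low ? low : (value > high ? high : value);
}

/* Hypothetical caller: the return value is discarded, so q is unchanged
 * and the call could be deleted without changing behaviour. */
int pick_q_ignoring_result(int q, int best, int worst) {
  clamp_int(q, best, worst);
  return q;
}

/* Hypothetical caller that keeps the clamped value, for contrast. */
int pick_q_using_result(int q, int best, int worst) {
  assert(best <= worst);
  q = clamp_int(q, best, worst);
  return q;
}
```

Because the first caller behaves the same with or without the call, removing the statement is a safe cleanup rather than a behaviour change.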
Jerome Jiang f5e1a0ab7e Include headers to fix clang-tidy complaints
Change-Id: I7fd2a10b4775e7e7fca49339832c257d84d99e33
2024-02-06 22:26:21 +00:00
Jonathan Wright 6c00356485 Refactor vpx_highbd_convolve8_avg_vert_neon
Various bits of tidying up to make the code more compact:
- Use appropriate load/store helper functions from mem_neon.h.
- Remove variable forward declarations.
- Use != 0 instead of > 0 in loop termination tests.
- Remove excessive prefetches.

Change-Id: I114cf4d2a34f02acc130558d125d2c191c6c5992
2024-02-06 21:24:27 +00:00
Jonathan Wright 01edfb3df4 Refactor vpx_highbd_convolve8_vert_neon
Various bits of tidying up to make the code more compact:
- Use/create appropriate mem_neon.h load/store helper functions.
- Remove variable forward declarations.
- Use != 0 instead of > 0 in loop termination tests.
- Remove excessive prefetches.

Change-Id: Ida7d3c4a3fe084600417f196baa26501c6e2d45a
2024-02-06 21:24:27 +00:00
Jonathan Wright de7883604f Init using 0-vector instead of load-broadcast in mem_neon.h
Initialise result vectors of mem_neon.h helpers with vdup_n_<type>(0)
instead of load-broadcast of the first loaded elements. The former is
more easily optimized by modern compilers.

Change-Id: If967e2bb55523670c3e433dd66d060665e13b4f2
2024-02-05 21:40:36 +00:00
Jonathan Wright a7a853c3a2 Remove stride width == 4 tests in mem_neon.h helpers
This condition is only ever true in unit tests. It does not benefit
real usage scenarios.

Change-Id: I0c1b09b0b371cfe99ba1e26aba57740a67434070
2024-02-05 21:40:36 +00:00
Jonathan Wright 4084250ccd Align intermediate buffers for 2D Neon convolutions
Align the intermediate buffers to 32 bytes and always use a stride of
64, regardless of the actual data block width.

Change-Id: I738eaa711168bc8231d8ac54d9e5e5e87b62e703
2024-02-05 21:40:36 +00:00
Zoltan Kuscsik c6a8fa27b7 Added documentation on PGO for optimization analysis
tools/README.pgo.md: documentation added

Bug: webm:1835
Change-Id: Iad72ca63fd143a1c36c7347f723578d11158e81b
2024-02-05 21:12:01 +00:00
Zoltan Kuscsik 7eec109a83 Add profile guided optimization support
Tested on x64/ARM64.

To generate a new profile
$ export CC=clang
$ export CXX=clang++
$ ./libvpx/configure --enable-encode_perf_tests --enable-profile

Using the profile:

$ make clean
$ llvm-profdata merge -o perf.profdata default_xxx_0.profraw
$ ./libvpx/configure --use-profile=perf.profdata

Bug: webm:1835
Change-Id: I8ab53fef1f8e2cc98c3b0f5c0f50eece5466965d
2024-02-05 21:12:01 +00:00
Gerda Zsejke More 58fb0f1d27 Add SVE implementation of vp9_block_error_fp
Add SVE implementation of vp9_block_error_fp function. Add the
corresponding tests as well.

Change-Id: I81f4b11bd2f1d0b9f377553bb9298d735308da30
2024-02-05 21:09:18 +00:00
Gerda Zsejke More a9d91d7a0a Add SVE implementation of vp9_block_error function
Add SVE implementation of vp9_block_error function. Add the
corresponding tests as well.

Change-Id: Iebba73ee845855be939e120326e1005237230c2a
2024-02-05 21:09:18 +00:00
Gerda Zsejke More 075569f3a5 Add SVE implementation of vpx_sum_squares_2d_i16
Add SVE implementation of vpx_sum_squares_2d_i16 function. Add the
corresponding test as well.

Change-Id: If3b31c9882e2b7bed0106011efb0bb5522de7008
2024-02-05 21:09:18 +00:00
Jerome Jiang 1258773dc2 Ext rc: Remove max_frame_size from frame decision
Add rdmult to the frame decision as RC can return this information, and
we may want to use it in the future.

Bug: b/323234722
Change-Id: I8ddb7038073d89af1ef84932448b1abaf1937cee
2024-02-05 14:21:29 +00:00
Wan-Teh Chang a9bd789d24 Delete #if USE_PARTIAL_COPY code
The USE_PARTIAL_COPY macro was added in
https://chromium-review.googlesource.com/c/webm/libvpx/+/51505 and
the location of #if USE_PARTIAL_COPY was slightly adjusted in
https://chromium-review.googlesource.com/c/webm/libvpx/+/73600.

Delete the unused function vp9_copy_and_extend_frame_with_rect().

Change-Id: I160b312177ba2fabbea2638172af37f8144d60b1
2024-02-02 18:16:09 +00:00
James Zern fecaf72c30 vp9_scale_and_extend_frame_ssse3: fix uv width/height
Use uv_crop_(width|height). This fixes an issue with 1 to 2 scaling from
1x1 where the unrounded value would go to zero, resulting in a heap
overflow. This path is only executed when the library is built without
--enable-vp9-highbitdepth.

Bug: b:319964497
Change-Id: I9cb6632f864ec54c045608af86aede20657d6253
(cherry picked from commit 7ad5f4f695)
2024-02-02 01:35:18 +00:00
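The rounding issue in the commit above can be illustrated with plain integer arithmetic. This sketch assumes 4:2:0 subsampling (a shift of 1) and assumes the cropped chroma size is the luma size rounded up; the function names are hypothetical and the actual SSSE3 code is not reproduced.

```c
#include <assert.h>

/* Chroma dimension rounded up: a 1-pixel luma dimension still has
 * 1 chroma sample. ss is the subsampling shift (1 for 4:2:0). */
int chroma_dim_rounded(int luma_dim, int ss) {
  assert(luma_dim > 0 && (ss == 0 || ss == 1));
  return (luma_dim + ss) >> ss;
}

/* Chroma dimension truncated: a 1-pixel luma dimension collapses to 0. */
int chroma_dim_truncated(int luma_dim, int ss) {
  assert(luma_dim > 0 && (ss == 0 || ss == 1));
  return luma_dim >> ss;
}
```

With a 1x1 source, the truncated form yields a zero-sized chroma plane, which is the kind of degenerate size that can lead to out-of-bounds accesses in a scaler.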
James Zern d10bdbcc40 encode_api_test,RandomPixelsVp8: fix stack overflow
Observed when built using Visual Studio 2019.

Move 720P image allocation to the heap.

Bug: webm:1831
Change-Id: I4e343af08d2f282618ad1b328a39d7dba5e79654
(cherry picked from commit 43e1c8bf10)
2024-02-02 01:35:04 +00:00
Marco Paniconi 4f94206a53 vp8: Fix to integer division by zero and overflow
This can happen in the setting of the frame
target size for delta frames, for non-CBR mode
(end_usage != USAGE_STREAM_FROM_SERVER) and with
temporal layers.

In calc_pframe_target_size(): the percent_high
(factor to adjust the target_size) may end up dividing
bits_off_target by total_byte_count. The total_byte_count
is defined per layer for temporal layers, so it will be zero
for delta frames if the enhancement layer has never been
encoded before.

Since percent_high is capped to over_shoot_pct, the proposed
fix is to apply this cap if total_byte_count is zero.
Also this CL fixes a few integer overflow issues in setting
the layer target_bandwidth, the rescale function, and in
setting target_bits_per_mb.

Unittest is added by Wan-Teh which triggers this issue.

Bug: chromium:1514684

Change-Id: I091158e720ece75d7ab9b7c4d18d30a5783102ab
(cherry picked from commit 43bd567950)
2024-02-02 01:34:49 +00:00
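The zero-divisor guard described in the commit above can be sketched as follows. This is not the encoder code: percent_high_sketch() is a hypothetical function, and the scaling of the ratio (percent of bits over bytes times 8) is an assumption made only so the example has concrete numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: compute an adjustment percentage from the buffer
 * surplus, falling back to the cap when no bytes have been counted yet. */
int percent_high_sketch(int64_t bits_off_target, int64_t total_byte_count,
                        int over_shoot_pct) {
  int64_t percent_high;
  assert(over_shoot_pct >= 0);
  if (total_byte_count <= 0) {
    /* Nothing encoded on this layer yet: apply the cap directly instead
     * of dividing by zero. */
    return over_shoot_pct;
  }
  /* Assumed scaling, for illustration only. */
  percent_high = (100 * bits_off_target) / (total_byte_count * 8);
  if (percent_high > over_shoot_pct) percent_high = over_shoot_pct;
  if (percent_high < 0) percent_high = 0;
  return (int)percent_high;
}
```

Doing the arithmetic in int64_t also avoids the intermediate overflow that a 32-bit product could hit for large buffer values.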
Marco Paniconi 9b913654e8 Fix to integer overflow in vp8 encodeframe.c
Unit test added.

Bug: webm:1831

Change-Id: Ib85f4f0fbdbebc0b49555f206a36376cea687df6
(cherry picked from commit 193b151195)
2024-02-02 01:34:23 +00:00
Wan-Teh Chang 105bc8ff18 Make encoder know frame size increase from config
Equivalent to the change to av1_change_config() in the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/182413.

Because we call alloc_compressor_data() only if
cm->mi_alloc_size < new_mi_size, this change won't cause
alloc_compressor_data() to be called unnecessarily, unlike the libaom
bug https://crbug.com/aomedia/3526.

Bug: b:317105128
Change-Id: I8a772a1d5c4766846641a6d541a6d861bf76c60f
(cherry picked from commit aef73b22cb)
2024-02-02 01:26:45 +00:00
Jonathan Wright c35f3e9e35 Cosmetic: Refactor Arm Neon i8mm convolution functions
Tweak some comments and remove forward declarations.

Change-Id: I6eb01621cee838f29981853ee1ef615947e05563
2024-02-02 00:24:06 +00:00
Jonathan Wright 224f2dc82a Refactor Arm Neon DotProd convolution functions
This change was intended to be cosmetic in that it tweaks some
comments, removes forward declarations and moves some constant
declarations into the kernels where they're used. However, it also
adds some performance for 8-tap vertical convolution paths as it
appears removing forward declarations also removes some false loop-
carried dependencies that the compiler wasn't able to figure out.

Change-Id: Ic58658b10fbe8378062920199819359d2df008de
2024-02-02 00:24:06 +00:00
Jerome Jiang 3b1039c822 Rewrite ext RC test
The updated test will validate the QP / frame type / ARF settings by the
rate controller and callbacks, making sure the callbacks are working as
expected.

Removed the old tests that verify the signals from the encoder, which
are not needed any more.

Change-Id: Ida3c484e2ac520f3e81358d7cbf7918abfdaca54
2024-02-01 12:29:35 -05:00
Jerome Jiang 9aefcb317a Ext RC: remove gop_info parameter
This parameter is no longer used.

Bug: b/323234722

Change-Id: I74bbed38a4a23f2aec8e05413754565e67437e9e
2024-01-31 21:08:07 -05:00
Jerome Jiang 91bc8ec56a vp9: Set VPX_FRAME_IS_INVISIBLE for no show frame
Detect if the frame is arf without parsing bitstream.

Change-Id: I3dd70369ef156508624c45591302b682b1785fa8
2024-01-31 21:07:07 -05:00
Jerome Jiang 861981b135 Allow external RC to control key frame
Disable some tests because they rely on vpx_rc_gop_info_t,
which isn't populated when the callback is used for key frames.

This parameter will be deleted / cleaned up in the follow-up.

Bug: b/323050877
Change-Id: If1c0476eac8d324c8d5a460bfc9afdb6d93aacdf
2024-01-31 22:17:06 +00:00
Jerome Jiang 56b67113d0 Move vp9_estimate_qp_gop to vp9_tpl_model.c
vp9_estimate_qp_gop is only used for TPL
Rename to vp9_estimate_tpl_qp_gop

Change-Id: I87246f72e90174bf3ba3bf8e1061f5f31edfddff
2024-01-30 19:49:31 -05:00
Jerome Jiang 8630b18323 Fix gf group index used in TPL pass for WebM RC
Change-Id: Ia781caf2de550015822ef5c954b36314ba8a2942
2024-01-30 23:19:00 +00:00
James Zern 7ad5f4f695 vp9_scale_and_extend_frame_ssse3: fix uv width/height
Use uv_crop_(width|height). This fixes an issue with 1 to 2 scaling from
1x1 where the unrounded value would go to zero, resulting in a heap
overflow. This path is only executed when the library is built without
--enable-vp9-highbitdepth.

Bug: b:319964497
Change-Id: I9cb6632f864ec54c045608af86aede20657d6253
2024-01-30 20:50:56 +00:00
James Zern ef09c2e1a4 vp9_encoder.c: make vp9_svc_twostage_scale static
Change-Id: Id7fa33df3f45e46968869ec15a6892c79aac263c
2024-01-30 20:50:33 +00:00
James Zern adac06ace7 vp9_scale_references: condense hbd #if
Change-Id: I6f1d85885b6bdced3925f86da0a60421a6058e91
2024-01-30 20:50:16 +00:00
Jonathan Wright 6cf6e1f082 Simplify Armv8.4 DotProd correction constant computation
Simplify the computation of the Armv8.4 DotProd convolution
correction constant. Summing 128 * filter_tap[0,7] is always the same
as 128 * 128 since the filter taps always sum to 128.

Change-Id: I227ba47ae47bed8304a695a2395bcc85f33c245c
2024-01-30 11:02:37 +00:00
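The identity in the commit above can be checked in scalar C. This sketch assumes the correction compensates for shifting unsigned 8-bit samples into signed range by subtracting 128 before a signed dot product; the function names and the example taps are illustrative (the taps were chosen only because they sum to 128), and no Neon intrinsics are used.

```c
#include <assert.h>
#include <stdint.h>

enum { kTaps = 8, kRangeOffset = 128 };

/* Example 8-tap kernel whose taps sum to 128. */
const int16_t kExampleFilter[kTaps] = { 0, 1, -5, 126, 8, -3, 1, 0 };

/* Correction computed the long way: sum of 128 * tap over all taps. */
int32_t correction_from_taps(const int16_t *filter) {
  int32_t sum = 0;
  for (int i = 0; i < kTaps; ++i) sum += kRangeOffset * filter[i];
  return sum;
}

/* Reference dot product on the original unsigned samples. */
int32_t dot_unsigned(const uint8_t *src, const int16_t *filter) {
  int32_t sum = 0;
  for (int i = 0; i < kTaps; ++i) sum += src[i] * filter[i];
  return sum;
}

/* Dot product on samples shifted into signed range, starting from the
 * correction so that the result matches the unsigned reference. */
int32_t dot_shifted(const uint8_t *src, const int16_t *filter,
                    int32_t correction) {
  int32_t sum = correction;
  for (int i = 0; i < kTaps; ++i) sum += (src[i] - kRangeOffset) * filter[i];
  return sum;
}
```

Since the per-filter sum always equals 128 * 128 when the taps sum to 128, the correction can be a compile-time constant rather than something recomputed for each filter.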
Jonathan Wright fd6b80b153 Move Neon dotprod and i8mm convolution kernels into .c files
Move the convolution kernels using Armv8.4 dotprod and Armv8.6 i8mm
instructions into the respective .c files. These kernels are only used
in the respective .c files so it isn't useful for them to be declared
in a header.

This change also removes the need for feature-macro guarding - which
wasn't being done correctly for MSVC (since Microsoft's Arm
architecture feature macros are named differently to those defined by
GNU-compliant compilers.)

Bug: webm:1838
Change-Id: I495fca2a982c34978b6c9102f144bb9c45352a9a
2024-01-29 15:46:06 +00:00
Jonathan Wright 189c135d5d Merge Arm Neon dotprod and i8mm convolution files
Move the Arm Neon dotprod and i8mm 2D convolution functions into the
appropriate vpx_convolve8_neon_[dotprod|i8mm].c file. Only the
Armv7/Armv8.0 Neon files needed to be split in this way to allow
linking against a handwritten assembly implementation of the kernels
for Armv7 builds.

Change-Id: Ifc363556c3961aa78b9e53761537d4816c5b9964
2024-01-29 15:43:50 +00:00
James Zern 433577ae31 Update third_party/libwebm to commit affd7f4
This is one commit after the libwebm-1.0.0.31 tag:
affd7f4 In MakeUID(), call rand() under #ifdef _WIN32

Change-Id: I5979a8cd3b064d4f4f0dbeca9f84f6791e593b47
2024-01-27 03:02:10 +00:00
Marco Paniconi 2edd69749f Pass the aligned width/height in lookahead_push
Also use the crop_width/height for the setting
of larger_dimensions.

Change-Id: I6b3a3e49944b17f7b51f0705d7a95c2a43255f8c
2024-01-26 20:19:40 +00:00
Jonathan Wright fed0dfe965 Allow SVE variance functions to be called from Neon subpel var
Call indirect RTCD high bitdepth variance functions (instead of the
Neon functions) in the high bitdepth Neon subpel variance paths so
that faster SVE variance functions can be used on CPUs where SVE is
supported.

Change-Id: I04bdef235afac06f2100df0cbaccfb8caef41ac7
2024-01-26 00:16:50 +00:00
Gerda Zsejke More 33ef1caf2f Add SVE implementation of HBD get<w>x<h>var functions
Add SVE implementation of get<w>x<h>var functions for 8-, 10-, 12- bit
depth. Add the corresponding tests as well.

Change-Id: Id4feb8726a3eb0a963e3dd8932ee52374a67da48
2024-01-25 20:58:29 +00:00
Gerda Zsejke More 0e23256348 Enable HBDGetVariance test for different implementations
Enable HBDGetVariance test for Neon and SSE2 implementations.

Change-Id: I77dcf0243784e79b21d956f3899903e2e13a545a
2024-01-25 20:58:29 +00:00
Gerda Zsejke More c43ec846f3 Enable GetVariance test for different implementations
Enable GetVariance test for Neon, Neon Dotprod, SSE2, MSA, VSX
implementations.

Change-Id: Ia6f42af5ef99ad1bc6319cdb46e6bd164f7eea94
2024-01-25 20:58:29 +00:00
Gerda Zsejke More 95d0fcae01 Add unit tests for vpx_get<w>x<h>var functions
Add standard and high bitdepth unit tests for vpx_get<w>x<h>var
functions. Enable these unit tests for the C implementation.

Change-Id: I8716fd6a9718dab3eef218a8a60a1efd4c0e316c
2024-01-25 20:58:29 +00:00
Wan-Teh Chang 989c393b2b Initialize members in VP8/VP9RateControlRTC ctors
Fix Coverity defects CID 1568604 and CID 1568615 (Uninitialized pointer
field). Since the constructors are private and the Create() factory
methods set the cpi_ pointer field, these two Coverity defects are
harmless.

Define the constructors with "= default" instead of "{}".

Change-Id: Ie6b45fce66c23941a9a5c38ee0bccbc4b7d3a2a2
2024-01-24 14:04:24 -08:00
Gerda Zsejke More d84436d533 Add SVE implementation of HBD variance functions
Add SVE implementation of variance functions for 8-, 10-, 12- bit
depth. Add the corresponding tests as well.

Change-Id: I785d85760ad4346cbfbf0f842784b4945870afee
2024-01-22 22:57:55 +00:00
James Zern 0f6a274964 vp9_ratectrl.c: add missing include for INTER_LAYER_PRED_ON
clears a clang-tidy warning

Change-Id: Ie5b8825ef645658304252b0d0554cdefa3de26c2
2024-01-22 20:21:06 +00:00
James Zern 43e1c8bf10 encode_api_test,RandomPixelsVp8: fix stack overflow
Observed when built using Visual Studio 2019.

Move 720P image allocation to the heap.

Bug: webm:1831
Change-Id: I4e343af08d2f282618ad1b328a39d7dba5e79654
2024-01-22 18:04:03 +00:00
Wan-Teh Chang eeb1be7f23 Support VPX_IMG_FMT_NV12 in vpx_img_read/write
read_yuv_frame() supports VPX_IMG_FMT_NV12. Port its code to
vpx_img_read() and vpx_img_write().

The code in vp9/simple_encode.cc, including img_read(), doesn't support
VPX_IMG_FMT_NV12. Check before the vpx_img_alloc() calls and abort the
process if the image format is VPX_IMG_FMT_NV12.

Bug: chromium:1510090
Change-Id: Ie77e29c2c9ee7a01e6a59c8ad3cbcc769d9f2d4c
2024-01-19 11:48:46 -08:00
Wan-Teh Chang d3a946de8c Make img_alloc_helper() fail on VPX_IMG_FMT_NONE
If fmt is VPX_IMG_FMT_NONE, currently img_alloc_helper() allocates a
single plane because VPX_IMG_FMT_NONE (0) is not a planar format (the
VPX_IMG_FMT_PLANAR bit is not set in VPX_IMG_FMT_NONE).

Although this seems correct, the problem is that most of the code in
libvpx assumes planar formats and is likely to dereference a null
pointer when it uses img->planes[1]. Also, VPX_IMG_FMT_NONE isn't really
a valid image format. So it is safer to make img_alloc_helper() fail if
fmt is VPX_IMG_FMT_NONE.

Change-Id: I05b47f4b5eceb631a02384b2cce1c2f6fdca8673
2024-01-19 19:43:26 +00:00
James Zern db6a5c09ce README: remove library version
This often falls out of sync with the release and the version is already
contained in CHANGELOG.

Bug: webm:1833
Change-Id: Ieee6ca40249bf6e77037fbec30d87b109ca8fe21
2024-01-18 15:44:19 -08:00
Jerome Jiang 2dead7118a Merge tag 'v1.14.0'
Release v1.14.0 Venetian Duck

2024-01-18 v1.14.0 "Venetian Duck"

  This release drops support for old C compilers, such as Visual Studio 2012
  and older, that disallow mixing variable declarations and statements (a C99
  feature). It adds support for run-time CPU feature detection for Arm
  platforms, as well as support for darwin23 (macOS 14).

  - Upgrading:
    This release is ABI incompatible with the previous release.

    Various new features for rate control library for real-time: SVC parallel
    encoding, loopfilter level, support for frame dropping, and screen content.

    New callback function send_tpl_gop_stats for vp9 external rate control
    library, which can be used to transmit TPL stats for a group of pictures. A
    public header vpx_tpl.h is added for the definition of TPL stats used in
    this callback.

    libwebm is upgraded to libwebm-1.0.0.29-9-g1930e3c.

  - Enhancement:
    Improvements on Neon optimizations: VoD: 12-35% speed up for bitdepth 8,
    68%-151% speed up for high bitdepth.
    Improvements on AVX2 and SSE optimizations.
    Improvements on LSX optimizations for LoongArch.
    42-49% speedup on speed 0 VoD encoding.
    Android API level predicates.

  - Bug fixes:
    Fix to missing prototypes from the rtcd header.
    Fix to segfault when total size is enlarged but width is smaller.
    Fix to the build for arm64ec using MSVC.
    Fix to copy BLOCK_8X8's mi to PICK_MODE_CONTEXT::mic.
    Fix to -Wshadow warnings.
    Fix to heap overflow in vpx_get4x4sse_cs_neon.
    Fix to buffer overrun in highbd Neon subpel variance filters.
    Added bitexact encode test script.
    Fix to -Wl,-z,defs with Clang's sanitizers.
    Fix to decoder stability after error & continued decoding.
    Fix to mismatch of VP9 encode with NEON intrinsics with C only version.
    Fix to Arm64 MSVC compile vpx_highbd_fdct4x4_neon.
    Fix to fragments count before use.
    Fix to a case where target bandwidth is 0 for SVC.
    Fix mask in vp9_quantize_avx2,highbd_get_max_lane_eob.
    Fix to int overflow in vp9_calc_pframe_target_size_one_pass_cbr.
    Fix to integer overflow in vp8,ratectrl.c.
    Fix to integer overflow in vp9 svc.
    Fix to avg_frame_bandwidth overflow.
    Fix to per frame qp for temporal layers.
    Fix to unsigned integer overflow in sse computation.
    Fix to uninitialized mesh feature for BEST mode.
    Fix to overflow in highbd temporal_filter.
    Fix to unaligned loads w/w==4 in vpx_convolve_copy_neon.
    Skip arm64_neon.h workaround w/VS >= 2019.
    Fix to c vs avx mismatch of diamond_search_sad().
    Fix to c vs intrinsic mismatch of vpx_hadamard_32x32() function.
    Fix to a bug in vpx_hadamard_32x32_neon().
    Fix to Clang -Wunreachable-code-aggressive warnings.
    Fix to a bug in vpx_highbd_hadamard_32x32_neon().
    Fix to -Wunreachable-code in mfqe_partition.
    Force mode search on 64x64 if no mode is selected.
    Fix to ubsan failure caused by left shift of negative.
    Fix to integer overflow in calc_pframe_target_size.
    Fix to float-cast-overflow in vp8_change_config().
    Fix to a null ptr before use.
    Conditionally skip using inter frames in speed features.
    Remove invalid reference frames.
    Disable intra mode search speed features conditionally.
    Set nonrd keyframe under dynamic change of deadline for rtc.
    Fix to scaled reference offsets.
    Set skip_recode=0 in nonrd_pick_sb_modes.
    Fix to an edge case when downsizing to one.
    Fix to a bug in frame scaling.
    Fix to pred buffer stride.
    Fix to a bug in simple motion search.
    Update frame size in actual encoding.

Change-Id: I9c27fb2b917f9b80ed4bcc5cb3b4f87c56b62c2f
2024-01-18 12:33:56 -05:00
Gerda Zsejke More 71a5cb6e8a Add SVE implementation of HBD MSE functions
Add SVE implementation of MSE functions for 10-, 12- bit depth. Add
the corresponding tests as well.

An implementation was not added for 8 bit depth as the Neon DotProd
version is faster than the SVE implementation.

Change-Id: I0c5712ba2735a2879a0aa3a9a52980032fddc7a6
2024-01-17 21:38:54 +00:00
Marco Paniconi b95d175726 vp9-rtc: Fix to reset on scene change for temporal layers
Revert the part of the fix regarding temporal layers in:
https://chromium-review.googlesource.com/c/webm/libvpx/+/5191480

Keep it as it was for now until further testing.

Change-Id: If747aabf907ba93cc20bcc067d2ca8f7758a91dd
2024-01-17 18:33:32 +00:00
Gerda Zsejke More e001eeb5bc Enable Neon Dotprod impl for HBD MSE
Enable Neon Dotprod 8-bit high bitdepth implementation for MSE
function as it is no longer called with bit depth 10 or 12.

Bug: webm:1819
Change-Id: I9d1d506401aa0523fba2d8ea4978dc00fdacbb95
2024-01-17 18:15:17 +00:00
Gerda Zsejke More 41e0655e5e Fix highbd_get_block_variance_fn input parameter
Instead of always calling highbd_get_block_variance_fn with bit depth
8 use the macroblock's bit depth.

Bug: webm:1819
Change-Id: Ib4b19703384e897ee9ffeef73a11a8af2d262558
2024-01-17 18:15:17 +00:00
Marco Paniconi 25f03e456f vp9-svc: Fix to sample encoder for mismatch check
Don't check for mismatch for superframes whose
top spatial layer resolution was dropped.

Change-Id: I0abef43a710f0fb52ba2490fc784e57cda9952a0
2024-01-16 19:49:06 +00:00
Marco Paniconi cc306fac74 vp9-svc: Fix to max-q on scene change for svc
For svc with no inter-layer prediction: reset
the RC and force max_qp on all spatial layers
on scene/slide changes. In the current code it was only
reset on current spatial layer because it was assumed
we can predict off lower spatial layer to avoid
prediction across scene change. But this does not apply
when inter-layer prediction is off on delta frames.

Also reset only up to current temporal layer.
Because of the hierarchical prediction structure
only the lower temporal layers need the RC to be reset.

This helps to reduce excessive frame drops for the
full_superframe_drop mode.

Change-Id: I76925681850b82aa7fff7f9b1c1a0a605cf3cf3b
2024-01-16 15:49:10 +00:00
Jerome Jiang 8aeb5848a5 Do not use TPL QP from RC for final encoding
Bug: b/316610379
Change-Id: Ie7c6f8be0132602155102a72a16b2ee94c1c3dbd
2024-01-12 21:26:42 +00:00
James Zern 0eecce72b2 vp9_quantize.c: add missing include for get_msb()
clears a clang-tidy warning

Change-Id: Iaf58775084e758246a8fe0a4828ae954ea95f5b1
2024-01-11 16:33:53 -08:00
James Zern 0a91e18eca vp8_datarate_test.cc: add missing include
for VPX_CODEC_USE_PSNR. This clears a clang-tidy warning. vpx_encoder.h
exports vpx_codec.h so it shouldn't be necessary.

Change-Id: I863b6f8689eeef59cd9eadf3cdc177247a0653f8
2024-01-11 16:30:21 -08:00
Marco Paniconi 43bd567950 vp8: Fix to integer division by zero and overflow
This can happen in the setting of the frame
target size for delta frames, for non-CBR mode
(end_usage != USAGE_STREAM_FROM_SERVER) and with
temporal layers.

In calc_pframe_target_size(): the percent_high
(factor to adjust the target_size) may end up dividing
bits_off_target by total_byte_count. The total_byte_count
is defined per layer for temporal layers, so it will be zero
for delta frames if the enhancement layer has never been
encoded before.

Since percent_high is capped to over_shoot_pct, the proposed
fix is to apply this cap if total_byte_count is zero.
Also this CL fixes a few integer overflow issues in setting
the layer target_bandwidth, the rescale function, and in
setting target_bits_per_mb.
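
The zero-divisor guard described above can be sketched as follows. This is a minimal illustration and not the libvpx code: `percent_high_sketch`, its parameters, and the bytes-to-bits scaling are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how far the buffer is over target, as a percentage,
 * capped at over_shoot_pct. When total_byte_count is zero (for example, an
 * enhancement layer that has never been encoded), the cap is returned
 * instead of dividing by zero. */
int percent_high_sketch(int64_t bits_off_target, int64_t total_byte_count,
                        int over_shoot_pct) {
  assert(over_shoot_pct >= 0);
  if (total_byte_count <= 0) return over_shoot_pct;

  /* 64-bit arithmetic keeps the intermediate product from overflowing int.
   * The factor 8 converts the byte count to bits (an assumption here). */
  int64_t pct = (100 * bits_off_target) / (total_byte_count * 8);
  if (pct < 0) pct = 0;
  if (pct > over_shoot_pct) pct = over_shoot_pct;
  return (int)pct;
}
```

Returning the cap for a zero count matches the message's reasoning: the value would have been clamped to over_shoot_pct anyway, so the guard changes no other case.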

A unit test that triggers this issue was added by Wan-Teh.

Bug: chromium:1514684

Change-Id: I091158e720ece75d7ab9b7c4d18d30a5783102ab
2024-01-11 23:05:47 +00:00
Gerda Zsejke More aeb4928c68 Add SVE 16-bit dot-product helper functions
Add header file containing helper functions to make use of SVE
dot-product intrinsics via the Neon-SVE bridge.

Change-Id: I6cd198f8202559672817cbc19f890db35c03d3ff
2024-01-11 19:26:52 +00:00
Salome Thirot 0801bfca3f Add -flax-vector-conversions=none to Clang builds
GCC already does not allow implicit vector type conversions by default,
add -flax-vector-conversions=none to Clang builds to have the same
behavior.

Change-Id: I9d1adb836377077cf48818c80fe71025e2d2bdc7
2024-01-11 18:26:28 +00:00
Zoltan Kuscsik e03c9d2a62 Update of get_files.py to support Python3.x
Change-Id: I92aeb2a060338bdfc0083602b837b99181a8421c
2024-01-11 18:25:39 +00:00
Gerda Zsejke More 6ea3b51ec2 Require Arm Neon-SVE bridge header for enabling SVE
Disable SVE feature if arm_neon_sve_bridge header is not supported
by the compiler.

Change-Id: I3f78be2dd95b37b8d51b9f1fceca1f9701535eca
2024-01-11 18:23:46 +00:00
Marco Paniconi 756b29a776 vp8: Fix to race issue for multi-thread with psnr_calc
Added a unit test which triggers the data race in the
bug below, when only C code is forced.

The data race is between the loopfilter and variance
computation from generate_psnr_packet calculation.
Proposed fix is to move the wait for loopfilter thread to
finish up before entering generate_psnr_packet().

Bug: b/266833179.

Change-Id: Id2871c53274be0f404e65601c9a5c98aaead0c72
2024-01-11 00:07:08 +00:00
Wan-Teh Chang aef73b22cb Make encoder know frame size increase from config
Equivalent to the change to av1_change_config() in the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/182413.

Because we call alloc_compressor_data() only if
cm->mi_alloc_size < new_mi_size, this change won't cause
alloc_compressor_data() to be called unnecessarily, unlike the libaom
bug https://crbug.com/aomedia/3526.

Bug: b:317105128
Change-Id: I8a772a1d5c4766846641a6d541a6d861bf76c60f
2024-01-10 21:42:45 +00:00
Wan-Teh Chang c5f8086709 Move VPX_TPL_ABI_VERSION to the ext RC ABI version
The VpxTpl* structs defined in vpx_tpl.h are only used by the external
rate control library. Add a VPX_TPL_ABI_VERSION component to
VPX_EXT_RATECTRL_ABI_VERSION and remove the VPX_TPL_ABI_VERSION
component from VPX_ENCODER_ABI_VERSION.

The current value of VPX_TPL_ABI_VERSION is 2. It is subtracted from
VPX_EXT_RATECTRL_ABI_VERSION and added to VPX_ENCODER_ABI_VERSION so
that the values of those two macros stay the same.
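
The arithmetic behind "the values stay the same" can be checked with a small sketch. All base values below are hypothetical placeholders; only the TPL value of 2 comes from the message above, and the function names are invented for illustration.

```c
#include <assert.h>

/* Hypothetical base values, chosen only for illustration. */
enum {
  SKETCH_CODEC_ABI = 4,
  SKETCH_TPL_ABI = 2, /* stated in the commit message */
  SKETCH_RC_BASE = 7,
  SKETCH_ENC_BASE = 16
};

/* Before: the TPL component is part of the encoder ABI version. */
int ext_rc_abi_before(void) { return SKETCH_RC_BASE; }
int encoder_abi_before(void) {
  return SKETCH_ENC_BASE + SKETCH_TPL_ABI + SKETCH_CODEC_ABI +
         ext_rc_abi_before();
}

/* After: the TPL component moves to the ext RC version. The ext RC base
 * drops by 2 and the encoder base rises by 2 to compensate. */
int ext_rc_abi_after(void) { return (SKETCH_RC_BASE - 2) + SKETCH_TPL_ABI; }
int encoder_abi_after(void) {
  return (SKETCH_ENC_BASE + 2) + SKETCH_CODEC_ABI + ext_rc_abi_after();
}

/* Both macros keep their numeric value across the move. */
void check_abi_move_sketch(void) {
  assert(ext_rc_abi_before() == ext_rc_abi_after());
  assert(encoder_abi_before() == encoder_abi_after());
}
```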

Add a note to explain why VPX_ENCODER_ABI_VERSION has a
VPX_EXT_RATECTRL_ABI_VERSION component.

Change-Id: I680b8522dc04328cd51df6de590fdec75ca88ae8
2024-01-09 21:18:26 +00:00
Jerome Jiang 602e2e8979 Fix a typo in changelog for v1.14.0
Bug: webm:1833
Change-Id: I8133f678cf4231f2d048c61e42622b883897712a
2024-01-09 16:12:22 -05:00
Hari Limaye 469150a922 configure: add -arch flag when targeting darwin23
Commit db83435 introduced support for configuring for *-darwin23-gcc.
However configuring for *-darwin23-gcc does not currently add the
`-arch` flag to CFLAGS/LDFLAGS, so correct this here.

Change-Id: Ieeda1a5039ad40590dfcdcc6ba615a1d1697d54d
2024-01-09 20:38:58 +00:00
Jerome Jiang 2b1f6859f6 Update version
Before release:
c-a=8, a=0, r=1 -> c=8, a=0, r=1

After release:
 - If the library source code has changed at all since the last
   update, then increment revision:
   c=8, a=0, r=r+1=2

 - If any interfaces have been added, removed, or changed since
   the last update, increment current, and set revision to 0:
   c=c+1=9, a=0, r=0

 - If any interfaces have been added since the last public release,
   then increment age:
   c=9, a=a+1=1, r=0

 - If any interfaces have been removed or changed since the last
   public release, then set age to 0:
   c=9, a=0, r=0 (VpxTpl* structure changes)

MAJOR=c-a=9
MINOR=a=0
PATCH=r=0
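
The four update rules applied above can be written as a short sketch. The struct and function names are hypothetical; the rules themselves are the ones quoted in this message.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
  int current;  /* c */
  int age;      /* a */
  int revision; /* r */
} LibtoolVersionSketch;

/* Applies the rules in the order listed in the commit message. */
LibtoolVersionSketch libtool_update_sketch(LibtoolVersionSketch v,
                                           bool source_changed,
                                           bool ifaces_touched,
                                           bool ifaces_added,
                                           bool ifaces_removed_or_changed) {
  assert(v.age >= 0 && v.revision >= 0 && v.current >= v.age);
  if (source_changed) v.revision += 1;
  if (ifaces_touched) { /* added, removed, or changed */
    v.current += 1;
    v.revision = 0;
  }
  if (ifaces_added) v.age += 1;
  if (ifaces_removed_or_changed) v.age = 0;
  return v;
}

int version_major_sketch(LibtoolVersionSketch v) { return v.current - v.age; }
int version_minor_sketch(LibtoolVersionSketch v) { return v.age; }
int version_patch_sketch(LibtoolVersionSketch v) { return v.revision; }
```

Starting from c=8, a=0, r=1 with all four conditions true reproduces the trace above: c=9, a=0, r=0, so MAJOR=9, MINOR=0, PATCH=0.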

Bug: webm:1833
Change-Id: Id24c9a0ff415a6f625d17b6098cdd0baf27432e3
2024-01-09 09:18:35 -05:00
Jerome Jiang e32df98af5 Update changelog with vp9 ext rc
Bug: webm:1833
Change-Id: I7e6e1da7965f098c8b62c55a09619d0bf703b516
2024-01-04 15:23:38 -05:00
Jerome Jiang 3b3e8b5f29 vp9 ext rc: if->assert, more comment for TPL ctrl
Change if to assertion in vp9_extrc_get_encodeframe_decision

Clarify comment for VP9E_ENABLE_EXTERNAL_RC_TPL that
rc_type | VPX_RC_QP must be non zero for this control to work.

Change-Id: I2c54cf7eda1f0f12f4ff7ac929e8e6a1fdd2215d
2024-01-04 13:49:03 -05:00
Jerome Jiang 1474e3c729 Return error if TPL related interface isn't set
Bug: b/316610379
Change-Id: I391d9ef308a1c2d763b124e451ebb22a05060102
2024-01-03 11:11:41 -05:00
Jerome Jiang 22b17dc3fb Update changelog
Bug: webm:1833
Change-Id: I90ffd457cafe705a040f9a63b870da66c076076e
2024-01-02 16:05:38 -05:00
Jerome Jiang f6b7166a2b Clear some clang-tidy complaints
Change-Id: I749c0b2b97f26923fc5e1c1e46a1c017cf25823f
2024-01-02 15:08:06 -05:00
Jerome Jiang bd78034078 Add codec ctrl to control TPL by external RC
Bug: b/316610379
Change-Id: Ic18aad8da35436b3de81b9ddf9359407da522701
2024-01-02 15:24:00 +00:00
Matt Oliver 33ecc6cc5f project: Update for 1.13.1 merge. 2023-12-24 00:07:29 +11:00
Matt Oliver a5f55b5c6b Merge commit '10b9492dcf05b652e2e4b370e205bd605d421972' 2023-12-24 00:01:10 +11:00
Zoltan Kuscsik 655da33b89 Use get_lsb in vp9 and vp8 invert_quant functions
Performance optimization. get_msb utilizes
the compiler/platform specific most significant bit
operator.

Note: 32 bit unsigned assumed, like all get_msb implementations do.
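
For illustration, a shift loop and a builtin-based version of the same computation might look like this. Both function names are hypothetical, the use of `__builtin_clz` is an assumption about GCC/Clang builds, and a 32-bit unsigned int is assumed, as in the note above.

```c
#include <assert.h>

/* Loop form: counts shifts until one bit remains, i.e. floor(log2(n)). */
int msb_loop_sketch(unsigned int n) {
  assert(n != 0);
  int l = 0;
  while (n > 1) {
    n >>= 1;
    ++l;
  }
  return l;
}

/* Builtin form: index of the highest set bit, assuming 32-bit unsigned. */
int msb_builtin_sketch(unsigned int n) {
  assert(n != 0);
#if defined(__GNUC__) || defined(__clang__)
  return 31 ^ __builtin_clz(n);
#else
  return msb_loop_sketch(n); /* portable fallback */
#endif
}
```

Both forms are undefined for n == 0, which is why the sketch asserts on it.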
Change-Id: Ib013ad24aa0ea845efeb52aacd448b067edf91da
2023-12-21 14:23:04 +01:00
Jerome Jiang b8b6b4d4cc Remove VP9E_GET_TPL_STATS
This is never used.
A callback in external rc func was added and used instead.

Change-Id: Iade6f361072f0c28af98904baf457d2f0e9ca904
(cherry picked from commit 41ced868a6)
2023-12-18 12:23:32 -05:00
Jerome Jiang 41ced868a6 Remove VP9E_GET_TPL_STATS
This is never used.
A callback in external rc func was added and used instead.

Change-Id: Iade6f361072f0c28af98904baf457d2f0e9ca904
2023-12-16 20:12:34 +00:00
Jerome Jiang d0e2c30aa4 Update AUTHORS and .mailmap
Bug: webm:1833
Change-Id: I4569b9dc1ec1c70458120bebc45b2c963796ed87
2023-12-14 23:52:30 +00:00
Hari Limaye 0c2314d82e configure: add -arch flag when targeting darwin23
Commit db83435 introduced support for configuring for *-darwin23-gcc.
However configuring for *-darwin23-gcc does not currently add the
`-arch` flag to CFLAGS/LDFLAGS, so correct this here.

Change-Id: Ieeda1a5039ad40590dfcdcc6ba615a1d1697d54d
2023-12-14 20:23:26 +00:00
Wan-Teh Chang df655cf4fb Clarify the comment for update_error_state()
Explain why the encoder init functions cannot call update_error_state().

In vp8/vp8_cx_iface.c, this comment should have been added in
https://chromium-review.googlesource.com/c/webm/libvpx/+/4506609.

Rewrite update_error_state() in vp8/vp8_cx_iface.c to look like the
versions in vp9/vp9_cx_iface.c and av1/av1_cx_iface.c (in libaom).

Change-Id: I3f153d67b8c549ca5ac8ea0cfbcaad4ae705c8e6
2023-12-13 20:46:29 +00:00
Wan-Teh Chang 4fe07a0c41 Return correct error after longjmp in vp8e_encode
After a longjmp() call in vp8e_encode(), call update_error_state() so
that we return the error code and error detail set by the
vpx_internal_error() call.
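
The general pattern, recording an error before a longjmp() and reporting it after the jump lands, can be sketched as below. Every name here is hypothetical and the error code 7 is an arbitrary placeholder; this is not the vp8e_encode() implementation.

```c
#include <assert.h>
#include <setjmp.h>
#include <string.h>

enum { SKETCH_OK = 0, SKETCH_ERR = 7 };

typedef struct {
  int error_code;
  char detail[64];
  int setjmp_armed;
  jmp_buf jmp;
} ErrorInfoSketch;

/* Records the error, then jumps back to the armed setjmp() site. */
void internal_error_sketch(ErrorInfoSketch *info, int code, const char *msg) {
  assert(info != NULL && msg != NULL);
  info->error_code = code;
  strncpy(info->detail, msg, sizeof(info->detail) - 1);
  info->detail[sizeof(info->detail) - 1] = '\0';
  if (info->setjmp_armed) longjmp(info->jmp, 1);
}

/* Returns the recorded error code if the work below bails out. */
int encode_sketch(ErrorInfoSketch *info, int fail) {
  info->error_code = SKETCH_OK;
  info->detail[0] = '\0';
  if (setjmp(info->jmp)) {
    /* Arrived via longjmp(): report what internal_error_sketch() stored. */
    info->setjmp_armed = 0;
    return info->error_code;
  }
  info->setjmp_armed = 1;
  if (fail) internal_error_sketch(info, SKETCH_ERR, "simulated failure");
  info->setjmp_armed = 0;
  return SKETCH_OK;
}
```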

Change-Id: I1f2428eb1b1f61e46c02604e16a5d44dcf162479
2023-12-13 10:56:28 -08:00
Marco Paniconi 193b151195 Fix to integer overflow in vp8 encodeframe.c
Unit test added.

Bug:webm:1831

Change-Id: Ib85f4f0fbdbebc0b49555f206a36376cea687df6
2023-12-12 22:49:46 +00:00
Hari Limaye a75859c439 Remove redundant comment in convolve8_4_usdot
The function convolve8_4_usdot contains a comment relating to the
SDOT implementation of convolve8, which requires addition of a
correction constant to account for range clamp of the input values.

This is not performed in the i8mm USDOT implementation - so remove the
comment.

Also add some const qualifiers to function arguments.

Change-Id: I10aff560d20403897f708ee293bf873be9c35761
2023-12-12 20:21:38 +00:00
James Zern 7be2dadc76 README: update target list
Change-Id: I001179ce34b2bf2350dce5f0197b6be175ab1c37
(cherry picked from commit f9b7c85768)
2023-12-11 21:24:10 +00:00
Wan-Teh Chang 7e735cdf43 IWYU: include vp9_scale.h and vpx_codec.h
Fix the following clang-tidy misc-include-cleaner warnings:
vp9/encoder/vp9_encoder.c:
  no header providing "vp9_is_valid_scale" is directly included
  no header providing "VPX_CODEC_CORRUPT_FRAME" is directly included
vp9/vp9_cx_iface.c:
  no header providing "valid_ref_frame_size" is directly included

Change-Id: I20e846f5b14c42c72aaefec0718b4ae9c7eea44a
2023-12-09 00:15:40 +00:00
Wan-Teh Chang 3a88c0c204 Avoid dangling pointers in vp9_encode_free_mt_data
Set cpi->tile_thr_data and cpi->workers to NULL after freeing them.
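
A minimal sketch of the free-then-NULL idiom, using a hypothetical stand-in struct and not the real encoder context:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the encoder context. */
typedef struct {
  int *tile_thr_data;
  int *workers;
  int num_workers;
} EncoderSketch;

/* Frees the multi-thread data and clears the pointers so a later call,
 * or a later NULL check, does not touch freed memory. */
void free_mt_data_sketch(EncoderSketch *cpi) {
  assert(cpi != NULL);
  free(cpi->tile_thr_data);
  cpi->tile_thr_data = NULL;
  free(cpi->workers);
  cpi->workers = NULL;
  cpi->num_workers = 0;
}
```

Because free(NULL) is a no-op, calling the function a second time is harmless once the pointers are cleared.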

Change-Id: I46fec5f08a6dd034c8d76828f4d546630442f216
2023-12-08 14:39:18 -08:00
Cheng Chen 6bb806b177 Update frame size in actual encoding
Issue explanation:
The unit test calls set_config function twice after encoding the
first frame.
The first call of set_config reduces frame width, but is still within
half of the first frame.
The second call reduces frame width even more, making it less than
half of the first frame, which according to the encoder logic
means there are no valid ref frames, and this frame should be set as a
forced keyframe. This leads to null pointer access in scale_factors
later.

Solution:
To ensure that a forced key frame is detected correctly,
we need to update the frame width and height only when the actual
encoding is performed.

Bug: b/311985118

Change-Id: Ie2cd3b760d4a4b399845693d7421c4eb11a12775
(cherry picked from commit 1ed56a46b3)
2023-12-08 21:34:57 +00:00
Yunqing Wang 75d7727f58 Fix a bug in simple motion search
This change fixed a bug revealed by b/311294795.
In simple motion search, the reference buffer pointer needs to be
restored after the search. Otherwise, it causes problems while the
reference frame scaling happens. This CL fixes the bug.

Bug: b/311294795
Change-Id: I093722d5888de3cc6a6542de82a6ec9d601f897d
(cherry picked from commit 50ed636e49)
2023-12-08 19:29:46 +00:00
Jerome Jiang 36b2dec5ee Set pred buffer stride correctly
Bug: b/312875957
Change-Id: I2eb5ab86d5fe30079b3ed1cbdb8b45bb2dc72a1d
(cherry picked from commit 585798f756)
2023-12-08 19:29:46 +00:00
Bohan Li eba5ceb9d1 Improve test comments.
Change-Id: I42dddb946193e30cf07e39b43eaad051c5da479a
(cherry picked from commit 9ad598f249)
2023-12-08 02:12:17 +00:00
Marco Paniconi c884b2e60e Add unittest for issue b/314857577
Bug: b/314857577

Change-Id: I591036c1ad3362023686d395adb4783c51baa62d
(cherry picked from commit 12e928cb34)
2023-12-08 01:45:50 +00:00
Wan-Teh Chang 6fca4de48e Remove SSE code for 128x* blocks
The maximum block size is 64x64 in VP9.

Bug: webm:1819
Change-Id: If9802be9f81b51dbcdbc8a68d5afe48ca6d3d0e7
(cherry picked from commit c4c9208054)
2023-12-08 00:34:23 +00:00
Wan-Teh Chang 0d5811e4ef Use vpx_sse instead of vpx_mse to compute SSE
Use vpx_sse and vpx_highbd_sse instead of vpx_mse16x16 and
vpx_highbd_8_mse16x16 respectively to compute SSE for PSNR
calculations. This solves an issue whereby vpx_highbd_8_mse16x16
was being used to calculate SSE for 10- and 12-bit input.
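
As a rough illustration of a bit-depth-agnostic sum of squared errors, the following sketch accumulates into 64 bits for both 8-bit and 16-bit samples. The function names and signatures are hypothetical and are not the vpx_sse / vpx_highbd_sse prototypes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of squared differences over a width x height block of 8-bit samples. */
int64_t sse_sketch(const uint8_t *a, int a_stride, const uint8_t *b,
                   int b_stride, int width, int height) {
  assert(a != NULL && b != NULL && width > 0 && height > 0);
  int64_t sse = 0;
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      const int diff = a[x] - b[x];
      sse += (int64_t)diff * diff;
    }
    a += a_stride;
    b += b_stride;
  }
  return sse;
}

/* Same computation for 16-bit storage, which covers 10- and 12-bit input. */
int64_t highbd_sse_sketch(const uint16_t *a, int a_stride, const uint16_t *b,
                          int b_stride, int width, int height) {
  assert(a != NULL && b != NULL && width > 0 && height > 0);
  int64_t sse = 0;
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      const int32_t diff = (int32_t)a[x] - (int32_t)b[x];
      sse += (int64_t)diff * diff;
    }
    a += a_stride;
    b += b_stride;
  }
  return sse;
}
```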

This is a port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/175063
by Jonathan Wright <jonathan.wright@arm.com>.

Bug: webm:1819
Change-Id: I37e3ac72835e67ccb44ac89a4ed16df62c2169a7
(cherry picked from commit 7dfe343199)
2023-12-08 00:34:23 +00:00
James Zern c64a85d25a vp9_frame_scale.c,cosmetics: funnction -> function
Change-Id: I8ecbd52037ff096f5c84c834b193b0a34c55a8b7
(cherry picked from commit 2f258fdee1)
2023-12-07 20:45:55 +00:00
Jerome Jiang fa60c7d9c1 IWYU: include yv12config.h for YV12_BUFFER_CONFIG
Fix clang-tidy warning

Change-Id: Ic4d6739cb933a37168176f6b481afdfd2562acfc
2023-12-07 12:35:30 -05:00
Cheng Chen 1ed56a46b3 Update frame size in actual encoding
Issue explanation:
The unit test calls set_config function twice after encoding the
first frame.
The first call of set_config reduces frame width, but is still within
half of the first frame.
The second call reduces frame width even more, making it less than
half of the first frame, which according to the encoder logic
means there are no valid ref frames, and this frame should be set as a
forced keyframe. This leads to null pointer access in scale_factors
later.

Solution:
To ensure that a forced key frame is detected correctly,
we need to update the frame width and height only when the actual
encoding is performed.

Bug: b/311985118

Change-Id: Ie2cd3b760d4a4b399845693d7421c4eb11a12775
2023-12-06 21:46:54 -08:00
Yunqing Wang 50ed636e49 Fix a bug in simple motion search
This change fixed a bug revealed by b/311294795.
In simple motion search, the reference buffer pointer needs to be
restored after the search. Otherwise, it causes problems while the
reference frame scaling happens. This CL fixes the bug.

Bug: b/311294795
Change-Id: I093722d5888de3cc6a6542de82a6ec9d601f897d
2023-12-06 16:50:33 -08:00
Wan-Teh Chang c3b821fd4a Add the needed Android API level predicates.
fseeko and ftello are available on Android only from API level 24. Add
the needed guards for these functions.

Suggested by Yifan Yang.

Change-Id: I3a6721d31e1d961ab10b434ea6e92959bd5a70ab
(cherry picked from commit bf07554183)
2023-12-06 18:57:20 -05:00
Jerome Jiang 585798f756 Set pred buffer stride correctly
Bug: b/312875957
Change-Id: I2eb5ab86d5fe30079b3ed1cbdb8b45bb2dc72a1d
2023-12-06 23:54:31 +00:00
Yunqing Wang ebca0ab6fa Fix a bug in frame scaling
This change fixed a corner case bug revealed by b/311394513.
During the frame scaling, vpx_highbd_convolve8() and vpx_scaled_2d()
require that both x_step_q4 and y_step_q4 be less than or equal to a
defined value. Otherwise, the code needs to call
vp9_scale_and_extend_frame_nonnormative(), which supports arbitrary scaling.

The fix was done in LBD and HBD functions.

Bug: b/311394513
Change-Id: Id0d34e7910ec98859030ef968ac19331488046d4
(cherry picked from commit 8bf3649d41)
2023-12-06 16:22:35 -05:00
Bohan Li ffd533161a Fix edge case when downsizing to one.
BUG: b/310329177
Change-Id: I2ebf4165adbc7351d6cc73554827812dedc4d362
(cherry picked from commit a9f1bfdb8e)
2023-12-06 16:22:24 -05:00
Angie Chiang 5d49fa1f01 Set skip_recode=0 in nonrd_pick_sb_modes
Need to set skip_recode properly so that
vp9_encode_block_intra() can work properly when it is
called by block_rd_txfm(). We can not skip "recode" because
it is still at the rd search stage.

Bug: b/310340241
Change-Id: I7d7600ef72addd341636549c2dad1868ad90e1cb
(cherry picked from commit f10481dc0a)
2023-12-06 16:22:10 -05:00
James Zern 2f258fdee1 vp9_frame_scale.c,cosmetics: funnction -> function
Change-Id: I8ecbd52037ff096f5c84c834b193b0a34c55a8b7
2023-12-06 20:45:12 +00:00
Wan-Teh Chang 476d02a2d2 Fix two clang-tidy misc-include-cleaner warnings
no header providing "CONFIG_VP9_HIGHBITDEPTH" is directly included
no header providing "VPX_BITS_8" is directly included

Change-Id: Ie6d78c79ab462501417f2b451bbe808a1fdce931
2023-12-06 19:17:08 +00:00
James Zern f9b7c85768 README: update target list
Change-Id: I001179ce34b2bf2350dce5f0197b6be175ab1c37
2023-12-06 19:11:27 +00:00
Bohan Li 6f1001a894 Fix scaled reference offsets.
Since the reference frame is already scaled, do not scale the offsets.

BUG: b/311489136, b/312656387
Change-Id: Ib346242e7ec8c4d3ed26668fa4094271218278ed
(cherry picked from commit 845a817c05)
2023-12-06 14:01:33 -05:00
Gerda Zsejke More a05cfd672d configure: Add darwin23 support
Add target arm64-darwin23-gcc, x86_64-darwin23-gcc for MacOS 14.

Change-Id: I6b68a6a61d51aaa78ec11a5055bb95ce77a81d9c
(cherry picked from commit db83435afb)
2023-12-06 13:57:50 -05:00
Wan-Teh Chang c4c9208054 Remove SSE code for 128x* blocks
The maximum block size is 64x64 in VP9.

Bug: webm:1819
Change-Id: If9802be9f81b51dbcdbc8a68d5afe48ca6d3d0e7
2023-12-05 14:29:37 -08:00
Wan-Teh Chang 7dfe343199 Use vpx_sse instead of vpx_mse to compute SSE
Use vpx_sse and vpx_highbd_sse instead of vpx_mse16x16 and
vpx_highbd_8_mse16x16 respectively to compute SSE for PSNR
calculations. This solves an issue whereby vpx_highbd_8_mse16x16
was being used to calculate SSE for 10- and 12-bit input.

This is a port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/175063
by Jonathan Wright <jonathan.wright@arm.com>.

Bug: webm:1819
Change-Id: I37e3ac72835e67ccb44ac89a4ed16df62c2169a7
2023-12-05 22:13:43 +00:00
Jerome Jiang 4c2435c33e Fix several clang-tidy complaints
Change-Id: I78721d6b7ed692ad9363b5cac4e3324a3136d5b6
2023-12-05 22:08:56 +00:00
Marco Paniconi 12e928cb34 Add unittest for issue b/314857577
Bug: b/314857577

Change-Id: I591036c1ad3362023686d395adb4783c51baa62d
2023-12-05 22:02:45 +00:00
Wan-Teh Chang 97184161d5 Add "IWYU pragma: export" to some public headers
vpx/vpx_integer.h is clearly intended as the facade header for the
Standard C Library headers <stddef.h>, <inttypes.h>, and <stdint.h>.

It is reasonable to expect that vpx/vpx_decoder.h and vpx/vpx_encoder.h
should provide the symbols from vpx/vpx_codec.h.

Change-Id: I220797e63b2efc3dd9e2ac197fe2f918bf80d247
2023-12-05 21:18:45 +00:00
Jerome Jiang 7cfc58de48 RTC RC: add screen content support for vp8
Bug: b/281463780
Change-Id: I446c00bf8d794aa9134e4fe37960dd8a465448a4
2023-12-05 19:53:38 +00:00
Yunqing Wang 8bf3649d41 Fix a bug in frame scaling
This change fixed a corner case bug revealed by b/311394513.
During the frame scaling, vpx_highbd_convolve8() and vpx_scaled_2d()
require that both x_step_q4 and y_step_q4 be less than or equal to a
defined value. Otherwise, the code needs to call
vp9_scale_and_extend_frame_nonnormative(), which supports arbitrary scaling.

The fix was done in LBD and HBD functions.

Bug: b/311394513
Change-Id: Id0d34e7910ec98859030ef968ac19331488046d4
2023-12-04 19:39:08 -08:00
Bohan Li 9ad598f249 Improve test comments.
Change-Id: I42dddb946193e30cf07e39b43eaad051c5da479a
2023-12-04 23:32:08 +00:00
Gerda Zsejke More db83435afb configure: Add darwin23 support
Add target arm64-darwin23-gcc, x86_64-darwin23-gcc for MacOS 14.

Change-Id: I6b68a6a61d51aaa78ec11a5055bb95ce77a81d9c
2023-12-04 21:22:35 +00:00
Angie Chiang f10481dc0a Set skip_recode=0 in nonrd_pick_sb_modes
Need to set skip_recode properly so that
vp9_encode_block_intra() can work properly when it is
called by block_rd_txfm(). We can not skip "recode" because
it is still at the rd search stage.

Bug: b/310340241
Change-Id: I7d7600ef72addd341636549c2dad1868ad90e1cb
2023-12-02 07:18:11 +00:00
Wan-Teh Chang 5dcb4c1740 Define VPX_DL_* macros as unsigned long constants
Define the VPX_DL_REALTIME, VPX_DL_GOOD_QUALITY, and VPX_DL_BEST_QUALITY
macros as unsigned long, because the deadline parameter of
vpx_codec_encode() is of the unsigned long type. This enables C++
templates to deduce the unsigned long type from these macros.

Change-Id: I2173e3bbf5e15c84c11843790df93a497a35ed7d
2023-12-02 02:28:57 +00:00
Wan-Teh Chang bf07554183 Add the needed Android API level predicates.
fseeko and ftello are available on Android only from API level 24. Add
the needed guards for these functions.

Suggested by Yifan Yang.

Change-Id: I3a6721d31e1d961ab10b434ea6e92959bd5a70ab
2023-12-02 02:13:32 +00:00
Wan-Teh Chang 378af160ff Merge "Document vpx_codec_decode() ignores deadline param" into main 2023-12-01 23:09:06 +00:00
Wan-Teh Chang 478df94cd2 Merge "Define vpx_enc_deadline_t type for encode deadline" into main 2023-12-01 22:57:58 +00:00
Bohan Li a9f1bfdb8e Fix edge case when downsizing to one.
BUG: b/310329177
Change-Id: I2ebf4165adbc7351d6cc73554827812dedc4d362
2023-12-01 13:44:56 -08:00
Wan-Teh Chang 967c59b190 Merge "Fix scaled reference offsets." into main 2023-12-01 21:40:42 +00:00
James Zern d5ec829489 Merge "CHANGELOG: add CVE for issue #1642" into main 2023-12-01 21:39:31 +00:00
James Zern a88e7d869f Merge changes Ic2c3cb30,I027eaf2d,I455e5a94,I8f7633d9,I5116d10d, ... into main
* changes:
  Specialise Armv8.0 Neon horiz convolution for 4-tap filters
  Specialise Armv8.0 Neon vert convolution for 4-tap filters
  Specialise Armv8.6 Neon 2D horiz convolution for 4-tap filters
  Specialise Armv8.6 Neon horiz convolution for 4-tap filters
  Specialise Armv8.4 Neon 2D horiz convolution for 4-tap filters
  Specialise Armv8.4 Neon horiz convolution for 4-tap filters
  Specialise Armv8.6 Neon vert convolution for 4-tap filters
  Specialise Armv8.4 Neon vert convolution for 4-tap filters
  Make reporting of filter sizes more granular
  Delete redundant code in Neon SDOT/USDOT vertical convolutions
2023-12-01 21:38:51 +00:00
James Zern 5cad6fdc92 CHANGELOG: add CVE for issue #1642
CVE-2023-6349 was reserved for this issue. It's not yet published.

Bug: webm:1642, b:302710624
Change-Id: Iaab2a0bcae449a45e35678f5c049413fe0a4d2a4
2023-12-01 13:10:58 -08:00
Wan-Teh Chang 070d7e5cf3 Document vpx_codec_decode() ignores deadline param
The changes in this CL show that both the VP8 and VP9 implementations of
the decode function eventually discard the deadline parameter. Change
the code to ignore the deadline parameter in vpx_codec_decode() without
passing it to the decode function, and document that the deadline
parameter is ignored and 0 should be passed.

Change-Id: Ia977e16cdbdf97901207aa2d749887980137c4c0
2023-12-01 09:55:23 -08:00
Bohan Li 845a817c05 Fix scaled reference offsets.
Since the reference frame is already scaled, do not scale the offsets.

BUG: b/311489136, b/312656387
Change-Id: Ib346242e7ec8c4d3ed26668fa4094271218278ed
2023-12-01 09:52:59 -08:00
Wan-Teh Chang 4e29b9638d Merge "Add a test for b/312517065" into main 2023-12-01 02:18:21 +00:00
Jonathan Wright d144e6e95d Specialise Armv8.0 Neon horiz convolution for 4-tap filters
Add an Armv8.0 MLA Neon implementation of horizontal convolution
specialised for executing with 4-tap filters (the most common filter
size for settings --good --cpu-used=1.) This new path is also used
when executing with bilinear (2-tap) filters.

Change-Id: Ic2c3cb307b95964cd0ba86f1c42eece3a8ab7cf4
2023-12-01 00:20:00 +00:00
Wan-Teh Chang 15c2a9a02f Add a test for b/312517065
Bug: b/312517065
Change-Id: I6b5529a8e034fb0468f110e420fafb4944a19d0f
2023-11-30 16:10:04 -08:00
Jonathan Wright a33ac12dc0 Specialise Armv8.0 Neon vert convolution for 4-tap filters
Add an Armv8.0 MLA Neon implementation of vertical convolution
specialised for executing with 4-tap filters (the most common filter
size for settings --good --cpu-used=1.) This new path is also used
when executing with bilinear (2-tap) filters.

Change-Id: I027eaf2d1bb9711c2217cc8aa6b1e379d3e66b26
2023-11-30 12:00:00 +00:00
Wan-Teh Chang b027590c30 Define vpx_enc_deadline_t type for encode deadline
The deadline parameter of vpx_codec_encode() is of the unsigned long
type. The cpplint runtime/int check and the clang-tidy
google-runtime-int warn about the use of the unsigned long type. Adding
a type alias works around this issue.

Note: vpx_codec_decode() also has a deadline parameter, but it is of the
long type. So unfortunately this type alias cannot be simply named
vpx_codec_deadline_t and the name must suggest it is encoder-specific.

Change-Id: I27b6b25730b620b328422ec3f91e63fdc55b377a
2023-11-29 17:36:24 -08:00
James Zern 97a0d139ce psnr.h,cosmetics: fix a typo (PNSR -> PSNR)
Change-Id: I2adea9f150852c106acc57e5aeeac571d6bd15fb
2023-11-29 16:47:34 -08:00
Jerome Jiang 2f3d3726b2 Merge "vp9 rc: support screen content" into main 2023-11-29 23:52:13 +00:00
Jerome Jiang fe5dc9f7fc vp9 rc: support screen content
Bug: b/281463780
Change-Id: I23668550257b28031bdca0537459f93ec63f1e2e
2023-11-29 17:38:38 -05:00
Wan-Teh Chang 57b72fe807 Add VP9Encoder class to simplify fuzz test cases
Bug: b:306422625
Change-Id: I8344cb7fb4e1aee87d46f683746517cdcddf5c5d
2023-11-29 14:22:10 -08:00
Wan-Teh Chang 73e38df5d5 Merge "Adding "const" to vpx_codec_iface_t is redundant" into main 2023-11-29 20:04:24 +00:00
Hirokazu Honda 2faf9c3e0e Merge "ratectrl_rtc: Remove duplicated DropFrameReason enum class" into main 2023-11-29 01:32:21 +00:00
Wan-Teh Chang 59a3b791fb Merge "Add vpx_sse and vpx_highbd_sse" into main 2023-11-28 18:39:49 +00:00
Marco Paniconi 77b4614b2f Merge "rtc: Set nonrd keyframe under dynamic change of deadline" into main 2023-11-28 17:24:48 +00:00
Marco Paniconi adebf364cb rtc: Set nonrd keyframe under dynamic change of deadline
For realtime mode: if the deadline mode (good/best/realtime)
is changed on the fly (via codec_encode() call), force a
key frame and set the speed feature nonrd_keyframe = 1 to
avoid entering the rd pickmode.

nonrd_pickmode=0/off is the only feature in realtime mode that
involves rd pickmode, so by forcing it on/1 we can cleanly
separate nonrd (realtime) from rd (good/best), so we can
avoid possible issues on this dynamic mode switch, such as in
bug listed below.

Dynamic change of deadline, in particular for realtime mode,
involves a lot of coding/speed feature changes, so best to
also force reset with keyframe.

Added a unit test that triggers the issue in the bug.
Bug: b/310663186

Change-Id: Idf8fd7c9ee54b301968184be5481ee9faa06468d
2023-11-27 20:08:57 -08:00
Wan-Teh Chang 72a57ebe48 Merge "Tests kf_max_dist in one-pass zero-lag encoding" into main 2023-11-27 23:54:04 +00:00
Marco Paniconi d7358ed53a Unitest for issue: b/310477034
Fix is made here:
https://chromium-review.googlesource.com/c/webm/libvpx/+/5055827

Bug: b/310477034
Change-Id: Id1cc7a6a95e1ea5d1a022f36d7971915c9918339
2023-11-27 09:35:10 -08:00
Jonathan Wright cc89450a48 Specialise Armv8.6 Neon 2D horiz convolution for 4-tap filters
Add an Armv8.6 USDOT Neon path for the horizontal portion of 2D
convolution, specialised for executing with 4-tap filters (the most
common filter size for settings --good --cpu-used=1.) This new path
is also used when executing with bilinear (2-tap) filters.

Change-Id: I455e5a94bdcea1358025bd8e4d4c8c62e373aa5d
2023-11-27 16:44:02 +00:00
Jonathan Wright 0dc67ecf54 Specialise Armv8.6 Neon horiz convolution for 4-tap filters
Add an Armv8.6 USDOT Neon implementation of horizontal convolution
specialised for executing with 4-tap filters (the most common filter
size for settings --good --cpu-used=1.) This new path is also used
when executing with bilinear (2-tap) filters.

Change-Id: I8f7633d9852ebfe8feb9b4a055715f849cccf297
2023-11-27 16:43:56 +00:00
Jonathan Wright 9cdb688919 Specialise Armv8.4 Neon 2D horiz convolution for 4-tap filters
Add an Armv8.4 SDOT Neon path for the horizontal portion of 2D
convolution, specialised for executing with 4-tap filters (the most
common filter size for settings --good --cpu-used=1.) This new path
is also used when executing with bilinear (2-tap) filters.

Change-Id: I5116d10ddb371ac2cf302ef905d06f2140dc7600
2023-11-27 16:43:21 +00:00
Jonathan Wright 68ef57f997 Specialise Armv8.4 Neon horiz convolution for 4-tap filters
Add an Armv8.4 SDOT Neon implementation of horizontal convolution
specialised for executing with 4-tap filters (the most common filter
size for settings --good --cpu-used=1.) This new path is also used
when executing with bilinear (2-tap) filters.

Change-Id: Ib396681b3f7b8b0eeba94381fbe33a06cf7b4a13
2023-11-27 16:43:15 +00:00
Jonathan Wright 2f8e94715d Specialise Armv8.6 Neon vert convolution for 4-tap filters
Add an Armv8.6 USDOT Neon implementation of vertical convolution
specialised for executing with 4-tap filters (the most common filter
size for settings --good --cpu-used=1.) This new path is also used
when executing with bilinear (2-tap) filters.

Change-Id: Ic893b25541e3317c5d5c270c338f868f080aed7c
2023-11-27 16:38:20 +00:00
Jonathan Wright bdc9e1c9d4 Specialise Armv8.4 Neon vert convolution for 4-tap filters
Add an Armv8.4 SDOT Neon implementation of vertical convolution
specialised for executing with 4-tap filters (the most common filter
size for settings --good --cpu-used=1.) This new path is also used
when executing with bilinear (2-tap) filters.

Change-Id: I3eb00b5a34f5676b68bda60a2a29be56e3d7d0cd
2023-11-27 16:37:56 +00:00
Jonathan Wright 1b3ec0676c Make reporting of filter sizes more granular
vpx_get_filter_taps() currently reports either 8-tap or 2-tap.
However, many 8-tap filters are actually 0-padded, resulting in a
lot of redundant work (multiplying by, and adding, 0) when processing
using an 8-tap convolution function. In preparation for adding 2- and
4-tap SIMD implementations for the convolution paths, make the filter
size reporting more granular, stripping any 0 padding. Filter sizes
can now be reported as 2-, 4-, 6- or 8-tap.
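
One way to strip symmetric zero padding from an 8-entry kernel is sketched below. The function name is hypothetical, and the check order (outermost pair first) is an assumption about how such reporting could be done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reports the effective tap count of an 8-entry filter kernel by checking
 * coefficient pairs from the outside in. */
int filter_taps_sketch(const int16_t *filter) {
  assert(filter != NULL);
  if (filter[0] | filter[7]) return 8;
  if (filter[1] | filter[6]) return 6;
  if (filter[2] | filter[5]) return 4;
  return 2;
}
```

A caller could then dispatch to a 2-, 4- or 8-tap convolution path based on the returned value and skip the multiplications by zero.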

Change-Id: I100133aac7173134af34b918c9ad3007d98d6060
2023-11-23 15:11:11 +00:00
Jonathan Wright 9b729500d5 Delete redundant code in Neon SDOT/USDOT vertical convolutions
Delete redundant transpose/permute code in the Neon dot-product
vertical convolution paths. Variable values were assigned but never
used before subsequent assignment.

Change-Id: I15b29d0c993f56599e0d18ac1d5787e6385d2a3a
2023-11-23 13:58:42 +00:00
Wan-Teh Chang 366425079b Tests kf_max_dist in one-pass zero-lag encoding
The test shows that the comment for kf_max_dist in vpx/vpx_encoder.h
differs from its behavior by one. We should modify the comment to match
the encoding behavior.

Bug: webm:1829
Change-Id: Icdc58b8f6b25353f10ce8ecc481c862bd3fe86df
2023-11-22 17:50:42 -08:00
Jingning Han 3bd54a37d0 Disable intra mode search speed features conditionally
When all the inter reference frames are invalid, disable the speed
features that bypass intra mode search.

BUG=b/312517065

Change-Id: I246c953fad3be61b9d307da11c752a21a36b90ff
2023-11-22 15:38:27 -08:00
Wan-Teh Chang 635eba3319 Adding "const" to vpx_codec_iface_t is redundant
vpx_codec_iface_t is defined as follows:

  typedef const struct vpx_codec_iface vpx_codec_iface_t;

Since vpx_codec_iface_t is already a const struct, it is redundant to
add "const" to vpx_codec_iface_t.
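
The redundancy can be seen with a stand-in type. `iface_sketch` and the accessor names below are hypothetical. In C99 and later, a qualifier repeated through a typedef behaves as if it appeared once.

```c
#include <assert.h>
#include <stddef.h>

struct iface_sketch {
  const char *name;
};

/* Mirrors the shape of the typedef quoted above: the alias is already const. */
typedef const struct iface_sketch iface_sketch_t;

static const struct iface_sketch g_iface_sketch = { "sketch" };

/* Both return types name the same type, pointer to const struct. */
iface_sketch_t *get_iface_plain(void) { return &g_iface_sketch; }
const iface_sketch_t *get_iface_redundant(void) { return &g_iface_sketch; }

int same_iface_sketch(void) {
  iface_sketch_t *a = get_iface_plain();
  const iface_sketch_t *b = get_iface_redundant();
  assert(a != NULL && b != NULL);
  return a == b;
}
```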

Note: I think vpx_codec_iface_t should not have been defined as a const
struct, but it is too late to change that now.

Change-Id: Ifbd3f8a63c1d48e9169ff77fa0b505ea1e65519d
2023-11-22 15:38:04 -08:00
Jingning Han b562fdd4e6 Remove invalid reference frames
Remove the reference frames whose scaling factor is not in the
supported range.

BUG=b/312517065

Change-Id: Iaf8610ff7a95cd4a433bf529f741459d820d4f8b
2023-11-22 15:07:04 -08:00
Jingning Han 79257fd459 Conditionally skip using inter frames in speed features
When the reference frame's scaling factor is not in the supported
range, skip using it for motion compensation prediction in the
partition speed features.

BUG=b/312517065

Change-Id: Ie3687186521ad2616be258e80d3e5b16e5f2d5e9
2023-11-22 15:06:29 -08:00
Jerome Jiang 741b8f6228 Check null ptr before use
prev_mi is a pointer to pointer

Bug: b/310401647
Bug: b/310590556
Change-Id: Ic3c39a7eec14693357bd2485a5451d4b7f031b5e
2023-11-21 16:33:50 -05:00
Hirokazu Honda 56c78a68b0 ratectrl_rtc: Remove duplicated DropFrameReason enum class
DropFrameReason is declared in two places. This moves it
to the common place.

Change-Id: I04c16db4a49135588edff7e1746dcf9172750bb9
2023-11-21 17:10:48 +09:00
James Zern b067e73e4b Merge "vp8_dx_iface.c: add include for MAX_PARTITIONS" into main 2023-11-21 02:21:20 +00:00
Wan-Teh Chang a8db542b24 Add vpx_sse and vpx_highbd_sse
The code is ported from libaom's aom_sse and aom_highbd_sse at
commit 1e20d2da96515524864b21010dbe23809cff2e9b.

The vpx_sse and vpx_highbd_sse functions will be used by vpx_dsp/psnr.c.

Bug: webm:1819
Change-Id: I4fbffa9000ab92755de5387b1ddd4370cb7020f7
2023-11-20 14:59:27 -08:00
James Zern 1231fce45e vp8_dx_iface.c: add include for MAX_PARTITIONS
fixes clang-tidy warning:
no header providing "MAX_PARTITIONS" is directly included

Change-Id: Iba7a9d95df7f5bdee76e7975df764cd71461fc93
2023-11-20 11:55:49 -08:00
James Zern 9f8776ff4a vp8_ratectrl_rtc.h: fix a few typos
is -> if
returns -> computes

in the documentation for ComputeQP().

This is the same as:
9142314c2 ratectrl_rtc.h: fix a few typos

+ remove a duplicate, commented out, version of GetLoopfilterLevel()

Change-Id: I8832e628b63b0b7dac6236631072f36ad55d90e8
2023-11-20 11:49:14 -08:00
James Zern 9142314c2c ratectrl_rtc.h: fix a few typos
is -> if
returns -> computes

in the documentation for ComputeQP().

Change-Id: If70706736b0dc2ae56e45e2489dc208c61fd557a
2023-11-15 18:55:04 -08:00
Marco Paniconi 81aaa7f04b rtc: Add frame dropper to VP8 external RC
Move some internal drop_frame code to a separate
function so the external RC can use it.
Also add a new flag setting under VP8E_SET_RTC_EXTERNAL_RATECTRL
to disable vp8_drop_encodedframe_overshoot() for
testing the external RC.

Unittest added for single layer and 3 temporal layers.

Bug: b/280363228

Change-Id: Ibea2f627cc54e7156ff35259a64dd111d42d146c
2023-11-14 12:06:17 -08:00
Wan-Teh Chang b7d847d0e7 Merge "Delete -Wdeclaration-after-statement" into main 2023-11-11 02:29:46 +00:00
Wan-Teh Chang e4127f591d Document how VP9 treats a negative speed value
Change-Id: I12948b08a7bb5beb5024b8676de9dafc239f8e89
2023-11-10 13:14:18 -08:00
Wan-Teh Chang d15a1970c1 Delete -Wdeclaration-after-statement
Older versions of MSVC do not allow declarations after statements in C
files. We don't need to support those versions of MSVC now.

Use -std=gnu99 instead of -std=gnu89.

Change-Id: I76ba962f5a2bca30d6a5b2b05c5786507398ad32
2023-11-09 19:23:24 -08:00
Wan-Teh Chang f05122d35c Fix ClangTidy warnings
Most are related to include-what-you-use. One is to avoid using the
unsigned long type explicitly (by passing VPX_DL_REALTIME directly to
vpx_codec_encode).

Change-Id: Ieaf3418382ad8516cb4b172f7678893286fcb8cf
2023-11-09 18:35:51 -08:00
Wan-Teh Chang 97833b61ee Merge "Declare oxcf arg of vp8_create_compressor as const" into main 2023-11-10 01:40:17 +00:00
Wan-Teh Chang ef3eb01269 Merge "Fix float-cast-overflow in vp8_change_config()" into main 2023-11-09 23:10:35 +00:00
Wan-Teh Chang 296784c83a Declare oxcf arg of vp8_create_compressor as const
Declare the oxcf parameters of vp8_create_compressor() and init_config()
as const. This helps code analysis.

Change-Id: I344ef3e6afc3adced2b2865b7e0057c6d4b1d3c0
2023-11-09 12:42:58 -08:00
Wan-Teh Chang 4e05c38c85 Document the units of VP8 target_bandwidth/bitrate
Change-Id: I6298a0acb4ef546ae198bb1f16dea50ed34b2dae
2023-11-09 12:38:05 -08:00
Wan-Teh Chang 7ab673a9f6 Fix float-cast-overflow in vp8_change_config()
Bug: b:309716574
Change-Id: I9c523d5e9211f895c7497a9e3674b55f6be6c742
2023-11-09 12:01:13 -08:00
Wan-Teh Chang 8a35c7e585 Merge "Use symbolic constant VPX_CBR instead of 1" into main 2023-11-08 02:40:28 +00:00
Wan-Teh Chang c732fa7070 Use symbolic constant VPX_CBR instead of 1
Change-Id: Idae94cfc6d7a882691deeb4fa3ce0015f80ed937
2023-11-07 15:47:43 -08:00
Jerome Jiang 11b98025c3 Merge "Check fragments count before use" into main 2023-11-07 15:26:52 +00:00
James Zern 5b8d24f678 configure: detect PIE and enable PIC
Fixes the creation of DT_TEXTREL entries in binaries built with PIE
enabled:
  /usr/bin/ld: warning: creating DT_TEXTREL in a PIE

This matches the changes made in libaom:
1df26009da aom_configure: only override CONFIG_PIC if not set on cmd line
7235e65746 aom_configure.cmake: detect PIE and set CONFIG_PIC

Change-Id: I0a43e964af2d8eb8c5e7811ce14ad39285eec3a8
2023-11-06 15:17:19 -08:00
Jerome Jiang 879c9bd906 Check fragments count before use
Bug: webm:1827
Change-Id: I8d603d5db92476222cbee1c61fece957ad03a49f
2023-11-03 16:24:46 -04:00
Yunqing Wang f08d238867 Merge "Modify C vs SIMD test script" into main 2023-11-03 16:11:24 +00:00
Anupam Pandey 1464d7738a Modify C vs SIMD test script
- Enable C vs SIMD test for x86 32-bit platform
- Correct a print message in run_tests()

BUG=webm:1800

Change-Id: Ib1ccd3a87a64b5ec6cde524a14d5d1b7e200abfb
2023-11-03 18:26:33 +05:30
Yunqing Wang 54fc6a7558 Merge "Add an emms instruction to vpx_subtract_block" into main 2023-11-02 03:39:59 +00:00
Marco Paniconi 0d3ef6ffd2 vp9-RC: Add drop_frame support to external RC
Supports single layer and svc. For svc only the
framedrop_mode = FULL_SUPERFRAME_DROP is allowed
for now.

Dropping frames due to overshoot is enabled by the
oxcf->drop_frames_water_mark, which is zero as default.
Note that this CL also allows for drop/skip encoding of
enhancement layers if that layer bitrate is zero.

max_consec_drop is also added, set to INT_MAX as default.
Note that max_consec_drop is only used for svc mode.
It has not been added yet for single layer in libvpx encoder.

Tests added for single layer and svc case.

Change-Id: Ic12f6a0eb3fbf07d8eb8456c46cec27b2e1930d3
2023-10-31 09:34:42 -07:00
James Zern cab1f4b9b2 Merge "calc_pframe_target_size: fix integer overflow" into main 2023-10-30 19:20:46 +00:00
James Zern 7b51f50205 Merge "Fix 'unused variable' warning when neon_i8mm is disabled" into main 2023-10-30 19:20:18 +00:00
Jonathan Wright 3f3576098f Fix 'unused variable' warning when neon_i8mm is disabled
Guard hwcap2 feature interrogation on HAVE_NEON_I8MM so that it gets
disabled if neon_i8mm is disabled when configuring the build.

Bug: webm:1825
Change-Id: Ic6ff71f17387b96219591928a583d43560bb7c7a
2023-10-30 15:53:18 +00:00
Xiahong Bao 61c927a4ed calc_pframe_target_size: fix integer overflow
The intermediate value in the target bandwidth
calculation may exceed integer bounds.

Bug: 308007926

Change-Id: I8288c5820db06a550d88bf91fccc86106996deaa
Signed-off-by: Xiahong Bao <xiahong.bao@nxp.com>
2023-10-30 11:17:17 +09:00
Marco Paniconi b759032a0e Clear some clang-tidy complaints on header includes
Change-Id: Id6f54dc4643172f6a5576dc4846c47c8eda31c0f
2023-10-27 10:02:46 -07:00
Jonathan Wright 6457f06529 Add Arm SVE build flags and run-time CPU feature detection
Add 'sve' arch options to the configure, build and unit test files -
adding appropriate conditional options where necessary. Arm SIMD
extensions are treated as supersets in libvpx, so disable SVE if
either Neon DotProd or I8MM are unavailable.

Change-Id: I39dd24f2b209251084d1e28d7ac68099460309bb
2023-10-24 10:42:06 +01:00
Jerome Jiang 974c14578c Merge "Reduce memory usage of test with large frame size" into main 2023-10-20 21:14:18 +00:00
Jerome Jiang 352f9f64df Reduce memory usage of test with large frame size
- Use smaller frame size that still triggers the overflow
- Do not run the encoder, as the encoder init also triggers the overflow

Bug: chromium:1492864
Change-Id: I392549abf69f1cfb3754cc847a214513ec9bedc5
2023-10-20 15:40:43 -04:00
Wan-Teh Chang 9004ace978 Also test VPX_ARCH_AARCH64 for 64-bit platforms
Change-Id: Ic11ccd791ff78801e0aba1d12ad2d99b9941ce9d
2023-10-19 18:28:48 -07:00
Jerome Jiang 424723dc02 Run bitrate overflow test only on 64bit systems
Frame size caps the target bitrate internally, so the frame size needs
to be large enough to reproduce the target bitrate overflow in the
fuzzing test.

However, the frame size needed exceeds the max buffer size allowed on
32-bit systems, defined by VPX_MAX_ALLOCABLE_MEMORY.

Bug: chromium:1492864

Change-Id: Ia3a9a78cd35516373897039a7769b492e29e8450
2023-10-19 11:36:47 -04:00
Jerome Jiang e4db6c3aac Cap avg_frame_bandwidth at INT_MAX
avg_frame_bandwidth = target_bandwidth / framerate

If target_bandwidth is too big and/or framerate is too small (< 1),
avg_frame_bandwidth could overflow.

Bug: chromium:1492864
Change-Id: I32314da1414b472ae4bf2acdcd81b8a948286146
2023-10-17 17:06:28 -04:00
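A minimal sketch of the capping idea in the entry above, not the code from the commit itself. The function name capped_avg_frame_bandwidth and its parameter types are assumptions made for illustration; the only part taken from the message is the formula avg_frame_bandwidth = target_bandwidth / framerate and the INT_MAX cap.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: per-frame bit budget, saturated at INT_MAX. */
static int capped_avg_frame_bandwidth(int64_t target_bandwidth,
                                      double framerate) {
  assert(target_bandwidth >= 0);
  assert(framerate > 0.0);
  /* Divide in floating point so a framerate below 1 cannot overflow an int. */
  const double per_frame = (double)target_bandwidth / framerate;
  if (per_frame >= (double)INT_MAX) return INT_MAX;
  return (int)per_frame; /* safe: the value is known to be below INT_MAX */
}
```

The comparison happens before the cast because converting an out-of-range double to int is undefined behaviour in C.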
Jerome Jiang 0129e64a65 Fix ubsan failure caused by left shift of negative
Bug: b/305642441
Change-Id: Iddb1572c284161140da48f61b04cf600e5b57ecc
2023-10-16 11:22:48 -04:00
Jerome Jiang 2ab7ba8251 Force mode search on 64x64 if no mode is selected
A speed feature disable_split_mask (set to 63) could cause no mode and
partition to be selected in rd_pick_partition because:

-> thresh_mult_sub8x8 all INT_MAX
-> All modes skipped for sub8x8 blocks
-> found_best_rd is 0 -> break from the loop of 4 sub blocks
-> sum_rdc is INT_MAX -> No rd update -> should_encode_sb is 0
-> Propagating to top of the tree
-> No partition / mode is selected

Bug: b/290499385
Change-Id: Ia655e262f3b32445347ae0aaf1a2d868cea997f3
2023-10-13 20:28:21 -04:00
Wan-Teh Chang 9c377eafbe Handle Arm/AArch64 runtime feature detection
Port the following libaom CLs to libvpx:
https://aomedia-review.googlesource.com/c/aom/+/178361
https://aomedia-review.googlesource.com/c/aom/+/180701
https://aomedia-review.googlesource.com/c/aom/+/181821

The tests themselves are not feature-gated in the same way that they are
used in the rest of the codebase since they are not controlled by
rtcd.pl. This means that tests that assume the existence of features not
present on the target can cause SIGILL to be thrown.

This commit extends init_vpx_test.cc to match the behaviour for other
targets and automatically disable testing for features that are not
available on the machine running the tests.

Call arm_cpu_caps() and x86_simd_caps() inside #if !CONFIG_SHARED.
All the SIMD-specialized functions (arm or x86) are internal functions,
so they are not exported from the libvpx shared library. If
CONFIG_SHARED is 1, it is not necessary to call arm_cpu_caps(),
x86_simd_caps(), and append_negative_gtest_filter() either.

Change-Id: I330631816bdb52842020c5aa2a1ad802865cc285
2023-10-10 09:27:20 -07:00
Wan-Teh Chang 7c31749387 Declare some "VP8_CONFIG *oxcf" params as const
Change-Id: Ia5e8445098e18da5978aacf17281f16252413f17
2023-10-07 07:23:11 -07:00
Wan-Teh Chang 8cb4544c21 VP8: allow thread count changes
Fix the TODO(https://crbug.com/1486441) comment in vp8/vp8_cx_iface.c.

Make vp8cx_create_encoder_threads() work after it has been called
before. If there are already the exact number of threads it needs to
create, return immediately. Otherwise, shut down the existing threads
(by calling vp8cx_remove_encoder_threads()) and create the required
number of threads.

Call vp8cx_create_encoder_threads() in vp8e_set_config() to respond to
changes in g_threads or g_w (which also affects the number of threads
through cm->mb_cols and cpi->mt_sync_range).

Change-Id: I552eeca5b1f1f5313f59559eb1da396f270a2429
2023-10-06 10:40:14 -07:00
Wan-Teh Chang c23da380a3 VP8: Allocate cpi->mt_current_mb_col array lazily
Add the mt_current_mb_col_size field to VP8_COMP to record the size of
the mt_current_mb_col array.

Move the allocation of the mt_current_mb_col array from
vp8_alloc_compressor_data() to vp8_encode_frame(), where the use of
mt_current_mb_col starts. Allocate mt_current_mb_col right before use
if mt_current_mb_col hasn't been allocated or if the current size is
incorrect.

Move the deallocation of the mt_current_mb_col array from
dealloc_compressor_data() to vp8cx_remove_encoder_threads().

Move the TODO(https://crbug.com/1486441) comment from
vp8/encoder/onyx_if.c to vp8/vp8_cx_iface.c.

Change-Id: Ic5a0793278c2cc94876669aaa0dd732412876673
2023-10-04 13:05:18 -07:00
Wan-Teh Chang 874bcaa164 Merge "Clean up vp8cx_create/remove_encoder_threads()" into main 2023-10-04 19:50:20 +00:00
Jerome Jiang 25a9a8e35a Merge "Use correct include guards for init_vpx_test.h" into main 2023-10-04 13:42:09 +00:00
Anupam Pandey 41caf8fef5 Add an emms instruction to vpx_subtract_block
This CL adds an `emms` instruction at the end of the MMX assembly
for the vpx_subtract_block function, to properly clear the register
state. This resolves a mismatch between x86 build and C only build.

BUG=webm:1816

Change-Id: I79d2947da7f587f3558a2ae17df214d2faf59e74
2023-10-04 11:13:05 +05:30
James Zern 448c5e8684 Merge "vp9_encoder: normalize sizeof() calls" into main 2023-10-04 04:42:16 +00:00
Wan-Teh Chang a1d4b56208 Merge "Declare cur_row inside #if CONFIG_MULTITHREAD" into main 2023-10-04 03:50:39 +00:00
Wan-Teh Chang d7c7383298 Merge "Have vp9_enc_build and vp9_enc_test restore cwd" into main 2023-10-04 03:21:50 +00:00
Wan-Teh Chang 450dfa908b Merge "Fix a potential resource leak and add alloc checks" into main 2023-10-04 03:20:27 +00:00
Wan-Teh Chang ea67878f8c Clean up vp8cx_create/remove_encoder_threads()
Make vp8cx_create_encoder_threads() undo everything cleanly before
returning an error.

Make vp8cx_remove_encoder_threads() reset pointer fields to NULL after
freeing them, reset encoding_thread_count to 0, and reset b_lpf_running
to 0 (false). This makes it safe to call vp8cx_create_encoder_threads()
after calling vp8cx_remove_encoder_threads().

Change-Id: I586f06ce3d5b1c88ca46884bb4d6667ffc97e440
2023-10-03 20:08:18 -07:00
Wan-Teh Chang f67f9ce346 Declare cur_row inside #if CONFIG_MULTITHREAD
Fix the following compiler warning when libvpx is configured with
the --disable-multithread option:

  vp9/common/vp9_thread_common.c:391:7: warning:
  variable 'cur_row' set but not used [-Wunused-but-set-variable]
    int cur_row;
        ^

Change-Id: I53aa279152715083df40990eb7fdcaeb77a66777
2023-10-03 19:16:36 -07:00
Jerome Jiang f73026c2cc Use correct include guards for init_vpx_test.h
Bug: b/303112617
Change-Id: Ie18df33b2bcab91c18e920825f4ed3a29e18373b
2023-10-03 22:00:02 -04:00
Jerome Jiang 5b6ceba996 Include vpx_config.h for macros
Clear some clang-tidy complaints

Change-Id: I6690428d336c81709befd19a33e11c1367275df3
2023-10-03 14:26:50 -04:00
Jerome Jiang 0a3e2b4ca1 Factor out common code used in test binaries
Bug: b/303112617
Change-Id: Icbe16e95ff334a9578a692cc51b4773393ad0005
2023-10-03 14:26:44 -04:00
Wan-Teh Chang b729684b05 Use big cfg.g_w in ConfigResizeChangeThreadCount
vp8cx_create_encoder_threads() caps the thread count at
(cm->mb_cols / cpi->mt_sync_range) - 1. If cfg.g_w is 16, cm->mb_cols is
only 1 (see vp8_alloc_frame_buffers: mb_cols = width >> 4), so we won't
be using multiple threads. To reproduce bug chromium:1486441, the test
just needs to increase cfg.g_h sufficiently.

Bug: chromium:1486441
Change-Id: Ie6b2da2e31cfa1717a481f55eebc8c875db94d87
2023-10-02 13:55:16 -07:00
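The arithmetic in the entry above can be sketched as follows. The function max_encoder_threads is hypothetical, and mt_sync_range is passed in as a plain parameter here because the log does not say how libvpx derives it; only the two formulas, mb_cols = width >> 4 and (mb_cols / mt_sync_range) - 1, come from the commit message.

```c
#include <assert.h>

/* Hypothetical helper mirroring the cap described in the commit message. */
static int max_encoder_threads(unsigned int width, int mt_sync_range) {
  assert(mt_sync_range > 0);
  const int mb_cols = (int)(width >> 4); /* 16-pixel macroblock columns */
  const int cap = mb_cols / mt_sync_range - 1;
  return cap > 0 ? cap : 0; /* no extra threads when the frame is too narrow */
}
```

With a width of 16 there is a single macroblock column, so the cap is zero for any positive sync range, which matches the observation that such a frame never uses multiple threads.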
James Zern 95cb5eae70 Merge "Merge tag 'v1.13.1'" into main 2023-10-02 19:11:51 +00:00
Wan-Teh Chang b863d8bb47 Have vp9_enc_build and vp9_enc_test restore cwd
Use $PWD to get the current directory.

Quote directory pathnames.

Suggested by James Zern.

Bug: webm:1800
Change-Id: I51e922b24da0e89d936370f858eab55d193ebdcb
2023-09-30 10:32:49 -07:00
Wan-Teh Chang 6512f994da Disable vpx_highbd_8_mse16x16_neon_dotprod, etc.
These functions assume the uint16_t samples are <= 255 (bit depth 8),
but vpx_highbd_8_mse16x16() is called for any bit depth, not just 8.

A better fix is to port the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/175063 to libvpx, but
that requires porting aom_sse() and aom_highbd_sse() to libvpx, which is
quite involved. So disable vpx_highbd_8_mse16x16_neon_dotprod, etc.
first.

Bug: webm:1819
Change-Id: If495a5dedc58d9981317b9993c9fbb81ff3ab50c
2023-09-29 16:58:26 -07:00
James Zern 1672f4db71 Merge tag 'v1.13.1'
libvpx 1.13.1

2023-09-29 v1.13.1 "Ugly Duckling"
  This release contains two security related fixes. One each for VP8 and VP9.

  - Upgrading:
    This release is ABI compatible with the previous release.

  - Bug fixes:
    https://crbug.com/1486441 (CVE-2023-5217)
    Fix to a crash related to VP9 encoding (#1642)

* tag 'v1.13.1':
  update CHANGELOG
  update version to 1.13.1
  Fix bug with smaller width bigger size
  vp9_alloccommon: clear allocation sizes on free
  VP8: disallow thread count changes
  encode_api_test: add ConfigResizeChangeThreadCount
  README: update release version to 1.13.0

Bug: webm:1818
Change-Id: I732e2423f635d4115890f00fd63f9886e31f39a6
2023-09-29 16:36:46 -07:00
James Zern ec9e1ed41f vp9_encoder: normalize sizeof() calls
use sizeof(var) instead of sizeof(type) and sizeof(*var) instead of
sizeof(var[0]) for consistency in some places.

Change-Id: Ibd9a783cfef5ce1d06131df3831a4093821a502f
2023-09-29 15:17:37 -07:00
Wan-Teh Chang 7f568f9876 Fix a potential resource leak and add alloc checks
Backport fixes from libaom:
https://aomedia-review.googlesource.com/c/aom/+/109061
https://aomedia-review.googlesource.com/c/aom/+/158102

Change-Id: Ia9d42d474be2898f9ae2fdc28606273377da3e90
2023-09-29 15:08:28 -07:00
James Zern 10b9492dcf update CHANGELOG
Bug: webm:1818
Change-Id: Ic0a943b5d1c69a3621ad3f91012fb5308a0c11f1
2023-09-29 15:06:14 -07:00
James Zern 490a7067e8 update version to 1.13.1
SO_VERSION_MAJOR = 8
SO_VERSION_MINOR = 0
SO_VERSION_PATCH = 1

The increase of the patch number corresponds to the revision number in
the libtool text.

3. If the library source code has changed at all since the last update,
then increment revision (‘c:r:a’ becomes ‘c:r+1:a’).

Bug: webm:1818
Change-Id: Ia114368e9fd7a908e7fcf6e4d3142f142770e3f4
2023-09-29 13:13:47 -07:00
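A small sketch of the version arithmetic in the entry above, using the mapping stated elsewhere in this log (major = c-a, minor = a, patch = r). The struct and function names are hypothetical, and the starting triple c=8, r=0, a=0 is an assumption inferred from the resulting SO version 8.0.1.

```c
#include <assert.h>

/* libtool-style version triple, written c:r:a in the libtool text. */
struct libtool_version {
  int current, revision, age;
};

struct so_version {
  int major, minor, patch;
};

/* Rule 3: the source changed since the last update, so increment revision. */
static struct libtool_version bump_revision(struct libtool_version v) {
  v.revision += 1;
  return v;
}

/* major = c - a, minor = a, patch = r. */
static struct so_version to_so_version(struct libtool_version v) {
  assert(v.current >= v.age);
  struct so_version so = { v.current - v.age, v.age, v.revision };
  return so;
}
```

Starting from the assumed 8:0:0, one application of rule 3 gives 8:1:0, which maps to SO version 8.0.1.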
Jerome Jiang df9fd9d5b7 Fix bug with smaller width bigger size
Fixed previous patch that clusterfuzz failed on.

Local fuzzing passing overnight.

Bug: webm:1642
Change-Id: If0e08e72abd2e042efe4dcfac21e4cc51afdfdb9
(cherry picked from commit 263682c9a2)
2023-09-29 13:13:47 -07:00
James Zern a53700e4a3 vp9_alloccommon: clear allocation sizes on free
This fixes reallocations (and avoids potential crashes) if any
allocation fails and the application continues to call
vpx_codec_decode().

Found with vpx_dec_fuzzer_vp9 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: If5dc96b73c02efc94ec84c25eb50d10ad6b645a6
(cherry picked from commit 02ab555e99)
2023-09-29 13:13:47 -07:00
Wan-Teh Chang 4697b110ac Update 32-bit version of horizontal_add_uint32x2
The code was originally added in
https://aomedia-review.googlesource.com/c/aom/+/162267
by Jonathan Wright.

Change-Id: Iafd9e31d82abe22387e0d68f02c7ab81e85367ed
2023-09-29 10:35:20 -07:00
Cheng Chen fd2052d4c9 Properly determine end of sequence
When the next frame is null and the current frame is an overlay
frame (that is, there is an active alt ref frame), we call this
the end of the sequence.

Change-Id: I49c2cf7a001df98aff8b62ba034317e408274bd4
2023-09-28 18:36:21 -07:00
Wan-Teh Chang a9998b53c2 Merge "vp9_c_vs_simd_encode: Restore cwd at end of test" into main 2023-09-28 17:52:22 +00:00
Jerome Jiang 18bc7ffea7 Merge "Fix bug with smaller width bigger size" into main 2023-09-28 17:39:23 +00:00
Wan-Teh Chang e462a0de03 vp9_c_vs_simd_encode: Restore cwd at end of test
Restore the original current directory at the end of
vp9_c_vs_simd_enc_test().

Bug: webm:1800
Change-Id: Iad64848a231e3c900149cc2b248055b02dda80a6
2023-09-28 09:03:40 -07:00
Wan-Teh Chang 58a854fb27 Skip the y4m_360p_10bit_input clip for armv8
It is a known mismatch.

Bug: webm:1819
Change-Id: Ieb707a6f53ddf6c7b0d1202c6520599d3e45da76
2023-09-27 16:12:12 -07:00
Yunqing Wang 8e63202fb3 Merge "Modify vp9_c_vs_simd_enc_test script" into main 2023-09-27 20:33:39 +00:00
Jerome Jiang 263682c9a2 Fix bug with smaller width bigger size
Fixed previous patch that clusterfuzz failed on.

Bug: webm:1642
Change-Id: If0e08e72abd2e042efe4dcfac21e4cc51afdfdb9
2023-09-27 11:21:11 -04:00
James Zern baed121877 VP8: disallow thread count changes
Currently allocations are done at encoder creation time. Going from
threaded to non-threaded would cause a crash.

Bug: chromium:1486441
Change-Id: Ie301c2a70847dff2f0daae408fbef1e4d42e73d4
(cherry picked from commit 3fbd1dca6a)
2023-09-26 18:42:50 -07:00
James Zern 452199ca85 encode_api_test: add ConfigResizeChangeThreadCount
Update thread counts and resolution to ensure allocations are updated
correctly. VP8 is disabled to avoid a crash.

Bug: chromium:1486441
Change-Id: Ie89776d9818d27dc351eff298a44c699e850761b
(cherry picked from commit af6dedd715)
2023-09-26 18:41:50 -07:00
Yunqing Wang 61f868bcf7 Modify vp9_c_vs_simd_enc_test script
Applied James' change to the script. Enabled the test:
vp9_c_vs_simd_enc_test

BUG=webm:1800

Change-Id: If1e33e5ccb6ca9315004f3e3f5b910f8a8255317
2023-09-26 14:12:14 -07:00
James Zern 3fbd1dca6a VP8: disallow thread count changes
Currently allocations are done at encoder creation time. Going from
threaded to non-threaded would cause a crash.

Bug: chromium:1486441
Change-Id: Ie301c2a70847dff2f0daae408fbef1e4d42e73d4
2023-09-25 19:33:07 -07:00
James Zern af6dedd715 encode_api_test: add ConfigResizeChangeThreadCount
Update thread counts and resolution to ensure allocations are updated
correctly. VP8 is disabled to avoid a crash.

Bug: chromium:1486441
Change-Id: Ie89776d9818d27dc351eff298a44c699e850761b
2023-09-25 19:28:08 -07:00
Jerome Jiang 67bfb41ed8 Do not call WebM RC for new GOP at the end of seq
define_gf_group is called at the last frame of each GOP to get the GOP
size for the next one, which means it is also called at the last GOP of
the sequence. Calling WebM RC at that point returns an error, since
WebM RC does not have any more GOPs to return.

When gop_coding_frames from the encoder is 1, the encoder is running
out of firstpass stats, which means end of sequence.

Bug: b/299610956
Change-Id: I30e077a28fe41593ebabbc1dc0c2915a4bcbece3
2023-09-21 18:10:27 -04:00
Martin Storsjo ad3301e6a3 aarch64: Generalize Windows cpu detection to any Windows variant
This cpu detection implementation doesn't do anything MSVC specific,
it just calls the IsProcessorFeaturePresent function. This can be
compiled with mingw compilers just as well.

Change-Id: I55e607a47c8f5b70d9f707ef96b2fa7553f2f79f
2023-09-18 11:56:51 -07:00
Jerome Jiang 8f8e741468 Add max/min_gf_interval to vpx_rc_config_t
Bug: b/300499738
Change-Id: Id32cb5e3ce667539c0d1efe1ff5fcc7a49e35329
2023-09-14 16:04:19 -04:00
Jerome Jiang 8e61b3cd1b Fix ref frame buffer in TPL stats for RC
The original ref frame index was the index in the GF group; RC expects
the index to be the one for ref frame buffer.

Change-Id: I9a2b0e72b6332023fb2e8da131b557f82db02e39
2023-09-14 13:27:39 -04:00
Jerome Jiang 9c2e33ff74 Set frame width height for 1st TPL GOP frame
Change-Id: Ic92dfd232bf90e8cbe6c6233af523ed40d12097a
2023-09-14 11:59:57 -04:00
yuanhecai eb232b662a loongarch: Fix bugs from vp8_sixtap_predict4x4/16x16_lsx
Bug: webm:1755

Change-Id: I7295e0f9a1551b8a418d5b65a2b7351df1fdc063
2023-09-13 10:06:23 +08:00
yuanhecai 391bb5604b loongarch: simplify vpx_quantize_b/b_32x32_lsx args
Bug: webm:1755

Change-Id: I42fdb1c34f959dd1204b343b8192e3d9b49821b4
2023-09-13 10:05:34 +08:00
Jerome Jiang d549cb74b9 Add missing headers for clang-tidy warnings
Change-Id: I97edec8ecffdcc79b8f3528deb60a3a0332ea0cc
2023-09-12 15:28:06 -04:00
Jonathan Wright 1a1f50a89d Use run-time feature detection for Neon DotProd HBD MSE
Arm Neon DotProd implementations of vpx_highbd_8_mse<w>x<h> currently
need to be enabled at compile time since they're guarded by #ifdef
feature macros. Now that run-time feature detection has been enabled
for Arm platforms, expose these implementations with distinct
*neon_dotprod names in a separate file and wire them up to the build
system and rtcd.pl. Also add new test cases for the new functions.

Change-Id: I26be6fb587258c8fa9fbf03509b7602358a001a8
2023-09-03 23:04:50 +01:00
Jonathan Wright 15d6215716 Use run-time feature detection for Neon DotProd specialty var.
Enable Arm Neon DotProd implementations of vpx_get_var_sse_sum*
specialty variance functions via run-time feature detection, wiring
up the new *neon_dotprod names to rtcd.pl. Also add new test cases.

Change-Id: I04ac3db87d32ee7f94702b6c0360254e5688f713
2023-09-03 23:04:49 +01:00
Jonathan Wright ad4f28abaa Use run-time feature detection for Neon DotProd variance
Arm Neon DotProd implementations of vpx_variance<w>x<h> currently
need to be enabled at compile time since they're guarded by #ifdef
feature macros. Now that run-time feature detection has been enabled
for Arm platforms, expose these implementations with distinct
*neon_dotprod names in a separate file and wire them up to the build
system and rtcd.pl. Also add new test cases for the new functions.

Remove the _neon suffix in functions making reference to
vpx_variance<w>x<h>_neon() (e.g. sub-pixel variance) - enabling use
of the appropriate *neon or *neon_dotprod version at run time.

Similar changes for the specialty variance and MSE functions will be
made in a subsequent commit.

Change-Id: I69a0ef0d622ecb2d15bd90b4ace53273a32ed22d
2023-09-03 23:04:49 +01:00
Jonathan Wright 7009fe55a9 Use run-time CPU feature detection for Neon DotProd SAD4D
Arm Neon DotProd implementations of vpx_sad*4d currently need to be
enabled at compile time since they're guarded by ifdef feature
macros. Now that run-time feature detection has been enabled for Arm
platforms, expose these implementations with distinct *neon_dotprod
names in separate files and wire them up to the build system and
rtcd.pl. Also add new test cases for the new DotProd functions.

Change-Id: Ie99ee0b03ec488626f52c3f13e4111fe26cc5619
2023-09-03 23:04:49 +01:00
Jonathan Wright 02dc617f8c Use run-time CPU feature detection for Neon DotProd SAD
Arm Neon DotProd implementations of vpx_sad* currently need to be
enabled at compile time since they're guarded by ifdef feature
macros. Now that run-time feature detection has been enabled for Arm
platforms, expose these implementations with distinct *neon_dotprod
names in separate files and wire them up to the build system and
rtcd.pl. Also add new test cases for the new DotProd functions.

Change-Id: Ic6906c28240276ba89787eadbc9393a232374f95
2023-09-03 23:04:41 +01:00
Jonathan Wright 91158c99f7 Use run-time CPU feature detection for vpx_convolve8_neon
Arm Neon DotProd and I8MM implementations of vpx_convolve8* currently
need to be enabled at compile time since they're guarded by ifdef
feature macros. Now that run-time feature detection has been enabled
for Arm platforms, expose these implementations with distinct
*neon_dotprod/*neon_i8mm names in separate files and wire them up to
the build system and rtcd.pl. Also add new test cases for the new
DotProd and I8MM functions.

Change-Id: I3db3cd62e8596099d9fec7805ca3ee86b2a01c74
2023-09-03 20:43:06 +01:00
Jonathan Wright 148d1085f7 Refactor and extend run-time CPU feature detection on Arm
1) Overhaul the Arm CPU feature detection code, taking inspiration
   from similar recent changes in libaom.
2) Add neon_dotprod and neon_i8mm arch options in the configure,
   build and unit test files, adding appropriate conditional options
   where necessary.
3) Soft-enable run-time CPU feature detection by default for both 32-
   bit and 64-bit Arm platforms.

Change-Id: I3f13317d88324acc5753394351188baa8d18a261
2023-09-01 20:25:17 +01:00
Jonathan Wright 7ee16bc178 Simplify Neon MSE helper function params/return values
Simplify the parameters and return values of the Neon MSE helper
functions for both standard and high bitdepth - avoiding unused
return values.

Change-Id: I6f9208f9ce890fbe58346d9c7d9d701f28f2f90f
2023-08-31 14:21:01 +01:00
Marco Paniconi 6da1bd01d6 vp9 svc: fix integer overflow
Overflow was happening in two places:
one in set_encoder_config(), where the input
layer_target_bitrates are converted from kbps to bps,
the other in vp9_calc_pframe_target_size_one_pass_vbr(),
where target is scaled by kf_ratio.

vp9_ratectrl.c:2039: runtime error: signed integer overflow:
-137438983 * 25 cannot be represented in type 'int'

Bug: chromium:1475943

Change-Id: I1ab0980862548c8827fae461df9a7a74425209ff
2023-08-29 11:35:12 -07:00
Jerome Jiang e052ada780 Do not call ext rc functions when they're null
Change-Id: Ie78afadd4ad5845e42bd4d5412703369f8d5e0f5
2023-08-25 10:56:23 -04:00
James Zern 24c0dcc851 Merge "vp9_calc_pframe_target_size_one_pass_cbr: fix int overflow" into main 2023-08-21 22:46:42 +00:00
James Zern 99a4462b8d Merge "vp8,ratectrl.c: fix integer overflow" into main 2023-08-21 22:46:25 +00:00
James Zern b124b05dcb Merge "fdct4x4_neon: fix compile w/cl" into main 2023-08-21 19:05:49 +00:00
Jerome Jiang ade6905e39 vp9 ext rc: copy under/overshoot% for all RC modes
Bug: b/295507002
Change-Id: Ie4b302b82fa2d83e0be450cea60c59907b37f954
2023-08-21 10:05:15 -04:00
James Zern c7aa75ac55 vp9_calc_pframe_target_size_one_pass_cbr: fix int overflow
vp9/encoder/vp9_ratectrl.c:2171:23: runtime error: signed integer
overflow: 103079280 * -22 cannot be represented in type 'int'

Bug: chromium:1473268
Change-Id: Ic1de7d48e74d94c2a992e53ec4382b5b44dba7af
2023-08-18 14:04:33 -07:00
James Zern 80b1b5a7e9 vp8,ratectrl.c: fix integer overflow
in calc_iframe_target_size():
vp8/encoder/ratectrl.c:349:31: runtime error: signed integer overflow:
38 * 343597280 cannot be represented in type 'int'

Bug: chromium:1473473
Change-Id: Ie8f7b147efb27c92314df09837b66f7d97046883
2023-08-18 12:36:45 -07:00
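The overflow reports in the entries above share one pattern: two int operands whose product does not fit in int. A common remedy, sketched here under the assumption that saturating is acceptable, is to multiply in a wider type and clamp. The helper saturating_scale is hypothetical and is not claimed to be what the commits actually do.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: multiply two ints without signed overflow. */
static int saturating_scale(int value, int factor) {
  /* Widening only helps when int is narrower than 64 bits. */
  assert(sizeof(int64_t) > sizeof(int));
  const int64_t product = (int64_t)value * factor;
  if (product > INT_MAX) return INT_MAX;
  if (product < INT_MIN) return INT_MIN;
  return (int)product;
}
```

The operand pairs from the sanitizer messages (38 * 343597280 and 103079280 * -22) both saturate instead of invoking undefined behaviour.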
James Zern 401d8f36be vp9_cx_iface: fix code compatibility
Remove '= {}' (C23 [1]) and use memset to clear a vpx_rc_config_t
instance.

after:
6e2c3b9b3 Add RC mode to vpx external RC interface

Fixes compile with -pedantic and Microsoft's cl compiler.

[1] https://en.cppreference.com/w/c/language/initialization

Change-Id: I2019cdf0c42103cfc80b1e58c68b7596e497007f
2023-08-18 09:10:00 -07:00
Jerome Jiang d3188cab65 Merge "vp9 ext rc: Assign over/undershoot % for CQ mode" into main 2023-08-17 17:47:50 +00:00
Jerome Jiang 87a467f356 vp9 ext rc: Assign over/undershoot % for CQ mode
Bug: b/295507002
Change-Id: Ie5b4dabc620f6d17c4039f186e0709d8e9479b47
2023-08-17 12:34:48 -04:00
Jerome Jiang e7bfd8b6c2 Merge "Extend ext RC mode to have CQ mode" into main 2023-08-17 15:11:48 +00:00
Jerome Jiang 4b1ac3c23f Extend ext RC mode to have CQ mode
Also do not return error if it's not specified.

Bug: b/295507002
Change-Id: Ib1f83551272bdde1bceff03554abc4c02d95ca09
2023-08-16 16:05:01 -04:00
James Zern bbf36c8839 Merge "tools_common,die_codec(): output to stderr" into main 2023-08-16 19:41:23 +00:00
James Zern 58eed626d8 tools_common,die_codec(): output to stderr
This function is used to report a failure, messages of this type should
go to stderr.

Change-Id: I0dee246dddc886a3278b247a770a356446658864
2023-08-16 11:15:55 -07:00
Jerome Jiang 6e2c3b9b3c Add RC mode to vpx external RC interface
Bug: b/295507002
Change-Id: Id2dd21482828ec64eef9abdf6a1cca83100d21ba
2023-08-15 16:11:43 -04:00
James Zern 8d2c357eab fdct4x4_neon: fix compile w/cl
Use an array for constant initialization rather than array syntax which
assumes the underlying type is a vector. Fixes compile error with
cl targeting Windows Arm64:

vpx_dsp\arm\fdct4x4_neon.c(55,52): error C2078: too many initializers

No change in assembly with gcc 12.2.0 & clang 14.

Bug: b/277255390
Bug: webm:1810
Fixed: webm:1810

Change-Id: Ia30edcdbb45067dfe865b9958a5eecf1fd9ddfc8
2023-08-11 15:52:26 -07:00
James Zern 335728c987 *quantize*.c: fix visual studio warnings
after:
22818907d normalize *const in rtcd

fixes warnings of the form:
vpx_dsp\x86\quantize_avx.c(145): warning C4028: formal parameter 2
different from declaration

Change-Id: I4dc423f11ec4a9171e18bdb6be2fa8dfb65ee61a
2023-08-11 13:42:49 -07:00
Jonathan Wright c8610c266c Fix bug and re-enable vpx_int_pro_row/col_neon
Fix a bug in vpx_int_pro_row_neon (increment pointer after peeled
first loop iteration) and re-enable both vpx_int_pro_row/col_neon
paths.

Also fix IntProRowTest to use width_ (instead of 0) as the src_stride
for the input data block. The test's use of 0 for src_stride is the
reason the tests passed with the buggy Neon implementation noted in
the listed bugs. (The old buggy Neon implementation fails the
adjusted unit tests.)

BUG=webm:1800
BUG=webm:1809

Change-Id: I1f4572ee155653a7596fe2c10b5938ea7a3f63ae
2023-08-11 00:08:56 +01:00
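Note on the IntProRowTest change above: a minimal C sketch of why a src_stride of 0 can hide a missing pointer increment. This is a simplified scalar model, not the Neon code; kWidth, kBlock, column_sums, column_sums_buggy and outputs_match are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { kWidth = 4 };

/* Two rows of sample data (hypothetical values). */
static const uint8_t kBlock[2 * kWidth] = { 1, 2, 3, 4, 5, 6, 7, 8 };

/* Reference: sum each of the kWidth columns over `height` rows. */
static void column_sums(int32_t out[kWidth], const uint8_t *src, int stride,
                        int height) {
  assert(height > 0);
  for (int c = 0; c < kWidth; ++c) out[c] = 0;
  for (int r = 0; r < height; ++r) {
    for (int c = 0; c < kWidth; ++c) out[c] += src[c];
    src += stride;
  }
}

/* Buggy variant: the first row is peeled out of the loop, but the pointer is
 * not advanced afterwards, so row 0 is read twice and the last row never. */
static void column_sums_buggy(int32_t out[kWidth], const uint8_t *src,
                              int stride, int height) {
  assert(height > 0);
  for (int c = 0; c < kWidth; ++c) out[c] = src[c]; /* peeled iteration */
  /* Missing here: src += stride; */
  for (int r = 1; r < height; ++r) {
    for (int c = 0; c < kWidth; ++c) out[c] += src[c];
    src += stride;
  }
}

/* Returns 1 when both variants agree for the given stride and height. */
static int outputs_match(const uint8_t *src, int stride, int height) {
  int32_t ref[kWidth], bug[kWidth];
  column_sums(ref, src, stride, height);
  column_sums_buggy(bug, src, stride, height);
  for (int c = 0; c < kWidth; ++c) {
    if (ref[c] != bug[c]) return 0;
  }
  return 1;
}
```

With a stride of 0 every row is the same row, so outputs_match(kBlock, 0, 2) reports agreement; with a stride of kWidth the two variants disagree, which mirrors why the adjusted test catches the old implementation.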
Yunqing Wang 67961dc5f7 Merge "Enable arm test in c vs SIMD bit-exactness test" into main 2023-08-08 20:32:52 +00:00
Yunqing Wang 32a4ecc3cf Merge "Disable vpx_int_pro_row/col neon SIMD functions" into main 2023-08-08 20:32:41 +00:00
Yunqing Wang 685715b698 Enable arm test in c vs SIMD bit-exactness test
Arm SIMD testing was enabled in c vs SIMD bit-exactness test after
arm SIMD mismatch was resolved.

BUG=webm:1800

Change-Id: Id60127313a0955f4a5c8468281fd5a441668fddb
2023-08-07 22:06:58 +00:00
Yunqing Wang 6e5fc00001 Disable vpx_int_pro_row/col neon SIMD functions
The vpx_int_pro_row/col neon SIMD version caused a mismatch between
neon encoding vs c encoding. Disabled them for now to ensure the
correctness of VP9 encoding on the arm platform. These 2
functions were not used much, so this shouldn't affect the overall
encoder speed much.

BUG=webm:1800
BUG=webm:1809

Change-Id: Id1a7d542fc03d4cf9fa1039a49832abf35fb722f
2023-08-07 15:04:43 -07:00
Jerome Jiang 242c743170 VP9 RC: Add pixel row/col of a TPL block
Bug: b/294049605
Change-Id: I383a88a037a2a48a5fc1b9def6f991278c3665a8
2023-08-07 16:42:43 -04:00
Jerome Jiang d4b6132d2b Fix more clang-tidy warnings
- Include vpx/vpx_ext_ratectrl.h in vp9_ext_ratectrl.c
 - Include vpx/internal/vpx_codec_internal.h
 - Include <stddef.h> for NULL

Bug: b/294049605
Change-Id: Iedd8b3864da27fde1678bfa6606e6fc5630a7a09
2023-08-07 10:33:18 -04:00
Jerome Jiang fc29b8533e Fix some clang-tidy warnings
- Use zero initializer instead of memset to avoid including <cstring>
 - Include vpx_codec.h for vpx_codec_err_t and error codes
 - Include vpx_tpl.h for VpxTplGopStats

Change-Id: Iac5837ce2173bd945bfe8eeb401ff4dfd04fd2e1
2023-08-04 16:29:10 -04:00
Jerome Jiang f6aaad370d Fix include path for vpx_tpl.h,vpx_ext_ratectrl.h
Bug: b/294049605
Change-Id: I6422fc4250c2192f985cce2e296a19a05934969b
2023-08-04 16:29:06 -04:00
Jerome Jiang 5556ebd894 Merge "vp9_quantize_fp_neon: Same params name as in decl" into main 2023-08-03 19:28:44 +00:00
Jerome Jiang 7dfeaffacc Merge "vp9 ext rc: Add callback for tpl stats" into main 2023-08-03 18:33:32 +00:00
Jerome Jiang 44f2819298 vp9_quantize_fp_neon: Same params name as in decl
Clear some clang-tidy warnings

Change-Id: Iea4c4e77b3d515ec6384bd34875a0002ab13c14c
2023-08-03 14:07:55 -04:00
Jerome Jiang 2f2761c261 vp9 ext rc: Add callback for tpl stats
Added test

Bug: b/294049605
Change-Id: I3967a0f915e1a6e7a0d34d04732c33e1ca6f35e7
2023-08-03 13:16:42 -04:00
Anupam Pandey 6075b1a36f Add test to check bit exactness of C and SIMD in VP9 encoder
This CL adds a shell script to test bit exactness of C and SIMD
VP9 encoder for x86 platform.

As C Vs NEON encoding outputs are not bit-exact (BUG=webm:1809),
ARM tests are currently disabled.

BUG=webm:1800

Change-Id: Iffcc70863e8cf83ccb5bc5be73e8866165697358
2023-08-03 15:03:38 +05:30
Yunqing Wang 2b82efa769 Add a 10-bit test file
Added a 10-bit test file for VP9 end-to-end c vs SIMD bit-
exactness test.

BUG=webm:1800

Change-Id: I4a864f1a740abee27049d68231adf2ec308f9a96
2023-08-01 21:04:09 -07:00
Johann 22818907d2 normalize *const in rtcd
Change-Id: Iece50143b43263c0c8f90299bedd7d2a5b9aa56b
2023-07-29 05:44:56 +09:00
Johann e7a4730fcc remove incorrect (void)
n_coeffs is used in this function

Change-Id: I5f5d2933304bb636a33e0fa294b4526edb65a08d
2023-07-28 20:21:34 +09:00
Johann 7c7ab9165a quantize_fp: reduce parameters
apply similar steps as to the other quantize functions to switch to
macroblock_plane and ScanOrder

Change-Id: I486d653326aaf52ffd3beafd2e891ba6a5d57ef3
2023-07-28 20:13:24 +09:00
Johann 70fc756383 quantize: reduce parameters
Pass macroblock_plane and ScanOrder instead of looking up the values
beforehand. Avoids pushing arguments to the stack.

Change-Id: I22df6f645eb1a1d89ba5a4d9bc58acb77af51aa9
2023-07-28 17:34:43 +09:00
James Zern 4f19de3826 resize_test: prefer 'override' to 'virtual'
Update functions in WRITE_COMPRESSED_STREAM blocks, which are disabled
by default. This caused them to be missed in:
84e6b7ab0 test/*.cc: prefer 'override' to 'virtual'

Change-Id: I0e462263f19c15eb0a30d0c0f4e145062f789489
2023-07-27 17:52:49 -07:00
James Zern 626e37e777 test/*.h: prefer 'override' to 'virtual'
created with clang-tidy --fix --checks=-*,modernize-use-override

Change-Id: I53412f35590799574edb573ae417a4a004cccd1e
2023-07-27 17:52:49 -07:00
James Zern d899b97945 encode_test_driver.h: use bool literal
Change-Id: If47be9ca0daa18d92cb849484f9e139e65e3560e
2023-07-27 17:52:49 -07:00
James Zern d62edaf41f test/**.cc: use bool literals
created with clang-tidy --fix --checks=-*,modernize-use-bool-literals

Change-Id: Ifaed8ca824676555acaf1053b2a5a52c51a70638
2023-07-25 12:18:03 -07:00
James Zern 5740cb3929 test/decode_perf_test.cc: use nullptr
created with clang-tidy --fix --checks=-*,modernize-use-nullptr

Change-Id: Ibf4a80fa00e9b59d471c92788ec4c7c72e4662e5
2023-07-25 12:10:21 -07:00
James Zern 1c2297b2bc test/*.cc: use '= default'
created with clang-tidy --fix --checks=-*,modernize-use-equals-default

Change-Id: Ie373fb5501491fce53479d20f3a6d908c4b7c535
2023-07-25 12:04:57 -07:00
James Zern f9e577cecb Merge changes I71e1b442,Ibbfb949b into main
* changes:
  test/*.cc: prefer 'override' to 'virtual'
  test,AbstractBench: fix -Wnon-virtual-dtor
2023-07-25 18:27:34 +00:00
James Zern 84e6b7ab02 test/*.cc: prefer 'override' to 'virtual'
created with clang-tidy --fix --checks=-*,modernize-use-override

Change-Id: I71e1b4423c143b3e47fe90929ee110b307cdb565
2023-07-24 18:09:31 -07:00
James Zern 5fb280ebb9 test,AbstractBench: fix -Wnon-virtual-dtor
In file included from ../test/bench.cc:14:
../test/bench.h:17:7: warning: 'AbstractBench' has virtual functions but
non-virtual destructor [-Wnon-virtual-dtor]
class AbstractBench {

Change-Id: Ibbfb949b63c8dff936c7ed4f2d056dea0343377b
2023-07-24 17:07:51 -07:00
Jerome Jiang e1c124f896 Add new_mv_count to ext rate control interface
Bug: b/290385227
Change-Id: Ia87c4bf1e9315bf1134c998f88e9d5548c497777
2023-07-24 18:04:58 -04:00
Jerome Jiang 37200b6abb cleanup: _pt -> _ptr in vp9 external RC interface
Change-Id: Ic483488f8f6273e8977cfc324466bda41f1e47a7
2023-07-24 13:08:05 -04:00
James Zern 1d1ee888d3 vp9_rdopt,handle_inter_mode: fix -Wmaybe-uninitialized warning
With gcc 13.1.1

In function ‘handle_inter_mode’,
inlined from ‘vp9_rd_pick_inter_mode_sb’ at
    ../vp9/encoder/vp9_rdopt.c:3872:17:
../vp9/encoder/vp9_rdopt.c:3142:8: warning: ‘tmp_rd’ may be used
    uninitialized [-Wmaybe-uninitialized]
 3142 |     rd = tmp_rd + RDCOST(x->rdmult, x->rddiv, rs, 0);
../vp9/encoder/vp9_rdopt.c: In function ‘vp9_rd_pick_inter_mode_sb’:
../vp9/encoder/vp9_rdopt.c:2846:15: note: ‘tmp_rd’ was declared here
 2846 |   int64_t rd, tmp_rd, best_rd = INT64_MAX;

Change-Id: I8608957cc8bbeb1ae525f3c3dad6fe9785b2a9b4
2023-07-13 09:49:30 -07:00
James Zern 9ad950a9c4 Merge "vp8: remove missing prototypes from the rtcd header" into main 2023-07-11 00:55:30 +00:00
L. E. Segovia e9b9972ca4 vp8: remove missing prototypes from the rtcd header
These were removed in If7a49e920e12f7fca0541190b87e6dae510df05c but
the leftovers can cause a build to fail if the code isn't optimized out.
I just found this out in the Meson port of libvpx for GStreamer.

BUG=webm:1584

Change-Id: I1c953720a2cbec3796200d4ec4020dca0b672bfb
2023-07-10 22:38:24 +00:00
James Zern f30532a6d9 vpx_free_tpl_gop_stats: normalize param name
this fixes a clang-tidy warning

Change-Id: I13f4750c15b7d6a395494c8dbcb896bde125b3c4
2023-07-10 10:06:13 -07:00
James Zern b2c2955c82 Merge "delete some dead code" into main 2023-07-06 17:10:37 +00:00
James Zern dcb91aa3dd mfqe_partition: fix -Wunreachable-code
vp9/common/vp9_mfqe.c|240 col 16| warning: code will never be executed
[-Wunreachable-code]
 BLOCK_SIZE mfqe_bs, bs_tmp;
            ^~~~~~~

Change-Id: I566b20d8c294e19bc4b90b57b730f933048e71a5
2023-06-29 09:52:26 -07:00
Wan-Teh Chang 3ef9934789 Fix a bug in vpx_highbd_hadamard_32x32_neon().
This CL is the highbd version of
https://chromium-review.googlesource.com/c/webm/libvpx/+/4646573.

The bug is caused by the incorrect assumption that
(a / 2) + (b / 2) == (a + b) / 2 and (a / 2) - (b / 2) == (a - b) / 2.

Also fix the Rand() inputs to Hadamard functions in unit tests.

This CL ports the following libaom CLs to libvpx:
https://aomedia-review.googlesource.com/c/aom/+/177101
https://aomedia-review.googlesource.com/c/aom/+/177241

Change-Id: Ic20e7684eab5d6507417fa2b75e572064d37ad2c
2023-06-28 16:09:36 -07:00
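Note on the assumption named in the commit above: a small scalar C sketch showing that halving before adding is not the same as adding before halving. It uses plain integer division as written in the commit message and non-negative inputs; the function names are hypothetical and this is not the Neon code.

```c
#include <assert.h>
#include <stdint.h>

/* Halve each operand first, then add: the pattern that was assumed safe. */
static int32_t halve_then_add(int32_t a, int32_t b) {
  return (a / 2) + (b / 2);
}

/* Add first, then halve the sum: the reference behaviour. */
static int32_t add_then_halve(int32_t a, int32_t b) { return (a + b) / 2; }

/* Count the pairs in [lo, hi] x [lo, hi] where the two orderings disagree. */
static int count_mismatches(int32_t lo, int32_t hi) {
  int count = 0;
  assert(lo <= hi);
  for (int32_t a = lo; a <= hi; ++a) {
    for (int32_t b = lo; b <= hi; ++b) {
      if (halve_then_add(a, b) != add_then_halve(a, b)) ++count;
    }
  }
  return count;
}
```

For inputs in 0..3 the two orderings disagree exactly when both operands are odd, because each early halving discards a low bit that the sum would have carried.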
James Zern dc26707f80 delete some dead code
follow-up to:
3ecba3980 Fix Clang -Wunreachable-code-aggressive warnings

Change-Id: I364312987bc838c69c010cce024bd3d62a918417
2023-06-28 12:26:32 -07:00
James Zern 9598c384bc Merge "Fix Clang -Wunreachable-code-aggressive warnings" into main 2023-06-28 19:21:40 +00:00
James Zern 3ecba39802 Fix Clang -Wunreachable-code-aggressive warnings
Based on the change in libaom:
fe36011455 Fix Clang -Wunreachable-code-aggressive warnings

Clang's -Wunreachable-code-aggressive flag enables several warning flags
such as -Wunreachable-code-break and -Wunreachable-code-return. Chrome's
build system enables -Wunreachable-code-aggressive (in
build/config/compiler/BUILD.gn), so it would be good if libvpx could be
compiled without -Wunreachable-code-aggressive warnings.

This requires the VPX_NO_RETURN macro be defined correctly for all the
compilers we support, otherwise some compilers may warn about missing
return statements after a die() or fatal() call (which does not return).

Change-Id: I0c069133af45a7a61759538b6d74c681ea087dcd
2023-06-28 11:06:50 -07:00
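Note on the VPX_NO_RETURN requirement above: a hedged C sketch of how a no-return annotation lets a compiler accept a function whose last statement is a call that never returns. SKETCH_NO_RETURN, sketch_die and digit_value are hypothetical names; this is not libvpx's actual macro definition, only the general pattern.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* SKETCH_NO_RETURN is a hypothetical stand-in for a macro like VPX_NO_RETURN. */
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_NO_RETURN __attribute__((noreturn))
#elif defined(_MSC_VER)
#define SKETCH_NO_RETURN __declspec(noreturn)
#else
#define SKETCH_NO_RETURN
#endif

/* Reports a failure on stderr and terminates; never returns to the caller. */
SKETCH_NO_RETURN void sketch_die(const char *msg);

void sketch_die(const char *msg) {
  fprintf(stderr, "%s\n", msg);
  exit(EXIT_FAILURE);
}

/* Converts one decimal digit to its value, terminating on bad input. */
static int digit_value(const char *s) {
  assert(s != NULL);
  if (s[0] >= '0' && s[0] <= '9') return s[0] - '0';
  sketch_die("not a digit");
  /* No return statement here: with the annotation in place the compiler
   * knows control cannot reach this point, so adding one would itself be
   * flagged as unreachable code. */
}
```

When the macro expands to nothing (the final #else branch), some compilers may warn about the missing return in digit_value, which is the situation the commit message describes.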
Jerome Jiang 3bd65ac776 vp9 firstpass stats in a separate header
Change-Id: If91c5c74c71affc48eb858beb314a6c194b14023
2023-06-28 11:17:15 -04:00
James Zern 44d6cacec6 Merge changes I1c17302f,Ic084894b,I9867f5fc,Ie3faf7b3,If5dc96b7, ... into main
* changes:
  vp8_decode: fix keyframe resync after decode error
  vp8_decode: only remove threads on thread create failure
  vp8_decode: clear stream info on decoder create failure
  vp9_decodeframe,init_mt: free tile_workers on alloc failure
  vp9_alloccommon: clear allocation sizes on free
  vp9_dx_iface: fix leaks on init_decoder() failure
2023-06-28 00:02:47 +00:00
James Zern 44a5eaa3ba vp8_decode: fix keyframe resync after decode error
This fixes a crash if the application continues to call
vpx_codec_decode(). Previously a non-keyframe could cause a crash if the
decoder failed before fully initializing due to an allocation failure.
The stream info and frame resolution would be 0, skipping an allocation.

Found with vpx_dec_fuzzer_vp8 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: I1c17302f4d3a488ba3b4eefe0bf53853dc558bc1
2023-06-27 12:58:28 -07:00
James Zern a166c52d3a vp8_decode: only remove threads on thread create failure
This fixes a crash if the application continues to call
vpx_codec_decode(). Previously the decoder instance would be freed,
causing a crash when attempting to access it with restart_threads=1.

Found with vpx_dec_fuzzer_vp8 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: Ic084894b776729bb1572f747082cef002f0832a8
2023-06-26 19:23:58 -07:00
James Zern 263ddc9e38 vp8_decode: clear stream info on decoder create failure
This fixes a crash if the application continues to call
vpx_codec_decode().

Found with vpx_dec_fuzzer_vp8 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: I9867f5fc3d1163026f521a9609d3cbbc00568d1d
2023-06-26 19:23:41 -07:00
James Zern a31e818ef8 vp9_decodeframe,init_mt: free tile_workers on alloc failure
This avoids a crash if any of the thread allocations fail and the
application continues to call vpx_codec_decode(). Previously
num_tile_workers would be non-zero, but not equal to num_threads, which
would cause a crash during later thread management.

Found with vpx_dec_fuzzer_vp9 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: Ie3faf7b36764aebedac0924acb6e4cb7545aec7d
2023-06-26 19:20:41 -07:00
James Zern 02ab555e99 vp9_alloccommon: clear allocation sizes on free
This fixes reallocations (and avoids potential crashes) if any
allocation fails and the application continues to call
vpx_codec_decode().

Found with vpx_dec_fuzzer_vp9 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: If5dc96b73c02efc94ec84c25eb50d10ad6b645a6
2023-06-26 19:15:30 -07:00
James Zern 885ecc7c66 vp9_dx_iface: fix leaks on init_decoder() failure
If any allocations fail in init_decoder() and the application continues
to call vpx_codec_decode() some of the allocations would be orphaned or
the decoder would be left in a partially initialized state.

Found with vpx_dec_fuzzer_vp9 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: I44f662526d715ecaeac6180070af40672cd42611
2023-06-26 19:14:14 -07:00
Wan-Teh Chang 19f3a754d6 Fix a bug in vpx_hadamard_32x32_neon()
A right shift by 2 is equivalent to two halving operations if there is
no addition or subtraction between the two halving operations.

Note: Since vhaddq_s16() and vhsubq_s16() have 17-bit intermediate
precision, the Neon code doesn't need to go to int32_t as was done in
https://chromium-review.googlesource.com/c/webm/libvpx/+/4604169.

Change-Id: Ibe0691cde0fd3b94ee7c497845ba459d30d503b0
2023-06-26 15:48:03 -07:00
James Zern 14e52008ed Merge "configure.sh: Improve a comment." into main 2023-06-20 20:06:32 +00:00
Yunqing Wang 74e8c77425 Merge "Remove vp9_diamond_search_sad_avx function" into main 2023-06-20 16:34:58 +00:00
Anupam Pandey 80d4172f07 Remove vp9_diamond_search_sad_avx function
This CL removes the AVX version of the vp9_diamond_search_sad function as
no speedup is seen relative to C.

Change-Id: Ife6005d8e444ea2c8d07ac0f686c840344b9e0ea
2023-06-19 16:05:12 +05:30
Chen Wang af40910197 configure.sh: Improve a comment.
The corresponding case block is not only for ARM.
The original comment text is confusing to readers.

Test: N/A, just comment text changes.

Change-Id: I3154d18d3b3d237c1eecfe07dc7ec237c98194cf
Signed-off-by: Chen Wang <wangchen20@iscas.ac.cn>
2023-06-17 00:42:17 +00:00
Jerome Jiang 8cee267d3d Add new_mv_count to firstpass stats
Mostly follows the logic of how it's calculated in libaom.

Bug: b/287283080
Change-Id: I9ee67d844ef9db7cca63339b5304459eaa28d324
2023-06-16 14:13:29 -04:00
Yunqing Wang 8789421bf3 Merge "Fix c vs intrinsic mismatch of vpx_hadamard_32x32() function" into main 2023-06-12 16:44:21 +00:00
Jerome Jiang bdb8ccc0af RTC RC: clean up unnecessary headers
Change-Id: I77c407be59f4eb0c70a89a5fffd88c648e634123
2023-06-09 15:33:39 -04:00
Anupam Pandey 8c308aefea Fix c vs intrinsic mismatch of vpx_hadamard_32x32() function
This CL resolves the mismatch between C and intrinsic implementation
of vpx_hadamard_32x32 function. The mismatch was due to integer
overflow during the addition operation in the intrinsic functions.
Specifically, the addition in the intrinsic function was performed
at the 16-bit level, while the calculation of a0 + a1 resulted in
a 17-bit value.

This code change addresses the problem by performing
the addition at the 32-bit level (with sign extension) in both SSE2
and AVX2, and then converting the results back to the 16-bit level
after a right shift.

STATS_CHANGED

Change-Id: I576ca64e3b9ebb31d143fcd2da64322790bc5853
2023-06-09 17:15:37 +05:30
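Note on the overflow described in the commit above: a scalar C sketch of a sum that needs 17 bits being wrapped in a 16-bit lane, compared with widening to 32 bits first. Division by 4 stands in for the right shift to keep the C portable for negative values; wrap_to_int16 and the two add_then_scale functions are hypothetical helpers, not the SSE2 or AVX2 code.

```c
#include <assert.h>
#include <stdint.h>

/* Wrap a value to the int16_t range the way a 16-bit lane would. */
static int16_t wrap_to_int16(int32_t v) {
  int32_t m = ((v % 65536) + 65536) % 65536; /* now in [0, 65535] */
  if (m >= 32768) m -= 65536;
  assert(m >= INT16_MIN && m <= INT16_MAX);
  return (int16_t)m;
}

/* Add in 16 bits, then divide by 4: the sum wraps if it needs 17 bits. */
static int16_t add_then_scale_16bit(int16_t a0, int16_t a1) {
  const int16_t sum = wrap_to_int16((int32_t)a0 + (int32_t)a1);
  return (int16_t)(sum / 4);
}

/* Widen to 32 bits, add, divide by 4, then narrow back to 16 bits. */
static int16_t add_then_scale_32bit(int16_t a0, int16_t a1) {
  const int32_t sum = (int32_t)a0 + (int32_t)a1;
  return (int16_t)(sum / 4);
}
```

For a0 = a1 = 20000 the true sum is 40000, which does not fit in int16_t: the 16-bit path wraps it to a negative value before scaling, while the 32-bit path yields 10000. For small inputs both paths agree.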
Jerome Jiang 2245df50a6 Replace NONE with NO_REF_FRAME
NONE is a common name and it has conflicts with symbols defined in
Chromium.

Bug: b/286163500
Change-Id: I3d935a786f771a4d90b258fabc6fd6c2ecbf1c59
2023-06-08 11:08:23 -04:00
Jerome Jiang da02ccde30 Merge "Fix more typos (n/n)" into main 2023-06-08 14:11:24 +00:00
Jerome Jiang fb6aebcbbf Merge "Fix more typos (3/n)" into main 2023-06-07 21:10:04 +00:00
Jerome Jiang d42b7fd661 Fix more typos (n/n)
impace -> impact
taget -> target
prediciton -> prediction
addtion -> addition
the the -> the

Bug: webm:1803
Change-Id: I759c9d930a037ca69662164fcd6be160ed707d77
2023-06-07 16:41:18 -04:00
Jerome Jiang 6a8eb04fec Fix more typos (3/n)
Propogation -> Propagation
propogate -> propagate
cant -> can't
upto -> up to
canddiates -> candidates
refernce -> reference
USEAGE -> USAGE

Change-Id: Iadaf2dffd86b54e04411910f667e8c2dfc6c4c77
2023-06-07 15:46:33 -04:00
Jerome Jiang da9735a37e Merge "Fix more typos (2/n)" into main 2023-06-07 19:10:43 +00:00
Jerome Jiang aafa55cc3f Merge "Fix more typos (1/n)" into main 2023-06-07 19:10:36 +00:00
Jerome Jiang 1c95f5d17b Merge "Fix a few typos" into main 2023-06-07 18:19:08 +00:00
Jerome Jiang ffb9345109 Fix more typos (2/n)
kernal -> kernel
e.g -> e.g.
paritioning -> partitioning
partioning -> partitioning
coefficents -> coefficients
i.e, -> i.e.,
equivalend -> equivalent
recive -> receive
resoultions -> resolutions

Bug: webm:1803
Change-Id: I1d6176202ee5daee7a64bf59114e8b304aeb4db7
2023-06-07 13:09:58 -04:00
Jerome Jiang ad14a32b33 Fix more typos (1/n)
Dont -> Don't
setings -> settings
thresold -> thresh
thresold -> threshold
becasue -> because
itterations -> iterations
its a -> it's a
an constant -> a constant

Bug: webm:1803
Change-Id: I1e019393939ed25c59c898c88d4941ec360b026d
2023-06-07 13:09:39 -04:00
Jerome Jiang bcd491a6be Fix a few typos
segement -> segment
dont -> don't
useage -> usage
devide -> divide

Bug: webm:1803
Change-Id: I0153380b0003825c4b62cf323d4f2bc837c8a264
2023-06-07 12:39:07 -04:00
Deepa K G e510716d7e Add comments in vp9_diamond_search_sad_avx()
Added comments related to re-arranging the
elements of the SAD vector to find the
minimum.

Change-Id: I58b702d304a6cdd32f04775fba603e39c19a8947
2023-06-06 14:35:14 +05:30
Deepa K G 7b66c730a2 Fix c vs avx mismatch of diamond_search_sad()
In the function vp9_diamond_search_sad_avx(), arranged
the cost vector in a specific order. This ensures that
the motion vector with the least index is selected,
when there exists more than one candidate motion
vector with the minimum cost, thus resolving the
c vs avx mismatch.

STATS_CHANGED

Change-Id: I4f8864f464f9ea2aae6250db3d8ad91cb08b26e2
2023-06-05 12:48:24 +05:30
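Note on the tie-breaking rule described in the commit above: a scalar C sketch of selecting the minimum cost so that the lowest index wins when several candidates share the minimum. argmin_first and argmin_last are hypothetical names; the sketch illustrates the selection rule only and does not reproduce the AVX vector arrangement.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the smallest cost; ties resolve to the lowest index because only
 * a strictly smaller cost replaces the current best. */
static size_t argmin_first(const unsigned int *cost, size_t n) {
  size_t best = 0;
  assert(cost != NULL && n > 0);
  for (size_t i = 1; i < n; ++i) {
    if (cost[i] < cost[best]) best = i;
  }
  return best;
}

/* Variant using <=: ties resolve to the highest index instead, which is the
 * kind of ordering difference that produces a mismatch between two builds. */
static size_t argmin_last(const unsigned int *cost, size_t n) {
  size_t best = 0;
  assert(cost != NULL && n > 0);
  for (size_t i = 1; i < n; ++i) {
    if (cost[i] <= cost[best]) best = i;
  }
  return best;
}

/* Sample costs with the minimum value 3 at indices 1 and 3. */
static const unsigned int kSampleCost[4] = { 7, 3, 5, 3 };
```

Both functions find the same minimum cost for kSampleCost, but they report different candidates, so any later decision based on the chosen index can diverge.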
Jerome Jiang 575bd73f61 Merge "Trim tpl stats by 2 extra frames" into main 2023-05-31 19:31:04 +00:00
Jerome Jiang 1aff4a5655 Trim tpl stats by 2 extra frames
Not applicable to the last GOP.

Bug: b/284162396
Change-Id: I55b7e04e9fc4b68a08ce3e00b10743823c828954
2023-05-31 14:26:49 -04:00
James Zern 60ee1b149b Merge changes I6a906803,I0307a3b6 into main
* changes:
  Optimize Neon implementation of vpx_int_pro_row
  Optimize Neon implementation of vpx_int_pro_col
2023-05-31 17:44:00 +00:00
Jonathan Wright c36aa2e9c4 Optimize Neon implementation of vpx_int_pro_row
Double the number of accumulator registers to remove the bottleneck.
Also peel the first loop iteration.

Change-Id: I6a90680369f9c33cdfe14ea547ac1569ec3f50de
2023-05-31 14:34:43 +01:00
Jonathan Wright c738e87f27 Optimize Neon implementation of vpx_int_pro_col
Use widening pairwise addition instructions to halve the number of
additions required.

Change-Id: I0307a3b65e50d2b1ae582938bc5df9c2b21df734
2023-05-31 14:30:02 +01:00
James Zern ad5677eafc Merge changes Ia3647698,I55caf34e,Id2c60f39 into main
* changes:
  vpx_dsp_common.h,clip_pixel: work around VS2022 Arm64 issue
  fdct_partial_neon.c: work around VS2022 Arm64 issue
  fdct8x8_test.cc: work around VS2022 Arm64 issue
2023-05-25 04:54:09 +00:00
James Zern 47fa9804b2 Merge "examples.mk,vpxdec: rm libwebm muxer dependency" into main 2023-05-24 17:43:20 +00:00
Jerome Jiang 31c07211ba Merge "Add IO for TPL stats" into main 2023-05-24 16:27:20 +00:00
James Zern 25f2e1ef25 vpx_dsp_common.h,clip_pixel: work around VS2022 Arm64 issue
cl.exe targeting AArch64 with optimizations enabled
produces invalid code for clip_pixel() when the return type is uint8_t.
See:
https://developercommunity.visualstudio.com/t/Misoptimization-for-ARM64-in-VS-2022-17/10363361

Bug: b/277255076
Bug: webm:1788
Change-Id: Ia3647698effd34f1cf196cd33fa4a8cab9fa53d6
2023-05-23 15:52:09 -07:00
James Zern 95b56ab7df fdct_partial_neon.c: work around VS2022 Arm64 issue
cl.exe targeting AArch64 with optimizations enabled
will fail with an internal compiler error.
See:
https://developercommunity.visualstudio.com/t/Compiler-crash-C1001-when-building-a-for/10346110

Bug: b/277255076
Bug: webm:1788
Change-Id: I55caf34e910dab47a7775f07280677cdfe606f5b
2023-05-23 15:52:05 -07:00
James Zern 62d09a3e94 fdct8x8_test.cc: work around VS2022 Arm64 issue
cl.exe targeting AArch64 with optimizations enabled
produces invalid code in RunExtremalCheck() and RunInvAccuracyCheck().
See:
https://developercommunity.visualstudio.com/t/1770-preview-1:-Misoptimization-for-AR/10369786

Bug: b/277255076
Bug: webm:1788
Change-Id: Id2c60f3948d8f788c78602aea1b5232133415dea
2023-05-23 15:51:56 -07:00
Jerome Jiang d45cc8edda Add IO for TPL stats
Overload TempOutFile constructor to allow IO mode.

Bug: b/281563704

Change-Id: I1f4f5b29db0e331941b6795e478eeeab51f625ad
2023-05-23 17:58:01 -04:00
Jerome Jiang 7a5f328a66 Merge "Add new vpx_tpl.h API file" into main 2023-05-18 17:20:03 +00:00
Yunqing Wang 4bbdd6b046 Merge "Improve convolve AVX2 intrinsic for speed" into main 2023-05-18 15:48:49 +00:00
Jerome Jiang 7e7a1706e3 Add new vpx_tpl.h API file
New file (vpx_tpl.c) in the following CLs will add new APIs dealing with
TPL stats from VP9 encoder.

Change-Id: I5102ef64214cba1ca6ecea9582a19049666c6ca4
2023-05-17 20:43:35 -04:00
Anupam Pandey e6b9a8d667 Improve convolve AVX2 intrinsic for speed
This CL refactors the code related to convolve function.
It also improves the AVX2 intrinsics that compute vertical
convolution for the w = 4 case and horizontal convolution for the
w = 16 case.

Please note the module level scaling w.r.t C function
(timer based) for existing (AVX2) and new AVX2 intrinsics:

Block     Scaling
Size   AVX2       AVX2
     (existing)   (New)
4x4    5.34x      5.91x
4x8    7.10x      7.79x
16x8  23.52x     25.63x
16x16 29.47x     30.22x
16x32 33.42x     33.44x

This is a bit exact change.

Change-Id: If130183bc12faab9ca2bcec0ceeaa8d0af05e413
2023-05-17 14:24:34 +05:30
James Zern 99522d307c Merge changes Ie77ad184,Idfcac43c into main
* changes:
  Add 2D-specific Neon horizontal convolution functions
  Refactor standard bitdepth Neon convolution functions
2023-05-16 00:05:05 +00:00
Jonathan Wright 3e1e38d117 Add 2D-specific Neon horizontal convolution functions
2D 8-tap convolution filtering is performed in two passes -
horizontal and vertical. The horizontal pass must produce enough
input data for the subsequent vertical pass - 3 rows above and 4 rows
below, in addition to the actual block height.

At present, all Neon horizontal convolution algorithms process 4 rows
at a time, but this means we end up doing at least 1 row too much
work in the 2D first pass case where we need h + 7, not h + 8 rows of
output.

This patch adds additional dot-product (SDOT and USDOT) Neon paths
that process h + 7 rows of data exactly, saving the work of the
unnecessary extra row. It is impractical to take a similar approach
for the Armv8.0 MLA paths since we have to transpose the data block
both before and after calling the convolution helper functions.

vpx_convolve_neon performance impact: we observe a speedup of ~9% for
smaller (and wider) blocks, and a speedup of 0-3% for larger blocks.
This is to be expected since the proportion of redundant work
decreases as the block height increases.

Change-Id: Ie77ad1848707d2d48bb8851345a469aae9d097e1
2023-05-13 20:43:20 +01:00
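Note on the row counts in the commit above: a short C sketch of the arithmetic behind "h + 7, not h + 8". It assumes an 8-tap filter and rows processed in groups of 4, as the message states; kTaps, kRowsPerGroup and the function names are hypothetical.

```c
#include <assert.h>

enum { kTaps = 8, kRowsPerGroup = 4 };

/* Rows the horizontal pass must output so an 8-tap vertical pass can
 * produce h rows: h plus kTaps - 1 extra rows (3 above, 4 below). */
static int rows_required(int h) {
  assert(h > 0);
  return h + kTaps - 1;
}

/* Rows actually produced when working in whole groups of kRowsPerGroup. */
static int rows_processed_in_groups(int h) {
  const int needed = rows_required(h);
  return (needed + kRowsPerGroup - 1) / kRowsPerGroup * kRowsPerGroup;
}

/* Rows computed but never used by the vertical pass. */
static int redundant_rows(int h) {
  return rows_processed_in_groups(h) - rows_required(h);
}
```

For h = 4 this gives 11 required rows against 12 processed; for h = 64 it gives 71 against 72. The redundant row is a smaller share of the work as h grows, consistent with the larger speedup reported for smaller blocks.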
James Zern 8adf1be644 Merge "Don't use -Wl,-z,defs with Clang's sanitizers" into main 2023-05-12 19:23:47 +00:00
James Zern 2a9b810d3d Don't use -Wl,-z,defs with Clang's sanitizers
This avoids link errors related to the sanitizers:
https://clang.llvm.org/docs/AddressSanitizer.html#usage
"When linking shared libraries, the AddressSanitizer run-time is not
linked, so -Wl,-z,defs may cause link errors ..."

See also:
https://crbug.com/aomedia/3438

Bug: webm:1801
Fixed: webm:1801
Change-Id: Ie212318005a5f7222e5486775175534025306367
2023-05-12 10:20:54 -07:00
Jonathan Wright 8ecf584321 Refactor standard bitdepth Neon convolution functions
1) Use #define constant instead of magic numbers for right shifts.
2) Move saturating narrow into helper functions that return 4-element
   result vectors.
3) Use mem_neon.h helpers for load/store sequences in Armv8.0 paths.
4) Tidy up: assert conditions and some longer variable names.
5) Prefer != 0 to > 0 where possible for loop termination conditions.

Change-Id: Idfcac43ca38faf729dca07b8cc8f7f45ad264d24
2023-05-12 14:53:51 +01:00
James Zern 9e0fc37f6f configure: add -Wshadow
libraries under third_party/ are out of scope for this change.

Bug: webm:1793
Change-Id: I562065a3c0ea9fdfc9615d1a6b1ae47da79b8ce0
2023-05-09 14:04:19 -07:00
James Zern 894262fb8f Merge "vp8_macros_msa.h: clear -Wshadow warnings" into main 2023-05-09 21:03:31 +00:00
James Zern bf5facce39 Merge changes Iac020280,I8ca8660a into main
* changes:
  gen_msvs_vcxproj: add ARM64EC w/VS >= 2022
  configure: add clang-cl vs1[67] arm64 targets
2023-05-09 20:55:55 +00:00
Yunqing Wang cc1b3886f2 Merge "Add AVX2 intrinsic for vpx_comp_avg_pred() function" into main 2023-05-09 15:57:09 +00:00
Anupam Pandey 457b7f5986 Add AVX2 intrinsic for vpx_comp_avg_pred() function
The module level scaling w.r.t C function (timer based) for
existing (SSE2) and new AVX2 intrinsics:

If ref_padding = 0
Block     Scaling
size    SSE2    AVX2
8x4     3.24x   3.24x
8x8     4.22x   4.90x
8x16    5.91x   5.93x
16x8    1.63x   3.52x
16x16   1.53x   4.19x
16x32   1.38x   4.82x
32x16   1.28x   3.08x
32x32   1.45x   3.13x
32x64   1.38x   3.04x
64x32   1.39x   2.12x
64x64   1.46x   2.24x

If ref_padding = 8
Block     Scaling
size    SSE2    AVX2
8x4     3.20x   3.21x
8x8     4.61x   4.83x
8x16    5.50x   6.45x
16x8    1.56x   3.35x
16x16   1.53x   4.19x
16x32   1.37x   4.83x
32x16   1.28x   3.07x
32x32   1.46x   3.29x
32x64   1.38x   3.22x
64x32   1.38x   2.14x
64x64   1.38x   2.12x

This is a bit-exact change.

Change-Id: I72c5d155f64d0c630bc8c3aef21dc8bbd045d9e6
2023-05-09 16:33:59 +05:30
James Zern fbbe1d0115 vp8_macros_msa.h: clear -Wshadow warnings
Bug: webm:1793
Change-Id: Ia940b06bd23a915a050432e03bb630567e891d8d
2023-05-08 21:44:32 -07:00
James Zern 19ec57e149 Merge "README: update target list" into main 2023-05-08 20:52:52 +00:00
James Zern 2108b7a26f Merge changes Ie165d410,I6d9bb8da,I6858e574 into main
* changes:
  vp8_[cd]x_iface: clear setjmp flag on function exit
  vp9_decodeframe,tile_worker_hook: relocate setjmp=1
  vp9,encoder_set_config: set setjmp flag after setjmp()
2023-05-08 20:52:31 +00:00
Jerome Jiang f5f3a64862 Merge "Add VpxTplGopStats" into main 2023-05-08 19:47:30 +00:00
Jerome Jiang 7ab013f8a9 Merge "Unify implementation of CHECK_MEM_ERROR" into main 2023-05-08 19:47:21 +00:00
Jerome Jiang 62cd0c9c3e Merge "CHECK_MEM_ERROR to return in vp9_set_roi_map" into main 2023-05-08 19:46:44 +00:00
James Zern 3916e0e130 gen_msvs_vcxproj: add ARM64EC w/VS >= 2022
rather than define new targets, add a platform to the arm64 list as they
share the same configuration.

Bug: webm:1788
Change-Id: Iac020280b1103fb12b559f21439aeff26568fba4
2023-05-08 10:53:21 -07:00
James Zern 3fe1365884 configure: add clang-cl vs1[67] arm64 targets
x86 and armv7 are skipped for now as the intrinsics will need different
flags than cl.exe (/arch:... -> -m...).

Bug: webm:1788
Change-Id: I8ca8660a8644cdd84c51cb1f75005e371ba8207d
2023-05-08 10:53:21 -07:00
Jerome Jiang 745c6392f7 Add VpxTplGopStats
Contains the size of GOP - also the size of the list of TPL stats for
each frame in this GOP.

VpxTplGopStats will be the unit for VP9E_GET_TPL_STATS control to return
TPL stats from the encoder.

Bug: b/273736974
Change-Id: I1682242fc6db4aafcd6314af023aa0d704976585
2023-05-08 13:27:26 -04:00
Jerome Jiang 1710c9282a Unify implementation of CHECK_MEM_ERROR
There were multiple implementations of CHECK_MEM_ERROR across the
library that take different arguments and used in different places.

This CL will unify them and have only one implementation that takes
vpx_internal_error_info.

Change-Id: I2c568639473815bc00b1fc2b72be56e5ccba1a35
2023-05-08 13:27:24 -04:00
Jerome Jiang 75f9551efb CHECK_MEM_ERROR to return in vp9_set_roi_map
Also change the return type of vp9_set_roi_map to vpx_codec_err_t

Change-Id: I60d9ff45f2d3dfc44cd6e2aab2cb1ba389ff15f3
2023-05-08 13:25:36 -04:00
James Zern b14d20b470 examples.mk,vpxdec: rm libwebm muxer dependency
vpxdec only requires the parser.

Change-Id: I54ead453d4af400ca5c3412a3211d6d0b1383046
2023-05-06 15:48:58 -07:00
James Zern 4818f997fe Merge "vp9_encoder: clear -Wshadow warning" into main 2023-05-06 02:26:55 +00:00
James Zern 3d57fb69af README: update target list
Change-Id: If2d5811a55f6bb60eeba7d28b69c78157a17e87f
2023-05-05 19:12:27 -07:00
Jerome Jiang 905f991acd Merge "Set setjmp flag in VP9 RTC rate control library" into main 2023-05-05 23:02:14 +00:00
Jerome Jiang 5636f098b3 Set setjmp flag in VP9 RTC rate control library
Change-Id: Ic5ec8dc7d9637091d4137a47d793cf29e76fdc45
2023-05-05 15:41:33 -04:00
James Zern 497f246d29 sixtap_filter_msa.c: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I5f9c09f31b06fecc123c6a9d01f5fbed39142356
2023-05-05 12:26:13 -07:00
James Zern 662af59716 Merge "macros_msa.h: clear -Wshadow warnings" into main 2023-05-05 19:15:29 +00:00
James Zern 851a76ff65 vp8_[cd]x_iface: clear setjmp flag on function exit
in vp8e_encode, also move the setjmp() call closer to setting
the flag.

Change-Id: Ie165d4100b84776f9c34eddcf64657bd78cce4f5
2023-05-05 11:18:08 -07:00
James Zern eb7014c80c vp9_decodeframe,tile_worker_hook: relocate setjmp=1
after the call to setjmp(); this is more correct and consistent with
other code.

Change-Id: I6d9bb8daad6a959bfe4f25484f9d6664b99da19e
2023-05-05 11:03:19 -07:00
James Zern b030d033b8 vp9,encoder_set_config: set setjmp flag after setjmp()
Change-Id: I6858e574d24aaff64f725404706f58e04e43717d
2023-05-05 11:01:50 -07:00
James Zern e2f217c075 Merge changes I8089e90a,I46890224,I1b0e090d into main
* changes:
  Overwrite cm->error->detail before freeing
  Have vpx_codec_error take const vpx_codec_ctx_t *
  Add comments about vpx_codec_enc_init_ver failure
2023-05-05 17:26:11 +00:00
James Zern b3920105c3 Merge "vpx_subpixel_8t_intrin_avx2,cosmetics: shorten long comment" into main 2023-05-05 16:47:28 +00:00
James Zern 28c5d70650 vp9_encoder: clear -Wshadow warning
with --enable-experimental --enable-rate-ctrl

Bug: webm:1793
Change-Id: I9ca664538bcf0c2aca8aea73283bbb0232eb86e9
2023-05-05 09:46:53 -07:00
James Zern c85b7331a5 macros_msa.h: clear -Wshadow warnings
Bug: webm:1793
Change-Id: Ib2e3bd3c52632cdd4410cb2c54d69750e64e5201
2023-05-05 09:25:52 -07:00
Yunqing Wang 17f1a23f55 Merge "Add AVX2 intrinsic for idct16x16 and idct32x32 functions" into main 2023-05-05 15:33:21 +00:00
Anupam Pandey 255ee18885 Add AVX2 intrinsic for idct16x16 and idct32x32 functions
Added AVX2 intrinsic optimization for the following functions
1. vpx_idct16x16_256_add
2. vpx_idct32x32_1024_add
3. vpx_idct32x32_135_add

The module level scaling w.r.t C function (timer based) for
existing (SSE2) and new AVX2 intrinsics:
                            Scaling
   Function Name         SSE2      AVX2
vpx_idct32x32_1024_add  3.62x     7.49x
vpx_idct32x32_135_add   4.85x     9.41x
vpx_idct16x16_256_add   4.82x     7.70x

This is a bit-exact change.

Change-Id: Id9dda933aa1f5093bb6b35ac3b8a41846afca9d2
2023-05-05 15:55:16 +05:30
Wan-Teh Chang 3d6b86e704 Overwrite cm->error->detail before freeing
Help detect use after free of the return value of
vpx_codec_error_detail(). If vpx_codec_error_detail() is called after
vpx_codec_encode() fails, the return value may be equal to
cm->error->detail, which is freed when vpx_codec_destroy() is called.

Document the lifetime of the string returned by
vpx_codec_error_detail().

Change-Id: I8089e90a4499b4f3cc5b9cfdbb25d72368faa319
2023-05-04 22:08:21 -07:00
Wan-Teh Chang 8e47341b0e Have vpx_codec_error take const vpx_codec_ctx_t *
Also have vpx_codec_error_detail take const vpx_codec_ctx_t *. Both functions
are getter functions that don't modify the codec context.

Change-Id: I4689022425efbf7b1da5034255ac052fce5e5b4f
2023-05-04 22:08:21 -07:00
Wan-Teh Chang 601a98b154 Add comments about vpx_codec_enc_init_ver failure
Address the questions:
1. If vpx_codec_enc_init_ver() fails, should I still call
   vpx_codec_destroy() on the encoder context?
2. Is it safe to call vpx_codec_error_detail() when
   vpx_codec_enc_init_ver() failed?

Change-Id: I1b0e090d11dd9f853fe203f4cbb6080c3c7b0506
2023-05-04 22:08:21 -07:00
James Zern 4e23e7abfe vpx_subpixel_8t_intrin_avx2,cosmetics: shorten long comment
Change-Id: I8badedc2ad07d60896e45de28b707ad9f6c4d499
2023-05-04 17:17:10 -07:00
Jerome Jiang 3580bc559a Merge "Add num_blocks to VpxTplFrameStats" into main 2023-05-04 18:06:10 +00:00
Jerome Jiang bd3a5ae3ea Merge "Add Vpx* prefix to Tpl{Block,Frame}Stats" into main 2023-05-04 18:00:51 +00:00
Chi Yo Tsai 4379041094 Merge changes I226215a2,Ia4918eb0,If6219446,Ibf00a6e1,I900a0a48 into main
* changes:
  Fix mismatched param names in vpx_dsp/x86/sad4d_avx2.c
  Fix mismatched param names in vpx_dsp/arm/highbd_sad4d_neon.c
  Fix mismatched param names in vpx_dsp/arm/sad4d_neon.c
  Fix mismatched param names in vpx_dsp/arm/highbd_avg_neon.c
  Fix clang warning on const-qualification of parameters
2023-05-04 17:04:17 +00:00
Jerome Jiang 2e5261647f Add num_blocks to VpxTplFrameStats
I realized the calculation of the size of the list of VpxTplBlockStats
is non-trivial. So it's better to add the field for the size.

Bug: b/273736974
Change-Id: Ic1b50597c1f89a8f866b5669ca676407be6dc9d8
2023-05-04 10:59:46 -04:00
Jerome Jiang f059f9ee2d Add Vpx* prefix to Tpl{Block,Frame}Stats
This is to avoid symbol redefinition when integrating with other
libraries.

Bug: b/273736974
Change-Id: I891af78b1907504d5bb9f735164aea18c2aba944
2023-05-04 10:33:07 -04:00
James Zern 4dd3afc00e Merge changes I4d26f5f8,I12e25710 into main
* changes:
  s/__aarch64__/VPX_ARCH_AARCH64/
  configure: add aarch64 to ARCH_LIST
2023-05-04 02:16:12 +00:00
Jerome Jiang 69d5d16552 Merge "Add codec control to export TPL stats" into main 2023-05-04 01:55:41 +00:00
Jerome Jiang de45e4b612 Add codec control to export TPL stats
new codec control: VP9E_GET_TPL_STATS with unit test

Bug: b/273736974
Change-Id: I27343bd3f6dffafc86925234537bcdb557bc4079
2023-05-03 19:16:24 -04:00
chiyotsai 2c03388231 Fix mismatched param names in vpx_dsp/x86/sad4d_avx2.c
Change-Id: I226215a2ff8798b72abe0c2caf3d18875595caa5
2023-05-03 14:45:13 -07:00
chiyotsai 174e782fe5 Fix mismatched param names in vpx_dsp/arm/highbd_sad4d_neon.c
Change-Id: Ia4918eb0bac3b28b27e1ef205b9171680b2eb9a4
2023-05-03 14:44:08 -07:00
chiyotsai 701392c1b0 Fix mismatched param names in vpx_dsp/arm/sad4d_neon.c
Change-Id: If621944684cf9bb9f353db5961ed8b4b4ae38f24
2023-05-03 14:12:41 -07:00
chiyotsai 8782fd070d Fix mismatched param names in vpx_dsp/arm/highbd_avg_neon.c
Change-Id: Ibf00a6e1029284e637b10ef01ac9b31ffadc74ca
2023-05-03 14:12:19 -07:00
chiyotsai 3dbadd1b83 Fix clang warning on const-qualification of parameters
Change-Id: I900a0a48dde5fcb262157b191ac536e18269feb3
2023-05-03 14:12:12 -07:00
James Zern a398b60d6c fdct8x8_test: EXPECT_* -> ASSERT_*
This avoids unnecessary logging when a block has multiple errors.

Change-Id: If0f3e6f8ff5bd284655f7cabfd23c253c93d44c5
2023-05-03 10:09:03 -07:00
James Zern 57b9afa58f s/__aarch64__/VPX_ARCH_AARCH64/
This allows AArch64 to be correctly detected when building with Visual
Studio (cl.exe) and fixes a crash in vp9_diamond_search_sad_neon.c.
There are still test failures, however.

Microsoft's compiler doesn't define __ARM_FEATURE_*. To use those paths
we may need to rely on _M_ARM64_EXTENSION.

Bug: webm:1788
Bug: b/277255076
Change-Id: I4d26f5f84dbd0cbcd1cdf0d7d932ebcf109febe5
2023-05-03 10:04:34 -07:00
James Zern 33aba6ecc1 configure: add aarch64 to ARCH_LIST
This will allow identifying Windows Visual Studio targets as aarch64;
the Microsoft compiler does not define __aarch64__.

An alternative would be to define this in the code, checking for
_M_ARM64 or _M_ARM64EC. For now we'll use the existing VPX_ARCH_*
system. For compatibility VPX_ARCH_ARM will continue to be defined to 1
in this case.

Bug: webm:1788
Bug: b/277255076
Change-Id: I12e25710891e86f0c7339ba96884c18ed90ba16f
2023-05-02 17:43:39 -07:00
Jerome Jiang 84a180fe85 Move TplFrameStats to public header
Get ready for changes to follow:

- Custom reader/writer IO functions
- Codec control to get TPL stats from the encoder

Move the definition of TplFrameStats to public header so applications
can use them directly.

Bug: b/273736974
Change-Id: Ieb0db4560ddd966df1bc01f6a7e179cc97f9bac1
2023-05-01 13:39:01 -04:00
Jerome Jiang dbb1e8c7a6 Clean up a stale TODO in tpl
Change-Id: Ieccaff1cc94cbb2c5a294d83f3080f7407267016
2023-04-27 15:58:08 -04:00
James Zern 1a3e5567f2 Merge "register_state_check: clear -Wshadow warning" into main 2023-04-25 20:34:13 +00:00
James Zern 97d40abf9a Merge "highbd_vpx_convolve8_neon: clear -Wshadow warning" into main 2023-04-25 20:21:24 +00:00
James Zern 59d40c1415 Merge "vp9_highbd_iht16x16_add_neon: clear -Wshadow warning" into main 2023-04-25 20:12:30 +00:00
Yunqing Wang 52076a9c79 Merge "Reduce joint motion search iters based on bsize" into main 2023-04-24 21:15:14 +00:00
Neeraj Gadgil e7b58b69fd Reduce joint motion search iters based on bsize
Joint motion search during compound mode eval is optimized by
reducing the number of mv search iterations based on bsize.
The sf 'comp_inter_joint_search_thresh' is renamed as
'comp_inter_joint_search_iter_level' and used to add the logic.

cpu  Testset  Instr. Cnt     BD Rate loss (%)
               Red (%)   avg. psnr  ovr.psnr    ssim
 0   LOWRES2    5.373     0.0917     0.1088    0.0294
 0   MIDRES2    3.395     0.0239     0.0520    0.0783
 0    HDRES2    2.291     0.0223     0.0301    0.0053
 0   Average    3.686     0.0460     0.0636    0.0377

STATS_CHANGED

Change-Id: I7ee8873ebc8af967382324ae8f5c70c26665d5e6
2023-04-24 10:40:56 +05:30
Jerome Jiang 24802201ac Reland "Calculate recrf_dist and recrf_rate"
This is a reland of commit 3c59378e4e

Addressed issues from the previous CL:

- Both recon_error and rate_cost are scaled up
- recon_error and rate_cost are not accumulated across ref frames,
  instead they are calculated with the best ref frame picked.
- get_quantize_error() is put where it was, so there is no behavior
  change for vp9.

Bug: b/273736974

Original change's description:
> Calculate recrf_dist and recrf_rate
>
> Change-Id: I74e74807436b92d729e2ccaab96149780f1f52d9

Change-Id: I20e1f5543e83b576a074bd4e6b44d99da65f4b56
2023-04-21 19:18:37 -04:00
James Zern fed3de997c highbd_vpx_convolve8_neon: clear -Wshadow warning
Bug: webm:1793
Change-Id: If1a46fe183cd18e05b5538b1eba098e420b745ec
2023-04-21 13:07:04 -07:00
James Zern ec2a75ce9c vp9_highbd_iht16x16_add_neon: clear -Wshadow warning
Bug: webm:1793
Change-Id: I4e79a4d7d41b6abf88e3e60c54ab48a92b0346d2
2023-04-21 13:06:07 -07:00
Jerome Jiang a425371ccd Revert "Calculate recrf_dist and recrf_rate"
This reverts commit 3c59378e4e.

Reason for revert:

recon_error and recon_rate are summed by mistake across reference frames, as pointed out by Angie.

It could also cause vp9 behavior changes.

Original change's description:
> Calculate recrf_dist and recrf_rate
>
> Change-Id: I74e74807436b92d729e2ccaab96149780f1f52d9

Change-Id: I6106ce77cb0fe8c12b2bcf070d01513ffa8dc613
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2023-04-21 18:10:46 +00:00
Jerome Jiang 3c59378e4e Calculate recrf_dist and recrf_rate
Change-Id: I74e74807436b92d729e2ccaab96149780f1f52d9
2023-04-20 20:20:46 -04:00
Jerome Jiang 85fe280196 Merge "Store tpl stats before propagation" into main 2023-04-21 00:16:02 +00:00
James Zern b27cf67c30 register_state_check: clear -Wshadow warning
with --target=x86_64-win64-gcc

Bug: webm:1793
Change-Id: I265533af4e8d05adbe1d66a62b6dcb191ca48747
2023-04-20 14:18:26 -07:00
James Zern a7a9983314 Merge "configure: skip arm64_neon.h workaround w/VS >= 2019" into main 2023-04-20 21:11:09 +00:00
James Zern 885a533482 Merge "vp9_tpl_model: clear -Wshadow warning" into main 2023-04-20 19:51:56 +00:00
Jerome Jiang f49879a2a3 Store tpl stats before propagation
Add two new structs TplBlockStats and TplFrameStats to store tpl stats
before propagation

Change-Id: I903db99326b199ed8f2d8b19ccb973a8c8910501
2023-04-20 15:24:18 -04:00
James Zern f7d5c3eff8 configure: skip arm64_neon.h workaround w/VS >= 2019
Visual Studio 2019+ includes arm64_neon.h from arm_neon.h

Bug: b/277255076
Change-Id: I52f42b69a5efe8214a4c541b68e940ad07499584
2023-04-20 12:20:23 -07:00
James Zern 5248bf6be5 Merge "vp9_spatial_svc_encoder: quiet -Wunused-but-set-variable" into main 2023-04-20 16:56:29 +00:00
James Zern 76c8f5dbcf Merge "vp9_ratectrl,vp9_encodedframe_overshoot: rm unused var" into main 2023-04-20 16:56:13 +00:00
James Zern 57e48bc8f7 Merge "onyx_if,encode_frame_to_data_rate: rm unused var" into main 2023-04-20 16:55:59 +00:00
James Zern 4cf48ce099 Merge "libs.mk: quote $(LIBVPX_TEST_DATA_PATH)" into main 2023-04-20 16:55:38 +00:00
James Zern e8fa7a038b libs.mk: quote $(LIBVPX_TEST_DATA_PATH)
This allows the testdata target to work in environments like cygwin/msys
when a windows style path is used. It may also fix using paths with
spaces, though that's not generally recommended.

Change-Id: Id444c14468b05d589bce49c1f612aa712a3f0c8c
2023-04-19 18:58:59 -07:00
James Zern 4366ff7222 vp9_spatial_svc_encoder: quiet -Wunused-but-set-variable
with clang-17. Move frames_received under OUTPUT_FRAME_STATS; it's only
used in a printf.

Change-Id: Idfdd59ccd04e43df1855203db82bb4c8a1d059fb
2023-04-19 18:46:11 -07:00
James Zern 895317cdf1 vp9_ratectrl,vp9_encodedframe_overshoot: rm unused var
quiets -Wunused-but-set-variable with clang-17

Change-Id: I5212a20286d0252e45a8e8813d15cb780494b0ad
2023-04-19 18:46:05 -07:00
James Zern 84b4dfa5ba vp9_encodeframe: rm unused vars
in get_rdmult_delta() and compute_frame_aq_offset().

quiets -Wunused-but-set-variable with clang-17

Change-Id: I726852f3bc42afa80a18475de910040a9436b0bb
2023-04-19 18:46:00 -07:00
James Zern 933cf345dd onyx_if,encode_frame_to_data_rate: rm unused var
quiets -Wunused-but-set-variable with clang-17

Change-Id: Ia819beac84cbd57f4eeca6174c785fd320bc40c6
2023-04-19 18:45:53 -07:00
James Zern 860f245de9 Merge changes Ib0c2f852,Ieb77661e,I56ea656e,Ibda734c2 into main
* changes:
  Add Neon implementations of vpx_highbd_sad_skip_<w>x<h>x4d
  Add Neon implementation of vpx_sad_skip_<w>x<h>x4d functions
  Add Neon implementation of vpx_highbd_sad_skip_<w>x<h> functions
  Add Neon implementation of vpx_sad_skip_<w>x<h> functions
2023-04-19 23:17:10 +00:00
Jonathan Wright ab830fe6a1 Add Neon implementations of vpx_highbd_sad_skip_<w>x<h>x4d
Add Neon implementations of high bitdepth downsampling SAD4D
functions for all block sizes.

Also add corresponding unit tests.

Change-Id: Ib0c2f852e269cbd6cbb8f4dfb54349654abb0adb
2023-04-19 00:57:25 +01:00
Jonathan Wright 42c0cbb9cb Add Neon implementation of vpx_sad_skip_<w>x<h>x4d functions
Add Neon implementations of standard bitdepth downsampling SAD4D
functions for all block sizes.

Also add corresponding unit tests.

Change-Id: Ieb77661ea2bbe357529862a5fb54956e34e8d758
2023-04-19 00:57:18 +01:00
Jonathan Wright 05b244af52 Add Neon implementation of vpx_highbd_sad_skip_<w>x<h> functions
Add Neon implementations of high bitdepth downsampling SAD functions
for all block sizes.

Also add corresponding unit tests.

Change-Id: I56ea656e9bb5f8b2aedfdc4637c9ab4e1951b31b
2023-04-19 00:57:08 +01:00
Jonathan Wright 7b7f84fe14 Add Neon implementation of vpx_sad_skip_<w>x<h> functions
Add Neon implementations of standard bitdepth downsampling SAD
functions for all block sizes.

Also add corresponding unit tests.

Change-Id: Ibda734c270278d947673ffcc29ef17a2f4970b01
2023-04-19 00:56:43 +01:00
James Zern e85f9003be Merge "mr_dissim: clear -Wshadow warning" into main 2023-04-18 23:51:04 +00:00
James Zern 873fd58973 Merge "onyx_if: clear -Wshadow warning" into main 2023-04-18 19:24:35 +00:00
James Zern 1dda358cb0 Merge "vp9_rdcost: clear -Wshadow warnings" into main 2023-04-18 19:19:45 +00:00
Yunqing Wang 3d7358796d Merge "Downsample SAD computation in motion search" into main 2023-04-18 16:11:01 +00:00
James Zern d725bdd8a1 vp9_tpl_model: clear -Wshadow warning
with --enable-experimental --enable-non-greedy-mv

Bug: webm:1793
Change-Id: I19e38d7196291ae1ffbb5fb3daa70a4fefd54c55
2023-04-17 22:09:17 -07:00
James Zern eef765751a mr_dissim: clear -Wshadow warning
Bug: webm:1793
Change-Id: I73ced43aba45215264134f917fd69ab0b1f10d01
2023-04-17 22:09:09 -07:00
James Zern 7bdce0887b onyx_if: clear -Wshadow warning
with --enable-internal-stats

Bug: webm:1793
Change-Id: I9d375e4cb45f78b82afe455f2c7ad2b56e217f7d
2023-04-17 22:09:02 -07:00
Yunqing Wang 8f14f66490 Merge "Add AVX2 intrinsic for vpx_fdct16x16() function" into main 2023-04-17 20:21:21 +00:00
Anupam Pandey e15c2e3445 Add AVX2 intrinsic for vpx_fdct16x16() function
Introduced AVX2 intrinsic to compute FDCT for block size
16x16 case. This is a bit-exact change.

Please check the module level scaling w.r.t C function (timer based)
for existing (SSE2) and new AVX2 intrinsics:

   Scaling
SSE2      AVX2
3.88x     5.95x

Change-Id: I02299c3746fcb52d808e2a75d30aa62652c816dc
2023-04-17 15:23:51 +05:30
James Zern bdba4591a7 vp9_rdcost: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I6d48038d74e510ecb5773dfffbdc4c10b765c2aa
2023-04-14 10:25:58 -07:00
Jerome Jiang 27171320f5 Merge "Add VP8RateControlRTC::GetLoopfilterLevel" into main 2023-04-14 17:25:07 +00:00
James Zern 11d6f069ee Merge "libs.mk: Fix wrong scope end comments" into main 2023-04-14 17:22:47 +00:00
L. E. Segovia dca0a8b860 libs.mk: Fix wrong scope end comments
I believe the following comments are wrongly scoped, possibly left over
from previous changesets. This made me very confused when reading the
test suite Makefile, in order to port it to Meson.

Change-Id: Ice3c7ba50c6909a9c7dfd4001afa1e1ddfa4b5ce
2023-04-14 10:41:20 +00:00
Jerome Jiang 536c986764 Add VP8RateControlRTC::GetLoopfilterLevel
New linear model to calculate loopfilter level from frame qp.

Linear regression was done on qvga, vga, and hd clips.

Bug: b/275304642
Change-Id: I552b312212bb4de21b53b762d139aa9588c64ae2
2023-04-13 17:02:51 -04:00
James Zern dfe285a6b9 Merge "vp9_frame_scale_ssse3: clear -Wshadow warnings" into main 2023-04-13 20:59:43 +00:00
James Zern e3c50aa072 Merge changes I2a26c929,I0b7f0136,Ib65a2dff into main
* changes:
  vpxenc: clear -Wshadow warnings
  vpxdec: clear -Wshadow warnings
  svc_encodeframe: clear -Wshadow warnings
2023-04-13 18:35:49 +00:00
James Zern ec2993d549 Merge changes I571a9d64,I22db73cb into main
* changes:
  dct_test: clear -Wshadow warnings
  convolve_test: clear -Wshadow warning
2023-04-13 18:35:21 +00:00
James Zern 329fa7009a Merge "vp9_pickmode: clear -Wshadow warnings" into main 2023-04-13 18:35:10 +00:00
James Zern 6c65608253 vpxenc: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I2a26c9297016d3fa2c32e8974ef3d7dab1e524c4
2023-04-12 14:57:28 -07:00
James Zern 556e4f6cad vpxdec: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I0b7f013682229cde50df7c62db9dab6eab0fd341
2023-04-12 14:57:28 -07:00
James Zern a3eb39ab6f svc_encodeframe: clear -Wshadow warnings
Bug: webm:1793
Change-Id: Ib65a2dff124034d8e653572f8ada65984e55ed70
2023-04-12 14:57:28 -07:00
James Zern 968960c7b3 dct_test: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I571a9d641b2f7f4b9d7c473ca815d4ea10b9f9af
2023-04-12 14:57:13 -07:00
James Zern 698eb779f2 convolve_test: clear -Wshadow warning
Bug: webm:1793
Change-Id: I22db73cb756c6c680b73684caef1e08bb6e729d8
2023-04-12 14:57:13 -07:00
James Zern ff4123215d vp9_frame_scale_ssse3: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I85608ac7bb6d3a61649ba342c13c3bf6a39a5dea
2023-04-12 14:56:16 -07:00
James Zern 39a6b6c136 vp9_temporal_filter: clear -Wshadow warnings
Bug: webm:1793
Change-Id: Ia681ce636ae99f95b875ee1b0189bc6fa66a7608
2023-04-12 14:56:00 -07:00
James Zern 2513f6d5f4 vp9_svc_layercontext: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I63669de9835713ec70dafa88ca8f2c2459e59698
2023-04-12 14:56:00 -07:00
James Zern aaffc6e306 vp9_pickmode: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I26c063818144d11c4c91165c3fcbf6f258453cc7
2023-04-12 14:55:45 -07:00
James Zern f254e6da84 vp9_speed_features: clear -Wshadow warning
Bug: webm:1793
Change-Id: I9f509c4461631e358f80b98afbb745ce88e9d7a2
2023-04-11 21:47:10 -07:00
James Zern bde26b9961 vp9_ratectrl: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I2476a9d8e1d62414fdbe6feee87d5167058f499b
2023-04-11 21:47:10 -07:00
James Zern e3c458149c vp9_mbgraph: clear -Wshadow warnings
Bug: webm:1793
Change-Id: Ibffb62775f09922d37f7d0460aa2751e74c36738
2023-04-11 19:16:28 -07:00
James Zern cd2ec5c3df Merge "vp9_quantize_avx2,highbd_get_max_lane_eob: fix mask" into main 2023-04-11 18:40:00 +00:00
Yunqing Wang 5f615708e4 Merge "Add assert to ensure NEARESTMV or NEWMV modes are not skipped" into main 2023-04-11 18:35:10 +00:00
Yunqing Wang 23d37b3d04 Merge "Avoid redundant start MV SAD calculation" into main 2023-04-11 18:31:25 +00:00
Deepa K G 232f8659aa Downsample SAD computation in motion search
Added a speed feature to skip every other row
in SAD computation during motion search.

                 Instruction Count        BD-Rate Loss(%)
cpu   Resolution   Reduction(%)    avg.psnr   ovr.psnr    ssim
 0       LOWRES2      0.958         0.0204     0.0095    0.0275
 0       MIDRES2      1.891        -0.0636     0.0032    0.0247
 0        HDRES2      2.869         0.0434     0.0345    0.0686
 0       Average      1.905         0.0000     0.0157    0.0403

STATS_CHANGED

Change-Id: I1a8692757ed0cbcb2259729b3ecfb0436cdf49ce
2023-04-11 19:11:51 +05:30
Cherma Rajan A 35c32b1d22 Add assert to ensure NEARESTMV or NEWMV modes are not skipped
Added an assert for prune_single_mode_based_on_mv_diff_mode_rate
speed feature. This ensures NEARMV or ZEROMV modes are pruned
only when NEARESTMV and NEWMV modes are not early terminated.

Change-Id: Id8b03eef6d1ef3f16714a9cbfde0c171c0c6fe0b
2023-04-11 17:15:08 +05:30
Deepa K G 987ed6937b Avoid redundant start MV SAD calculation
Avoided repeated calculation of start MV
SAD during full pixel motion search.

                 Instruction Count
cpu   Resolution   Reduction(%)
 0       LOWRES2      0.162
 0       MIDRES2      0.246
 0        HDRES2      0.325
 0       Average      0.245

Change-Id: I2b4786901f254ce32ee8ca8a3d56f1c9f112f1d4
2023-04-11 17:07:06 +05:30
James Zern 61709a177a vp9_quantize_avx2,highbd_get_max_lane_eob: fix mask
Pack nz_mask with zero. After the result is permuted this has the effect
of ignoring the upper half of the iscan register which is only loaded
with 128 bits. Depending on the optimization level and the load used the
upper half of the ymm register may contain undefined values which can
produce an incorrect eob. If this is large enough it can cause a crash.

Bug: chromium:1431729
Change-Id: I4ebae9fa39f228bdd29dcc19935f3f07759d75f5
2023-04-10 14:47:06 -07:00
Yunqing Wang 31b6d12892 Merge "Add AVX2 intrinsic for variance function for block width 8" into main 2023-04-10 18:50:09 +00:00
Yunqing Wang bd53ceef3a Merge "Prune single ref modes based on mv difference and mode rate" into main 2023-04-10 17:01:19 +00:00
James Zern 6fd360c684 Merge "Optimize Armv8.0 Neon SAD4D 16xh, 32xh, and 64xh functions" into main 2023-04-07 22:19:18 +00:00
James Zern 12ab4af3ae vp9_dx_iface: clear -Wshadow warnings
Bug: webm:1793
Change-Id: Ice6cd08f145e5813e24345d03e0913e5eda5289f
2023-04-06 15:37:40 -07:00
James Zern bebc860915 vp9_encoder: clear -Wshadow warning
Bug: webm:1793
Change-Id: Id390c61f82b9f15063d0310a2c252b02b479d9c5
2023-04-06 15:37:26 -07:00
James Zern 868674d330 vpx_subpixel_8t_intrin_avx2: clear -Wshadow warning
Bug: webm:1793
Change-Id: Icba4ad242dcd0cad736b9a203829361c5bd1ca3f
2023-04-06 12:57:23 -07:00
James Zern 3c0c01357f Merge "Optimize Neon paths of high bitdepth SAD and SAD4d for 8xh blocks" into main 2023-04-06 17:50:00 +00:00
Jonathan Wright ff8a965856 Optimize Armv8.0 Neon SAD4D 16xh, 32xh, and 64xh functions
Add a widening 4D reduction function operating on uint16x8_t vectors
and use it to optimize the final reduction in Armv8.0 Neon standard
bitdepth 16xh, 32xh and 64xh SAD4D computations.

Also simplify the Armv8.0 Neon version of the sad64xhx4d_neon helper
function since VP9 block sizes are not large enough to require
widening to 32-bit accumulators before the final reduction.

Change-Id: I32b0a283d7688d8cdf21791add9476ed24c66a28
2023-04-06 17:41:01 +01:00
Jonathan Wright a5801b00a8 Optimize 4D Neon reduction for 4xh and 8xh SAD4D blocks
Add a 4D reduction function operating on uint16x8_t vectors and use
it to optimize the final reduction in standard bitdepth 4xh and 8xh
SAD4D computations. Similar 4D reduction optimizations have already
been implemented for all other standard bitdepth block sizes, and all
high bitdepth block sizes.[1]

[1] https://chromium-review.googlesource.com/c/webm/libvpx/+/4224681

Change-Id: I0aa0b6e0f70449776f316879cafc4b830e86ea51
2023-04-04 14:52:52 +01:00
Anupam Pandey e2465dfc25 Add AVX2 intrinsic for variance function for block width 8
Added AVX2 intrinsic optimization for the following functions
1. vpx_variance8x4
2. vpx_variance8x8
3. vpx_variance8x16

This is a bit-exact change.

                 Instruction Count
cpu   Resolution   Reduction(%)
 0       LOWRES2      0.698
 0       MIDRES2      0.577
 0        HDRES2      0.469
 0       Average      0.582

Change-Id: Iae8fdf9344fd012cda4955ed140633141d60ba86
2023-04-04 15:36:22 +05:30
James Zern 0f42bd3fb8 Merge changes Idaf49de6,I6d7d96ff,I0d64c923 into main
* changes:
  svc_datarate_test: clear -Wshadow warning
  vp9_mcomp.c: clear -Wshadow warnings
  vp9_rc_get_second_pass_params: clear -Wshadow warning
2023-03-30 22:44:51 +00:00
Cherma Rajan A 1025d37b03 Prune single ref modes based on mv difference and mode rate
This patch introduces a speed feature to prune single reference
modes - NEARMV and ZEROMV based on motion vector difference and
mode rate w.r.t previously evaluated single reference modes
corresponding to the same reference frame.

                Instruction Count        BD-Rate Loss(%)
cpu   Resolution   Reduction(%)    avg.psnr   ovr.psnr    ssim
 0       LOWRES2      1.686        -0.0039    -0.0105   -0.0098
 0       MIDRES2      1.026        -0.0234     0.0029    0.0120
 0        HDRES2      0.000         0.0000     0.0000    0.0000
 0       Average      0.889        -0.0091    -0.0025    0.0007

STATS_CHANGED

Change-Id: I387acd3a73d8256904a7ce684b198d251cf3dd04
2023-03-30 14:37:06 +05:30
George Steed a257b4d6be Avoid vshr and vget_{low,high} in Neon d135 predictor impl
The shift instructions have marginally worse performance on some
micro-architectures, and the vget_{low,high} instructions are
unnecessary.

This commit improves performance of the d135 predictors by 1.5% geomean
averaged across a range of compilers and micro-architectures.

Change-Id: Ied4c3eecc12fc973841696459d868ce403ed4e6c
2023-03-30 09:00:26 +00:00
George Steed c1c7dd3138 Use sum_neon.h helpers in Neon DC predictors
Use sum_neon.h helpers for horizontal reductions in Neon DC predictors,
enabling use of dedicated Neon reduction instructions on AArch64. Some
of the surrounding code is also optimized to remove redundant broadcast
instructions in the dc_store helpers.

Performance is largely unchanged on both the standard as well as the
high bit-depth predictors. The main improvement appears to be the 16x16
standard-bitdepth dc predictor, which improves by 10-15% when
benchmarked on Neoverse N1.

Change-Id: Ibfcc6ecf4b1b2f87ce1e1f63c314d0cc35a0c76f
2023-03-30 09:00:19 +00:00
James Zern 01d282ac95 Merge changes Ie4ffa298,If5ec220a,I670dc379 into main
* changes:
  Avoid LD2/ST2 instructions in highbd v predictors in Neon
  Avoid interleaving loads/stores in Neon for highbd dc predictor
  Avoid LD2/ST2 instructions in vpx_dc_predictor_32x32_neon
2023-03-29 20:52:46 +00:00
Jerome Jiang e47676c11c Merge "svc: Fix a case where target bandwidth is 0" into main 2023-03-29 18:24:18 +00:00
Jerome Jiang 0f893ea0b6 svc: Fix a case where target bandwidth is 0
Bug: webrtc:15033
Change-Id: Iea2997c2ce8982f106a1eed3ec4f7dd1c6e83666
2023-03-29 13:06:19 -04:00
Salome Thirot cf1efecebf Optimize Neon paths of high bitdepth SAD and SAD4d for 8xh blocks
For these block sizes there is no need to widen to 32-bits until the
final reduction, so use a single vabaq instead of vabd + vpadalq.

Change-Id: I9c19d620f7bb8b3a6b0bedd37789c03bb628b563
2023-03-29 16:50:34 +01:00
George Steed 9824167ad2 Avoid LD2/ST2 instructions in highbd v predictors in Neon
The interleaving load/store instructions (LD2/LD3/LD4 and ST2/ST3/ST4)
are useful if we are dealing with interleaved data (e.g. real/imag
components of complex numbers), but for simply loading or storing larger
quantities of data it is preferable to simply use the normal load/store
instructions.

This patch replaces such occurrences in the two larger block sizes:
vpx_highbd_v_predictor_16x16_neon and vpx_highbd_v_predictor_32x32_neon.

Change-Id: Ie4ffa298a2466ceaf893566fd0aefe3f66f439e4
2023-03-29 08:39:35 +00:00
George Steed 83def747ff Avoid interleaving loads/stores in Neon for highbd dc predictor
The interleaving load/store instructions (LD2/LD3/LD4 and ST2/ST3/ST4)
are useful if we are dealing with interleaved data (e.g. real/imag
components of complex numbers), but for simply loading or storing larger
quantities of data it is preferable to simply use two or more of the
normal load/store instructions.

This patch replaces such occurrences in the two larger block sizes:
vpx_highbd_dc_predictor_16x16_neon, vpx_highbd_dc_predictor_32x32_neon,
and related helper functions.

Speedups over the original Neon code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 | 16x16 |    1.25
Neoverse N1 |  LLVM 15 | 32x32 |    1.13
Neoverse N1 |   GCC 12 | 16x16 |    1.56
Neoverse N1 |   GCC 12 | 32x32 |    1.52
Neoverse V1 |  LLVM 15 | 16x16 |    1.63
Neoverse V1 |  LLVM 15 | 32x32 |    1.08
Neoverse V1 |   GCC 12 | 16x16 |    1.59
Neoverse V1 |   GCC 12 | 32x32 |    1.37

Change-Id: If5ec220aba9dd19785454eabb0f3d6affec0cc8b
2023-03-29 08:39:35 +00:00
George Steed 4cf9819282 Avoid LD2/ST2 instructions in vpx_dc_predictor_32x32_neon
The LD2 and ST2 instructions are useful if we are dealing with
interleaved data (e.g. real/imag components of complex numbers), but for
simply loading or storing larger quantities of data it is preferable to
simply use two of the normal load/store instructions.

This patch replaces such occurrences in vpx_dc_predictor_32x32_neon and
related functions.

With Clang-15 this speeds up this function by 10-30% depending on the
micro-architecture being benchmarked on. With GCC-12 this speeds up the
function by 40-60% depending on the micro-architecture being benchmarked
on.

Change-Id: I670dc37908aa238f360104efd74d6c2108ecf945
2023-03-29 08:39:35 +00:00
Yunqing Wang 6d0e5e56ae Merge "Add AVX2 for convolve vertical filter for block width 4" into main 2023-03-28 22:14:51 +00:00
James Zern aba570ac95 Merge changes If83ff1ad,I8fb00a15,Iaad58e77,Iac166d60 into main
* changes:
  Randomize second half of above_row_ in intrapred tests for Neon
  Allow non-uniform above array in d63 predictor Neon impl
  Allow non-uniform above array in d45 predictor Neon impl
  Allow non-uniform above array in highbd d45 predictor Neon impl
2023-03-28 20:14:12 +00:00
James Zern 8e58d504fa Merge "update libwebm to libwebm-1.0.0.29-9-g1930e3c" into main 2023-03-28 18:36:01 +00:00
Jerome Jiang 972149cafe svc: Fix a case where target bandwidth is 0
Bug: webrtc:15033
Change-Id: I28636de66842671b03284408186c4c18254109a5
2023-03-28 11:26:54 -04:00
George Steed 100ca0356d Randomize second half of above_row_ in intrapred tests for Neon
The existing tests duplicate `above_row_[block_size - 1]` after the
first `block_size` elements, which can lead to tests incorrectly passing
due to differing behaviour when calculating the average for the last
elements of the output.

This change adjusts the above array setup to be fully random instead,
allowing us to catch such issues here rather than in other larger tests
like the external MD5 tests.

It doesn't appear that other architectures are fully clean with this
change so restrict it to just Neon for now until they are fixed.

Bug: webm:1797
Change-Id: If83ff1adbf1e8d30f2a92474d7186c65840a5d0b
2023-03-28 13:46:11 +00:00
George Steed 911d6e165e Allow non-uniform above array in d63 predictor Neon impl
The existing standard bitdepth implementation doesn't appear to manifest
as a failure in any of the predictor or MD5 tests, but it does rely on
the predictor tests filling the second `bs` elements of the `above`
input array with copies of `above[bs - 1]` in order to match the C
implementation.

This patch adjusts the Neon implementation to correctly match the C
implementation in the case where the elements of the `above` array all
differ.

The geomean of performance for the predictor is approximately a 2%
slowdown compared to the previous vectorized implementation. This is
still considerably faster than the unspecialized naive C implementation.

Bug: webm:1797
Change-Id: I8fb00a154288d54b24a72a7ff63c816bdcf3aca3
2023-03-28 13:27:22 +00:00
George Steed 3eb3781589 Allow non-uniform above array in d45 predictor Neon impl
The existing implementation doesn't appear to manifest as a failure in
any of the predictor or MD5 tests, but it does rely on the predictor
tests filling the second `bs` elements of the `above` input array with
copies of `above[bs - 1]` in order to match the C implementation.

This patch adjusts the Neon implementation to correctly match the C
implementation in the case where the elements of the `above` array all
differ.

Performance of the predictor is mostly unchanged, except for the 32x32
block size where it appears to have gotten about 40% faster when
compiled with clang-15.

Bug: webm:1797
Change-Id: Iaad58e77c5467307a3c80d6989b7cf2988e09311
2023-03-28 13:27:11 +00:00
George Steed 25825f6a78 Allow non-uniform above array in highbd d45 predictor Neon impl
The existing implementation doesn't appear to manifest as a failure in
any of the predictor or MD5 tests, but it does rely on the predictor
tests filling the second `bs` elements of the `above` input array with
copies of `above[bs - 1]` in order to match the C implementation.

This patch adjusts the Neon implementation to correctly match the C
implementation in the case where the elements of the `above` array all
differ.

Performance of the predictor is mostly unchanged, except for the 16x16
block size where it appears to have gotten marginally faster across most
compiler/micro-architecture combinations.

Bug: webm:1797
Change-Id: Iac166d6047316c0382e0f2790ce780fc99674b43
2023-03-28 08:29:01 +00:00
Anupam Pandey b4d154c948 Add AVX2 for convolve vertical filter for block width 4
Introduced AVX2 intrinsic to compute convolve vertical for
w = 4 case. This is a bit-exact change.

                 Instruction Count
cpu   Resolution   Reduction(%)
 0       LOWRES2      0.364
 0       MIDRES2      0.236
 0        HDRES2      0.162
 0       Average      0.254

Change-Id: I413f58aa6333a6f2421d4c10d49dec01e55b2098
2023-03-28 10:01:03 +05:30
James Zern 8f17482e82 vp9_rdopt,block_rd_txfm: fix clang-tidy warning
argument name 'recon' in comment does not match parameter name
'out_recon'.

https://clang.llvm.org/extra/clang-tidy/checks/bugprone/argument-comment.html

+ normalize similar calls, using /*var=*/NULL to better match the style
  guidelines

https://google.github.io/styleguide/cppguide.html#Function_Argument_Comments

Change-Id: I089591317f7138965735f737c1536a8b16fcd4e4
2023-03-27 16:20:22 -07:00
James Zern 66885a69ff svc_datarate_test: clear -Wshadow warning
rename class member from ref_frame_config to the correct style:
ref_frame_config_.

Bug: webm:1793
Change-Id: Idaf49de6d724014adee75f81efe974b2031241ba
2023-03-24 11:23:12 -07:00
James Zern 89765feb99 vp9_mcomp.c: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I6d7d96ffb3e388eac94d1d41563f7079a8297c85
2023-03-24 11:23:12 -07:00
James Zern 601904d1f7 vp9_rc_get_second_pass_params: clear -Wshadow warning
Bug: webm:1793
Change-Id: I0d64c9234b4bdcfb49a06566dc41df26f5862c1f
2023-03-24 11:23:12 -07:00
James Zern 5b05f6f3a0 Merge changes Ide512788,I77c7abae into main
* changes:
  vp9_scan.h: rename scan_order struct to ScanOrder
  vp9_encodeframe.c: clear -Wshadow warnings
2023-03-24 18:04:19 +00:00
James Zern bad39ce7a3 vp9_scan.h: rename scan_order struct to ScanOrder
This matches the style guide and fixes some -Wshadow warnings related to
variables with the same name. Something similar was done in libaom in:
03f6fdcfca Fix warnings reported by -Wshadow: Part1b: scan_order struct
           and variable

Bug: webm:1793
Change-Id: Ide5127886b7fd7778e6d8a983bfba6edda21ff28
2023-03-24 09:35:55 -07:00
James Zern 1701d55e33 vp9_encodeframe.c: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I77c7abae7bbb1e1f4972cd31e3a67d62477b896e
2023-03-23 19:21:27 -07:00
James Zern cda56fa019 update libwebm to libwebm-1.0.0.29-9-g1930e3c
changelog:
https://chromium.googlesource.com/webm/libwebm/+log/ee0bab576..1930e3ca2

Bug: webm:1792
Change-Id: I5c5c30c767d357528f102ff38957655e2ec0c645
2023-03-23 19:14:31 -07:00
Wan-Teh Chang 5817bce969 Fix comment typos (likely copy-and-paste errors)
Fix comment typos for vpx_codec_destroy() and vpx_codec_enc_init_ver().

Based on the change made in libaom:
https://aomedia.googlesource.com/aom/+/365a968684
365a968684 Fix comment typos (likely copy-and-paste errors)

Change-Id: I39edae835ed0752b569e8e7328d0709c59724ac2
2023-03-23 17:54:35 -07:00
James Zern 81250791dd Merge "Add Neon implementations of vpx_highbd_avg_<w>x<h>_c" into main 2023-03-23 21:40:13 +00:00
James Zern 5afeb89867 Merge "test.mk: use CONFIG_VP(8|9)_ENCODER for vp8/vp9-only tests" into main 2023-03-23 17:22:28 +00:00
James Zern 27424a8176 Merge "svc_encodeframe.c: fix -Wstringop-truncation" into main 2023-03-23 17:21:57 +00:00
Jerome Jiang ccb0597e4b Merge "Revert "Add codec control to get tpl stats"" into main 2023-03-22 20:48:44 +00:00
Jerome Jiang 78bb8e1c0a Revert "Add codec control to get tpl stats"
This reverts commit 9c15fb62b3.

Reason for revert:

vpxenc should only use public interface

Original change's description:
> Add codec control to get tpl stats
>
> Add command line flag to vpxenc to export tpl stats
>
> Bug: b/273736974
> Change-Id: I6980096531b0c12fbf7a307fdef4c562d0c29e32

Bug: b/273736974
Change-Id: Ifa8951bb34e5936bbfc33086b22e9fc36d379bc9
2023-03-22 20:18:39 +00:00
Wan-Teh Chang a0bf98de0d Merge "Change UpdateRateControl() to return bool" into main 2023-03-22 16:09:24 +00:00
Salome Thirot 5c7867beac Add Neon implementations of vpx_highbd_avg_<w>x<h>_c
Add Neon implementation of vpx_highbd_avg_4x4_c and vpx_highbd_avg_8x8_c
as well as the corresponding tests.

Change-Id: Ib1b06af5206774347690c9c56e194b76aa409c91
2023-03-22 10:50:17 +00:00
James Zern 882399bd54 Merge changes I8abac3c9,If678fc19 into main
* changes:
  vp9_bitstream.c: clear -Wshadow warnings
  vp9_setup_mask: clear -Wshadow warnings
2023-03-22 02:14:12 +00:00
James Zern 0a5f886a0c Merge changes I650b305c,If3e4cf37,I4c791e3a into main
* changes:
  sixtappredict_neon.c: remove redundant returns
  sixtappredict_neon.c,cosmetics: fix a typo
  vp8_sixtap_predict16x16_neon: fix overread
2023-03-21 20:20:51 +00:00
Jerome Jiang 9c643a5ef2 Merge "Add codec control to get tpl stats" into main 2023-03-21 18:34:34 +00:00
James Zern d3c9e39635 Merge "Reland "quantize: use scan_order instead of passing scan/iscan"" into main 2023-03-21 00:33:00 +00:00
James Zern 3b6909977c test.mk: use CONFIG_VP(8|9)_ENCODER for vp8/vp9-only tests
fixes some uninstantiated test failures when configured with
--disable-vp8 or --disable-vp9

Change-Id: If9a6705bd070edee02306e89da103ed474688ec8
2023-03-20 17:28:11 -07:00
James Zern 1c37aefcbd svc_encodeframe.c: fix -Wstringop-truncation
use sizeof(buf) - 1 with strncpy.

fixes:
examples/svc_encodeframe.c:282:3: warning: ‘strncpy’ specified bound
1024 equals destination size [-Wstringop-truncation]
  282 |   strncpy(si->options, options, sizeof(si->options));
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
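
The bounded-copy pattern described above can be sketched as follows. This is
a minimal illustration, not the code from the commit: the helper name
copy_option_string is hypothetical, and the explicit terminator write is an
assumption about how the destination is kept NUL-terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libvpx): copies src into a fixed-size
 * buffer while always leaving room for the terminating NUL. */
static void copy_option_string(char *dst, size_t dst_size, const char *src) {
  assert(dst != NULL && src != NULL && dst_size > 0);
  /* Bound the copy to dst_size - 1 so the last byte is never overwritten. */
  strncpy(dst, src, dst_size - 1);
  /* strncpy does not terminate the string on truncation, so do it here. */
  dst[dst_size - 1] = '\0';
}
```

With the bound equal to the full destination size, a source string of that
length or longer leaves the buffer unterminated, which is what the warning
flags.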

Change-Id: I46980872f9865ae1dc2b56330c3a65d8bc6cf1f7
2023-03-20 17:09:42 -07:00
James Zern 44250287fb sixtappredict_neon.c: remove redundant returns
Change-Id: I650b305c2599fc32353daba030e6241d330796a7
2023-03-20 16:58:28 -07:00
James Zern faa9142f5d sixtappredict_neon.c,cosmetics: fix a typo
Change-Id: If3e4cf372fc6ed076f0d42c435a72262494aab68
2023-03-20 16:56:58 -07:00
James Zern e4f0df53ec vp8_sixtap_predict16x16_neon: fix overread
Shift the final read from the source by 3 to avoid breaking the
assumption that the 6-tap filter needs only 5 pixels outside of the
macroblock; this matches the sse2 and ssse3 implementations.
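
The arithmetic behind the "5 pixels" assumption can be sketched as below.
The helper names are hypothetical, and the 8-byte load width is an
assumption chosen for illustration; the commit message does not state the
load size used by the Neon code.

```c
#include <assert.h>

/* Source pixels read per row by a `taps`-tap FIR filter that produces
 * `width` outputs. */
static int filter_row_footprint(int width, int taps) {
  assert(width > 0 && taps > 0);
  return width + taps - 1;
}

/* Bytes read past the footprint when it is covered by back-to-back loads
 * of `load_bytes` each. */
static int overread_bytes(int footprint, int load_bytes) {
  const int loads = (footprint + load_bytes - 1) / load_bytes;
  assert(load_bytes > 0);
  return loads * load_bytes - footprint;
}

/* Offset of a final load moved back so that it ends exactly at the end of
 * the footprint. */
static int shifted_final_load_offset(int footprint, int load_bytes) {
  assert(footprint >= load_bytes);
  return footprint - load_bytes;
}
```

For a 16-wide block and a 6-tap filter the footprint is 21 pixels, 5 more
than the block width. Under the 8-byte assumption, three back-to-back loads
would cover 24 bytes, 3 more than needed, whereas a final load starting at
offset 13 stays inside the footprint.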

It's possible this restriction could be removed if the source buffers
are assumed to be padded.

Bug: webm:1795
Change-Id: I4c791e3a214898a503c78f4cedca154c75cdbaef
Fixed: webm:1795
2023-03-20 16:51:51 -07:00
Yunqing Wang c0f11c7f6c Merge "Skip trellis coeff opt based on tx block properties" into main 2023-03-20 16:35:44 +00:00
Yunqing Wang 97aa7b2a4c Merge "Refactor logic of skipping trellis coeff opt" into main 2023-03-20 16:27:53 +00:00
Jerome Jiang 9c15fb62b3 Add codec control to get tpl stats
Add command line flag to vpxenc to export tpl stats

Bug: b/273736974
Change-Id: I6980096531b0c12fbf7a307fdef4c562d0c29e32
2023-03-20 12:02:38 -04:00
Deepa K G 55e102dc54 Skip trellis coeff opt based on tx block properties
The trellis coefficient optimization is skipped for blocks
with larger residual mse.

                 Instruction Count        BD-Rate Loss(%)
cpu   Resolution   Reduction(%)    avg.psnr   ovr.psnr    ssim
 0       LOWRES2      9.467        0.0921     0.1057    0.0362
 0       MIDRES2      4.328       -0.0155     0.0694    0.0178
 0        HDRES2      1.858        0.0231     0.0214   -0.0034
 0       Average      5.218        0.0332     0.0655    0.0169

STATS_CHANGED

Change-Id: I321a9b1a34ebb59b7b6a065b5b2d717c8767a4a5
2023-03-19 23:12:04 +05:30
Deepa K G 405ae85666 Refactor logic of skipping trellis coeff opt
The code to enable trellis coefficient optimization
is refactored using the sf 'trellis_opt_tx_rd'. This
change facilitates adaptive skipping of trellis
optimization based on block properties.

Change-Id: Ia1ff7cbbe5acf86414410f62655d46c099387847
2023-03-19 22:13:11 +05:30
James Zern 492f4c5538 vp9_bitstream.c: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I8abac3c901ad24b642b39ea6e6081d8ba626853d
2023-03-17 19:33:50 -07:00
James Zern cb5b047ad8 vp9_setup_mask: clear -Wshadow warnings
Bug: webm:1793
Change-Id: If678fc195ef87cc634d31fb7b24e0c844a5cb7b0
2023-03-17 19:24:20 -07:00
Johann f23f27bb80 Reland "quantize: use scan_order instead of passing scan/iscan"
This is a reland of commit 14fc40040f

Parent change fixed in crrev.com/c/webm/libvpx/+/4305500

Original change's description:
> quantize: use scan_order instead of passing scan/iscan
>
> further reduces the arguments for the 32x32. This will be applied to the base
> version as well.
>
> Change-Id: I25a162b5248b14af53d9e20c6a7fa2a77028a6d1

Change-Id: I2a7654558eaddd68bd09336bf317b297f18559d2
2023-03-18 06:39:45 +09:00
James Zern 6788c75055 Merge changes I5d9444a2,I1f127df9 into main
* changes:
  Add Neon implementation of vpx_highbd_minmax_8x8_c
  Add tests for vpx_highbd_minmax_8x8_c
2023-03-17 20:35:24 +00:00
James Zern d446ddd32d Merge "Reland "quantize: simplifly highbd 32x32_b args"" into main 2023-03-17 20:32:11 +00:00
Salome Thirot fff4e76b55 Add Neon implementation of vpx_highbd_minmax_8x8_c
Add Neon implementation of vpx_highbd_minmax_8x8_c as well as the
corresponding tests.

Change-Id: I5d9444a239fb1baa53634c1bdb5292b44067d90c
2023-03-17 18:40:41 +00:00
Salome Thirot c6da2329b9 Add tests for vpx_highbd_minmax_8x8_c
Write tests for vpx_highbd_minmax_8x8_c, and fix initial value of min in
vpx_highbd_minmax_8x8_c.
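
A scalar sketch of why the initial value of min matters for high bitdepth
input. The function name and the seed value 65535 are assumptions for
illustration, and the commit does not say what the previous initial value
was. The point is only that the seed must be at least as large as the
largest possible difference, so a seed sized for 8-bit data would not do.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar reference: min and max absolute difference over an
 * 8x8 block of 16-bit samples. */
static void highbd_minmax_8x8_sketch(const uint16_t *s, int s_stride,
                                     const uint16_t *d, int d_stride,
                                     int *min, int *max) {
  int i, j;
  assert(s != NULL && d != NULL && min != NULL && max != NULL);
  /* Seed min with the largest value a 16-bit difference can take. */
  *min = 65535;
  *max = 0;
  for (i = 0; i < 8; ++i, s += s_stride, d += d_stride) {
    for (j = 0; j < 8; ++j) {
      const int diff = abs(s[j] - d[j]);
      if (diff < *min) *min = diff;
      if (diff > *max) *max = diff;
    }
  }
}
```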

Change-Id: I1f127df945bbb8c7d373c5430ff5f94f28575968
2023-03-17 18:40:41 +00:00
Johann 02fd7d6aeb Reland "quantize: simplifly highbd 32x32_b args"
This is a reland of commit 573f5e662b

Alignment issue with tests fixed in crrev.com/c/webm/libvpx/+/4305500

Original change's description:
> quantize: simplify highbd 32x32_b args
>
> Change-Id: I431a41279c4c4193bc70cfe819da6ea7e1d2fba1

Change-Id: Ic868b6f987c99d88672858fedd092fa49c125e19
2023-03-17 12:52:15 +00:00
Wan-Teh Chang 430c6c1553 Change UpdateRateControl() to return bool
Change the VP9RateControlRtcConfig constructor to initialize
ss_number_layers (to 1).

Change UpdateRateControl() to return bool so that it can report failure
(due to invalid configuration).

Also change InitRateControl() to return bool to propagate the return
value of UpdateRateControl().

Note: This is a port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/172042.

Change-Id: I90b60353b5f15692dba5d89e7b1a9c81bb2fdd89
2023-03-16 19:55:41 -07:00
Wan-Teh Chang af63e31978 Merge "Set oxcf->ts_rate_decimator[tl] only once" into main 2023-03-17 02:54:21 +00:00
Wan-Teh Chang d92681b06f Set oxcf->ts_rate_decimator[tl] only once
The code that sets oxcf->ts_rate_decimator[tl] does not need to be
inside a loop that iterates over sl. Move the code out of the sl loop so
that oxcf->ts_rate_decimator[tl] is set only once.

Change-Id: I22f6c117d200ec38a757b749a8700660d15436c1
2023-03-16 18:36:13 -07:00
Wan-Teh Chang d6b6f85063 Remove repeated field from VP9RateControlRtcConfig
Remove the `ts_number_layers` field from VP9RateControlRtcConfig because
the base class VpxRateControlRtcConfig already has that field.

Note: In commit 65a1751e5b,
`ts_number_layers` was moved to the newly created base class
VpxRateControlRtcConfig but was inadvertently left in
VP9RateControlRtcConfig:
https://chromium-review.googlesource.com/c/webm/libvpx/+/3140048

Change-Id: I98d48e152683ec2e5e62efffb56b7f010c5d0695
2023-03-16 15:32:02 -07:00
Wan-Teh Chang 4265e364ff Merge "Update the sample code for VP9RateControlRTC" into main 2023-03-16 21:40:14 +00:00
Yunqing Wang 5ca4953569 Merge "Add AVX2 for convolve horizontal filter for block width 4" into main 2023-03-16 20:44:11 +00:00
Wan-Teh Chang d67a0021e7 Update the sample code for VP9RateControlRTC
Update the sample code to the current VP9RateControlRTC interface.

Change-Id: I30b0712c897f93fd62ebce51ce39afce3cac1fd7
2023-03-16 13:37:56 -07:00
Anupam Pandey 5c2cd048a0 Add AVX2 for convolve horizontal filter for block width 4
Introduced AVX2 intrinsic to compute convolve horizontal for
w = 4 case. This is a bit-exact change.

                 Instruction Count
cpu   Resolution   Reduction(%)
 0       LOWRES2      0.763
 0       MIDRES2      0.466
 0        HDRES2      0.317
 0       Average      0.516

Change-Id: I124f3f8e994c24461812f4963b113819466db44f
2023-03-16 08:48:45 +05:30
Salome Thirot 362c69cfe5 Optimize vpx_minmax_8x8_neon for aarch64
Optimize vpx_minmax_8x8_neon on AArch64 targets by using the UMAXV and
UMINV instructions - computing the maximum and minimum elements in a
Neon vector.

Change-Id: I54c3a3a087d266f6774e6113e5947253df288a64
2023-03-14 22:43:04 +00:00
James Zern bbd6bc85a3 Merge "Add Neon implementation of vpx_highbd_satd_c" into main 2023-03-14 19:38:04 +00:00
James Zern f4400abb25 Merge "Optimize vpx_satd_neon" into main 2023-03-14 19:32:32 +00:00
James Zern bfe0ec066b Merge "Add Neon implementation of vp9_highbd_block_error_c" into main 2023-03-14 19:31:02 +00:00
Salome Thirot be84aa14dc Add Neon implementation of vpx_highbd_satd_c
Add Neon implementation of vpx_highbd_satd_c as well as the
corresponding tests.

Change-Id: I3d50e6abdf168fb13743e7d8da9364f072308b7f
2023-03-14 09:32:42 +00:00
Salome Thirot f7dbd848e4 Optimize vpx_satd_neon
Optimize Neon implementation of vpx_satd by using ABD and UADALP instead
of ABAL and ABAL2, splitting the accumulator and using a dedicated
helper function to perform the final reduction.

Change-Id: Idcfa49e001b68b1dcd87c13fd9acc317a208cd2a
2023-03-14 09:24:39 +00:00
Salome Thirot e553e3acff Add Neon implementation of vp9_highbd_block_error_c
Add Neon implementation of vp9_highbd_block_error_c as well as the
corresponding tests.

Change-Id: Ibe0eb077f959ced0dcd7d0d8d9d529d3b5bc1874
2023-03-14 09:11:43 +00:00
Konstantinos Margaritis 29beea8243 [NEON] Add temporal filter functions, 8-bit and highbd
Both are around 3x faster than the original C version. 8-bit gives a
small 0.5% speed increase, whereas highbd gives ~2.5%.

Change-Id: I71d75ddd2757b19aa201e879fd9fa8f3a25431ad
2023-03-14 08:22:40 +00:00
James Zern d32a410880 Merge "Fix buffer overrun in highbd Neon subpel variance filters" into main 2023-03-14 00:22:31 +00:00
Matt Oliver c21151bf84 project: Update for 1.13.0 merge. 2023-03-12 03:00:05 +11:00
Matt Oliver cdedb9afd4 Merge commit 'd6eb9696aa72473c1a11d34d928d35a3acc0c9a9' 2023-03-12 01:59:24 +11:00
James Zern f40c89459f Merge "reland: quantize: simplify 32x32_b args" into main 2023-03-10 21:40:59 +00:00
Yunqing Wang d40a8608cc Merge "Add AVX2 for vpx_filter_block1d8_v8() function" into main 2023-03-10 01:02:25 +00:00
Anupam Pandey 775d594e46 Add AVX2 for vpx_filter_block1d8_v8() function
Introduced AVX2 intrinsic to compute convolve vertical for
w = 8 case. This is a bit-exact change.

                 Instruction Count
cpu   Resolution   Reduction(%)
 0       LOWRES2      1.347
 0       MIDRES2      1.046
 0        HDRES2      0.805
 0       Average      1.066

Change-Id: Idf77fff054beaf2c985b9bf2335591bda47e811f
2023-03-09 16:50:40 +05:30
Neeraj Gadgil 4959770032 Rename function 'model_rd_for_sb_earlyterm'
Rename the function to 'build_inter_pred_model_rd_earlyterm' and
add a comment to explain its behavior.

Change-Id: I804e6273558ba36241232f62cf18ea754b85e369
2023-03-09 15:22:51 +05:30
Jonathan Wright eab52a4f3c Fix buffer overrun in highbd Neon subpel variance filters
The high bitdepth Neon code applying the first pass of the bilinear
filter for subpixel variance on blocks of width 4 processed two rows
at a time. This resulted in a source buffer overread, attempting to
produce two rows of padding for the second (vertical) pass of the
bilinear filter.

This patch modifies highbd_var_filter_block2d_bil_w4 and
highbd_avg_pred_var_filter_block2d_bil_w4 such that they only process
a single row per iteration, and only require a single row of padding
for the second pass. This prevents the buffer overread.

Since all block sizes are now processed one row at a time, there is
no need for a "padding" macro parameter - the value is always 1, with
no special case for 4xh blocks. As well as re-enabling the Neon paths
and their associated tests, we remove the now-redundant 'padding'
macro parameter.

Bug: webm:1796
Change-Id: Icd6076b38eb4476139795bb1734ca800c9edf079
2023-03-08 23:40:14 +00:00
James Zern 79b1347a51 Merge "disable vpx_highbd_*_sub_pixel_avg_variance4x{4,8}_neon" into main 2023-03-08 23:05:08 +00:00
James Zern 7a47294675 Merge "Optimize vpx_sum_squares_2d_i16_neon" into main 2023-03-08 21:54:30 +00:00
James Zern a47967700d disable vpx_highbd_*_sub_pixel_avg_variance4x{4,8}_neon
vpx_highbd_8_sub_pixel_avg_variance4x4_neon
vpx_highbd_8_sub_pixel_avg_variance4x8_neon
vpx_highbd_10_sub_pixel_avg_variance4x4_neon
vpx_highbd_10_sub_pixel_avg_variance4x8_neon
vpx_highbd_12_sub_pixel_avg_variance4x4_neon
vpx_highbd_12_sub_pixel_avg_variance4x8_neon

all cause heap overflows of the form:

[ RUN      ] NEON/VpxHBDSubpelAvgVarianceTest.Ref/33
=================================================================
==535205==ERROR: AddressSanitizer: heap-buffer-overflow on address
0xffff95bb0b89 at pc 0x00000116dabc bp 0xffffd09f6430 sp 0xffffd09f6428
READ of size 8 at 0xffff95bb0b89 thread T0
    #0 0x116dab8 in load_unaligned_u16q vpx_dsp/arm/mem_neon.h:176:3
    #1 0x116dab8 in highbd_var_filter_block2d_bil_w4
       vpx_dsp/arm/highbd_subpel_variance_neon.c:49:21
    #2 0x116dab8 in vpx_highbd_8_sub_pixel_avg_variance4x4_neon
       vpx_dsp/arm/highbd_subpel_variance_neon.c:543:1
    ...

0xffff95bb0b89 is located 0 bytes to the right of 73-byte region
[0xffff95bb0b40,0xffff95bb0b89)
allocated by thread T0 here:
    #0 0x5f18b0 in malloc (test_libvpx+0x5f18b0)
    #1 0xce4a40 in vpx_memalign vpx_mem/vpx_mem.c:62:10
    #2 0xce4a40 in vpx_malloc vpx_mem/vpx_mem.c:70:40
    #3 0xa52238 in (anonymous namespace)::SubpelVarianceTest<unsigned
       int (*)(unsigned char const*, int, int, int, unsigned char
               const*, int, unsigned int*, unsigned char
               const*)>::SetUp()
       test/variance_test.cc:586:14
    ...

This is the same issue as:
  e33d4c276 disable vpx_highbd_*_sub_pixel_variance4x{4,8}_neon
They have highbd_var_filter_block2d_bil_w4 in common.

Bug: webm:1796
Change-Id: I3ed70d0ba22e127720542612ea9f6665948eedfc
2023-03-08 13:17:17 -08:00
James Zern e33d4c276d disable vpx_highbd_*_sub_pixel_variance4x{4,8}_neon
vpx_highbd_8_sub_pixel_variance4x4_neon
vpx_highbd_8_sub_pixel_variance4x8_neon
vpx_highbd_10_sub_pixel_variance4x4_neon
vpx_highbd_10_sub_pixel_variance4x8_neon
vpx_highbd_12_sub_pixel_variance4x4_neon
vpx_highbd_12_sub_pixel_variance4x8_neon

all cause heap overflows of the form:

[ RUN      ] NEON/VpxHBDSubpelVarianceTest.Ref/24
=================================================================
==450528==ERROR: AddressSanitizer: heap-buffer-overflow on address
0xffff8311a571 at pc 0x0000010ca52c bp 0xffffc63e96b0 sp 0xffffc63e96a8
READ of size 8 at 0xffff8311a571 thread T0
    #0 0x10ca528 in load_unaligned_u16q vpx_dsp/arm/mem_neon.h:176:3
    #1 0x10ca528 in highbd_var_filter_block2d_bil_w4
       vpx_dsp/arm/highbd_subpel_variance_neon.c:49:21
    #2 0x10ca528 in vpx_highbd_10_sub_pixel_variance4x8_neon
       vpx_dsp/arm/highbd_subpel_variance_neon.c:257:1
    ...

0xffff8311a571 is located 0 bytes to the right of 113-byte region
[0xffff8311a500,0xffff8311a571)
allocated by thread T0 here:
    #0 0x5f18b0 in malloc (test_libvpx+0x5f18b0)
    #1 0xce4f90 in vpx_memalign vpx_mem/vpx_mem.c:62:10
    #2 0xce4f90 in vpx_malloc vpx_mem/vpx_mem.c:70:40
    #3 0xa4ad44 in (anonymous namespace)::SubpelVarianceTest<unsigned
       int (*)(unsigned char const*, int, int, int, unsigned char
       const*, int, unsigned int*)>::SetUp() test/variance_test.cc:586:14

Bug: webm:1796
Change-Id: I39f7f936bae2bcbbe1f803fb10375ec02d1c1277
2023-03-07 22:16:56 -08:00
James Zern 0f17aa986a Merge "[SSE4_1] Fix overflow in highbd temporal_filter" into main 2023-03-07 23:40:10 +00:00
James Zern ccdcba6dc9 Merge changes I79247b5a,Ic6016cf8,Ibab7ec5f into main
* changes:
  Add Neon implementation of vp9_block_error_c
  Fix return type of horizontal_add_int64x2 helper
  Optimize vp9_block_error_fp_neon
2023-03-07 23:00:19 +00:00
James Zern 13bd85f687 Merge changes Ic021e82e,I2bce6f19,I250ab56e,I910692b1,Iefaa774d into main
* changes:
  Implement highbd_d207_predictor using Neon
  Implement highbd_d153_predictor using Neon
  Implement d207_predictor using Neon
  Implement d153_predictor using Neon
  Implement highbd_d63_predictor using Neon
2023-03-07 22:48:54 +00:00
Yunqing Wang 8874873bef Merge "Add AVX2 for vpx_filter_block1d8_h8() function" into main 2023-03-07 16:40:52 +00:00
Yunqing Wang ba3d606630 Merge "Use cb pattern for interp eval when filter is not switchable" into main 2023-03-07 16:37:22 +00:00
Yunqing Wang f138a4004d Merge "Early terminate interp filt search based on best RD cost" into main 2023-03-07 16:35:18 +00:00
Anupam Pandey b7fabadc5d Add AVX2 for vpx_filter_block1d8_h8() function
Introduced AVX2 intrinsic to compute convolve horizontal for
w = 8 case. This is a bit-exact change.

                 Instruction Count
cpu   Resolution   Reduction(%)
 0       LOWRES2      1.509
 0       MIDRES2      1.165
 0        HDRES2      0.898
 0       Average      1.191

Change-Id: I699c94aa3d7ea74c58f901df906eed0b81b4ee79
2023-03-07 18:20:30 +05:30
Salome Thirot eec4808393 Add Neon implementation of vp9_block_error_c
Add Neon implementation of vp9_block_error_c as well as the
corresponding tests.

Change-Id: I79247b5ae24f51b7b55fc5e517d5e403dc86367a
2023-03-07 12:04:25 +00:00
Salome Thirot 57c6ea9752 Fix return type of horizontal_add_int64x2 helper
horizontal_add_int64x2 was incorrectly returning a uint64_t instead of
an int64_t. This patch fixes that.
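
A scalar sketch of why the return type matters. The struct stands in for a
two-lane signed 64-bit vector and all names are hypothetical. The commit
does not describe any observed symptom, so the floating-point caller below
is only one assumed way the difference could show up.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar stand-in for a two-lane signed 64-bit vector (hypothetical). */
typedef struct {
  int64_t lane[2];
} int64x2_sketch_t;

/* Variant with the wrong return type: a negative sum comes back as a very
 * large unsigned value. */
static uint64_t horizontal_add_unsigned_return(int64x2_sketch_t v) {
  return (uint64_t)v.lane[0] + (uint64_t)v.lane[1];
}

/* Variant whose return type matches the signed lanes. */
static int64_t horizontal_add_signed_return(int64x2_sketch_t v) {
  return v.lane[0] + v.lane[1];
}

/* Hypothetical callers that convert the sum to floating point, where the
 * signedness of the return type changes the result. */
static double halved_sum_unsigned(int64x2_sketch_t v) {
  return (double)horizontal_add_unsigned_return(v) / 2.0;
}

static double halved_sum_signed(int64x2_sketch_t v) {
  return (double)horizontal_add_signed_return(v) / 2.0;
}
```

As long as the result is assigned straight back to an int64_t the two
variants produce the same bits on common targets; the difference appears
once the value is compared, widened or converted before that happens.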

Change-Id: Ic6016cf87aebfc6a14f540b784d6648757e12b49
2023-03-07 11:34:05 +00:00
Salome Thirot 5ae84ea5ae Optimize vp9_block_error_fp_neon
Currently vp9_block_error_fp_neon is only used when
CONFIG_VP9_HIGHBITDEPTH is set to false. This patch optimizes the
implementation and uses tran_low_t instead of int16_t so that the
function can also be used in builds where vp9_highbitdepth is enabled.

Change-Id: Ibab7ec5f74b7652fa2ae5edf328f9ec587088fd3
2023-03-07 11:29:31 +00:00
Neeraj Gadgil b9933679bf Use cb pattern for interp eval when filter is not switchable
This CL uses a checkerboard pattern for interp filter eval when
the filter is not switchable.

                 Instruction Count        BD-Rate Loss(%)
cpu   Resolution   Reduction(%)    avg.psnr   ovr.psnr    ssim
 0       LOWRES2      0.725         0.0017    -0.0000    0.0192
 0       MIDRES2      0.968         0.0004     0.0504    0.0810
 0        HDRES2      1.135         0.0089     0.0130    0.0113
 0       Average      0.943         0.0037     0.0211    0.0372

STATS_CHANGED

Change-Id: Ia713e5170101302f264ffaa2350bc0ab15c27090
2023-03-07 12:39:10 +05:30
Neeraj Gadgil f2210fd290 Early terminate interp filt search based on best RD cost
The CL prunes interpolation filter search based on rdcost of
individual planes.

                 Instruction Count        BD-Rate Loss(%)
cpu   Resolution   Reduction(%)    avg.psnr   ovr.psnr    ssim
 0       LOWRES2      1.613         0.0143     0.0208    0.0146
 0       MIDRES2      1.637         0.0214    -0.0316    0.0036
 0        HDRES2      1.369         0.0171     0.0178    0.1222
 0       Average      1.539         0.0176     0.0023    0.0468

STATS_CHANGED

Change-Id: I4be30bd1c7bbbc93c6bbc840565893a97d2598a4
2023-03-07 12:36:00 +05:30
James Zern 2d7d2fcf7b Merge "Fix heap buffer overrun in vpx_get4x4sse_cs_neon" into main 2023-03-07 05:45:53 +00:00
James Zern 925b8156d9 Merge changes I05dc4d43,Ia0977ff0 into main
* changes:
  Fix potential buffer over-read in highbd d117 predictor Neon
  Implement d117_predictor using Neon
2023-03-07 01:28:15 +00:00
Jonathan Wright 5a2bb12c52 Fix heap buffer overrun in vpx_get4x4sse_cs_neon
Use a mem_neon.h helper to do strided 4-byte loads instead of Neon
8-byte loads - where the last 4 bytes are out of bounds.

Re-enable the Neon code path and the tests.

Bug: webm:1794
Change-Id: I69ccff730f4a5cbf585dd6a9aa0f3eb13e150074
2023-03-07 00:05:10 +00:00
James Zern d94e16404a vpx_convolve_copy_neon: fix unaligned loads w/w==4
Fixes a -fsanitize=undefined warning:

vpx_dsp/arm/vpx_convolve_copy_neon.c:29:26: runtime error: load of
misaligned address 0xffffa8242bea for type 'const uint32_t' (aka 'const
unsigned int'), which requires 4 byte alignment
0xffffa8242bea: note: pointer points here
 88 81  7d 7d 7d 7d 7d 81 81 7d  81 80 87 97 a8 ab a0 91 ...
              ^
    #0 0xb0447c in vpx_convolve_copy_neon
       vpx_dsp/arm/vpx_convolve_copy_neon.c:29:26
    #1 0x12285c8 in inter_predictor vp9/common/vp9_reconinter.h:29:3
    #2 0x1228430 in dec_build_inter_predictors
       vp9/decoder/vp9_decodeframe.c
    ...
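
A common portable way to avoid this class of warning is to go through
memcpy instead of dereferencing a cast pointer. The sketch below shows that
idea with hypothetical helper names; the commit message does not say which
approach the actual fix takes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads 4 bytes from a possibly misaligned address. */
static uint32_t load_u32_unaligned(const uint8_t *src) {
  uint32_t value;
  assert(src != NULL);
  /* memcpy has no alignment requirement on its arguments. */
  memcpy(&value, src, sizeof(value));
  return value;
}

/* Hypothetical counterpart for stores. */
static void store_u32_unaligned(uint8_t *dst, uint32_t value) {
  assert(dst != NULL);
  memcpy(dst, &value, sizeof(value));
}
```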

Change-Id: Iaec4ac2a400b6e6db72d12e5a7acb316262b12a7
2023-03-06 15:19:31 -08:00
Jonathan Wright 6b783c6975 Optimize vpx_sum_squares_2d_i16_neon
Add an additional 32-bit vector accumulator to allow parallel
processing on CPUs that have more than one Neon multiply-accumulate
pipeline. Also use sum_neon.h horizontal-add helpers for reduction.
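
The split-accumulator idea can be sketched in scalar C as below. This is an
illustration only: the function name is hypothetical, the real code operates
on Neon vectors, and 64-bit scalar accumulators are used here instead of
32-bit vector ones to keep the sketch free of overflow concerns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar sketch: sum of squares with two accumulators. */
static uint64_t sum_squares_i16_two_acc(const int16_t *src, size_t n) {
  uint64_t acc0 = 0, acc1 = 0;
  size_t i = 0;
  assert(src != NULL || n == 0);
  /* Two independent dependency chains: the update of acc1 does not have
   * to wait for the update of acc0 to complete. */
  for (; i + 1 < n; i += 2) {
    acc0 += (uint64_t)((int32_t)src[i] * src[i]);
    acc1 += (uint64_t)((int32_t)src[i + 1] * src[i + 1]);
  }
  /* Handle a trailing element when n is odd. */
  if (i < n) acc0 += (uint64_t)((int32_t)src[i] * src[i]);
  /* Final reduction: combine the partial sums once, at the end. */
  return acc0 + acc1;
}
```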

Change-Id: Ibcb48a738f5dee1430c3ebcd305b5ea8ea344c40
2023-03-06 18:34:23 +00:00
George Steed 9e35c35945 Implement highbd_d207_predictor using Neon
Add Neon implementations of the highbd d207 predictor for 4x4, 8x8,
16x16 and 32x32 block sizes. Also update tests to add new corresponding
cases.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    1.61
Neoverse N1 |  LLVM 15 |   8x8 |    5.30
Neoverse N1 |  LLVM 15 | 16x16 |    8.93
Neoverse N1 |  LLVM 15 | 32x32 |    8.35
Neoverse N1 |   GCC 12 |   4x4 |    2.16
Neoverse N1 |   GCC 12 |   8x8 |    5.75
Neoverse N1 |   GCC 12 | 16x16 |    7.28
Neoverse N1 |   GCC 12 | 32x32 |    3.31
Neoverse V1 |  LLVM 15 |   4x4 |    1.71
Neoverse V1 |  LLVM 15 |   8x8 |    7.46
Neoverse V1 |  LLVM 15 | 16x16 |   10.09
Neoverse V1 |  LLVM 15 | 32x32 |    8.10
Neoverse V1 |   GCC 12 |   4x4 |    1.99
Neoverse V1 |   GCC 12 |   8x8 |    7.81
Neoverse V1 |   GCC 12 | 16x16 |    8.34
Neoverse V1 |   GCC 12 | 32x32 |    5.74

Change-Id: Ic021e82eed0c7bc8263eb68606411354eb5e4870
2023-03-06 13:35:45 +00:00
George Steed cf85ae9a49 Implement highbd_d153_predictor using Neon
Add Neon implementations of the highbd d153 predictor for 4x4, 8x8,
16x16 and 32x32 block sizes. Also update tests to add new corresponding
cases.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    1.71
Neoverse N1 |  LLVM 15 |   8x8 |    4.05
Neoverse N1 |  LLVM 15 | 16x16 |    7.04
Neoverse N1 |  LLVM 15 | 32x32 |    7.71
Neoverse N1 |   GCC 12 |   4x4 |    1.84
Neoverse N1 |   GCC 12 |   8x8 |    4.19
Neoverse N1 |   GCC 12 | 16x16 |    6.07
Neoverse N1 |   GCC 12 | 32x32 |    3.14
Neoverse V1 |  LLVM 15 |   4x4 |    3.19
Neoverse V1 |  LLVM 15 |   8x8 |    5.51
Neoverse V1 |  LLVM 15 | 16x16 |    7.73
Neoverse V1 |  LLVM 15 | 32x32 |    7.72
Neoverse V1 |   GCC 12 |   4x4 |    3.97
Neoverse V1 |   GCC 12 |   8x8 |    5.52
Neoverse V1 |   GCC 12 | 16x16 |    6.31
Neoverse V1 |   GCC 12 | 32x32 |    5.36

Change-Id: I2bce6f1921d76d1c10d163e0cd4f395b40799184
2023-03-06 13:35:27 +00:00
George Steed 33f3ae3414 Fix potential buffer over-read in highbd d117 predictor Neon
The load of `left[bs]` in the standard bitdepth d117 Neon implementation
triggered an address-sanitizer failure.

The highbd equivalent does not appear to trigger any asan failures when
running the VP9/ExternalFrameBufferMD5Test or
VP9/TestVectorTest.MD5Match tests, but for consistency with the standard
bitdepth implementation we adjust it to avoid the over-read.

Performance is roughly identical, with a 0.8% performance improvement on
average over the previous optimised code.

Change-Id: I05dc4d43f244f4915c0ccc52cc0af999bbacb018
2023-03-06 13:34:35 +00:00
George Steed 872476c66b Implement d207_predictor using Neon
Add Neon implementations of the d207 predictor for 4x4, 8x8, 16x16 and
32x32 block sizes. Also update tests to add new corresponding cases.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    1.72
Neoverse N1 |  LLVM 15 |   8x8 |    5.68
Neoverse N1 |  LLVM 15 | 16x16 |   12.30
Neoverse N1 |  LLVM 15 | 32x32 |   16.70
Neoverse N1 |   GCC 12 |   4x4 |    1.71
Neoverse N1 |   GCC 12 |   8x8 |    6.01
Neoverse N1 |   GCC 12 | 16x16 |   12.40
Neoverse N1 |   GCC 12 | 32x32 |    6.71
Neoverse V1 |  LLVM 15 |   4x4 |    1.99
Neoverse V1 |  LLVM 15 |   8x8 |    8.28
Neoverse V1 |  LLVM 15 | 16x16 |   14.36
Neoverse V1 |  LLVM 15 | 32x32 |   17.55
Neoverse V1 |   GCC 12 |   4x4 |    1.99
Neoverse V1 |   GCC 12 |   8x8 |    8.43
Neoverse V1 |   GCC 12 | 16x16 |   14.41
Neoverse V1 |   GCC 12 | 32x32 |    7.82

Change-Id: I250ab56edab3390b0bac9dc96995a4bf9a4da641
2023-03-06 13:34:35 +00:00
George Steed 7e88600bf9 Implement d117_predictor using Neon
Add Neon implementations of the d117 predictor for 4x4, 8x8, 16x16 and
32x32 block sizes. Also update tests to add new corresponding cases.

This re-lands commit 360e9069b6,
previously reverted in commit 394de691a0.

The implementation is mostly identical to the original but with an
adjustment to how data is loaded from the `left` array. In particular
the left array cannot be guaranteed to be larger than the block size, so
the read of e.g. `left[32]` in the `bs=32` case is not valid. This turns
out not to be a problem since the last lane loaded in this case is
unused. I have added comments in the code to explain why this is the
case.

Since we cannot load the last element directly, we instead construct it
from the previous aligned read. This seems to have an inconsistent
effect on performance, improving by up to 10% in some cases and
regressing by up to 10% in others. Either way it is still significantly
faster than the original C code.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    1.88
Neoverse N1 |  LLVM 15 |   8x8 |    5.19
Neoverse N1 |  LLVM 15 | 16x16 |    9.63
Neoverse N1 |  LLVM 15 | 32x32 |   13.85
Neoverse N1 |   GCC 12 |   4x4 |    2.04
Neoverse N1 |   GCC 12 |   8x8 |    4.62
Neoverse N1 |   GCC 12 | 16x16 |    9.79
Neoverse N1 |   GCC 12 | 32x32 |    4.69
Neoverse V1 |  LLVM 15 |   4x4 |    1.75
Neoverse V1 |  LLVM 15 |   8x8 |    6.71
Neoverse V1 |  LLVM 15 | 16x16 |    9.62
Neoverse V1 |  LLVM 15 | 32x32 |   13.81
Neoverse V1 |   GCC 12 |   4x4 |    1.75
Neoverse V1 |   GCC 12 |   8x8 |    6.01
Neoverse V1 |   GCC 12 | 16x16 |    6.91
Neoverse V1 |   GCC 12 | 32x32 |    4.39

Change-Id: Ia0977ff0b0eba2c41c7884b64e7c22ff9bc9549d
2023-03-06 13:34:35 +00:00
George Steed 8b0a60f91c Implement d153_predictor using Neon
Add Neon implementations of the d153 predictor for 4x4, 8x8, 16x16 and
32x32 block sizes. Also update tests to add new corresponding cases.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    1.59
Neoverse N1 |  LLVM 15 |   8x8 |    4.46
Neoverse N1 |  LLVM 15 | 16x16 |    8.77
Neoverse N1 |  LLVM 15 | 32x32 |   15.21
Neoverse N1 |   GCC 12 |   4x4 |    1.90
Neoverse N1 |   GCC 12 |   8x8 |    4.70
Neoverse N1 |   GCC 12 | 16x16 |    9.55
Neoverse N1 |   GCC 12 | 32x32 |    5.95
Neoverse V1 |  LLVM 15 |   4x4 |    2.89
Neoverse V1 |  LLVM 15 |   8x8 |    6.94
Neoverse V1 |  LLVM 15 | 16x16 |   10.20
Neoverse V1 |  LLVM 15 | 32x32 |   15.63
Neoverse V1 |   GCC 12 |   4x4 |    4.45
Neoverse V1 |   GCC 12 |   8x8 |    7.71
Neoverse V1 |   GCC 12 | 16x16 |    9.08
Neoverse V1 |   GCC 12 | 32x32 |    7.93

Change-Id: I910692b14917cde8a8952fab5b9c78bed7f7c6ad
2023-03-06 13:34:35 +00:00
George Steed 6282757546 Implement highbd_d63_predictor using Neon
Add Neon implementations of the highbd d63 predictor for 4x4, 8x8, 16x16
and 32x32 block sizes. Also update tests to add new corresponding cases.

This re-lands commit 7cdf139e3d,
previously reverted in 7478b7e4e4.

Compared to the previous implementation attempt we now correctly match
the behaviour of the C code when handling the final element loaded from
the 'above' input array. In particular:

- The C code for a 4x4 block performs a full average of the last element
  rather than duplicating the final element from the input 'above'
  array.

- The C code for other block sizes performs a full average for
  stride=0 and stride=1, and otherwise shifts in duplicates of the final
  element from the input 'above' array. Notably this shifting for later
  strides _replaces_ the final element which we previously performed an
  average on (see {d0,d1}_ext in the code).

It is worth noting that this difference is not caught by the existing
VP9HighbdIntraPredTest test cases since the test vector initialisation
contains this loop:

    for (int x = block_size; x < 2 * block_size; x++) {
        above_row_[x] = above_row_[block_size - 1];
    }

Since AVG2(a, a) and AVG3(a, a, a) are simply 'a', such differences in
behaviour for the final element are not observed.
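
A minimal sketch of why this padding hides the difference, assuming the
usual rounding definitions of AVG2 and AVG3. The helper names
pad_above_row and boundary_is_masked are hypothetical and exist only for
this illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Rounding averages, assumed to match the C predictor code. */
#define AVG2(a, b) (((a) + (b) + 1) >> 1)
#define AVG3(a, b, c) (((a) + 2 * (b) + (c) + 2) >> 2)

/* Hypothetical helper: pads above[bs..2*bs-1] the same way the test
 * vector initialisation does, by repeating the last in-block element. */
static void pad_above_row(uint16_t *above, int bs) {
  assert(bs > 0);
  for (int x = bs; x < 2 * bs; x++) {
    above[x] = above[bs - 1];
  }
}

/* Returns 1 when averaging across the block boundary gives the same
 * value as simply duplicating the final in-block element. */
static int boundary_is_masked(const uint16_t *above, int bs) {
  const uint16_t last = above[bs - 1];
  return AVG2(above[bs - 1], above[bs]) == last &&
         AVG3(above[bs - 1], above[bs], above[bs + 1]) == last;
}
```

With a padded row both strategies agree, so the test cannot tell them
apart. With distinct values past the block boundary they differ.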

Tested on AArch64 with:

- ./test_libvpx --gtest_filter="*VP9HighbdIntraPredTest*"
- ./test_libvpx --gtest_filter="*VP9/TestVectorTest.MD5Match*"
- ./test_libvpx --gtest_filter="*VP9/ExternalFrameBufferMD5Test*"

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    2.43
Neoverse N1 |  LLVM 15 |   8x8 |    3.92
Neoverse N1 |  LLVM 15 | 16x16 |    3.19
Neoverse N1 |  LLVM 15 | 32x32 |    4.13
Neoverse N1 |   GCC 12 |   4x4 |    2.92
Neoverse N1 |   GCC 12 |   8x8 |    6.51
Neoverse N1 |   GCC 12 | 16x16 |    4.55
Neoverse N1 |   GCC 12 | 32x32 |    3.18
Neoverse V1 |  LLVM 15 |   4x4 |    1.99
Neoverse V1 |  LLVM 15 |   8x8 |    3.65
Neoverse V1 |  LLVM 15 | 16x16 |    3.72
Neoverse V1 |  LLVM 15 | 32x32 |    3.26
Neoverse V1 |   GCC 12 |   4x4 |    2.39
Neoverse V1 |   GCC 12 |   8x8 |    4.76
Neoverse V1 |   GCC 12 | 16x16 |    3.24
Neoverse V1 |   GCC 12 | 32x32 |    2.44

Change-Id: Iefaa774d6a20388b523eaa7f5df6bc5f5cf249e4
2023-03-06 13:34:35 +00:00
Johann 0384a2aab7 reland: quantize: simplify 32x32_b args
Allocate mb_plane_ on the heap to ensure src is aligned.

Now that all the implementations of the 32x32 quantize are in
intrinsics we can reference struct members directly. Saves
pushing them to the stack.

n_coeffs is not used at all for this function.

Change-Id: Ib551f7f583977602504d962b72063bc6eda9dda9
2023-03-06 09:16:04 +09:00
James Zern 5fae248f2a disable vp8_sixtap_predict16x16_neon
This causes various buffer overflows in the tests:

[ RUN      ] NEON/SixtapPredictTest.TestWithPresetData/0
=================================================================
==22346==ERROR: AddressSanitizer: global-buffer-overflow on address
0x0000012b4a5b at pc 0x000000df0f60 bp 0xffffcf6e64b0 sp 0xffffcf6e64a8
READ of size 8 at 0x0000012b4a5b thread T0
    #0 0xdf0f5c in vp8_sixtap_predict16x16_neon
       vp8/common/arm/neon/sixtappredict_neon.c:1507:13
    #1 0x8819e4 in (anonymous
        namespace)::SixtapPredictTest_TestWithPresetData_Test::TestBody()
       test/predict_test.cc:293:3
    ...

0x0000012b4a5b is located 2 bytes to the right of global variable
'kTestData' defined in '../test/predict_test.cc:237:24' (0x12b48a0) of
size 441

[ RUN      ] NEON/SixtapPredictTest.TestWithRandomData/0
=================================================================
==22338==ERROR: AddressSanitizer: heap-buffer-overflow on address
0xffff8b5321fb at pc 0x000000df0f60 bp 0xfffff7e0cf30 sp 0xfffff7e0cf28
READ of size 8 at 0xffff8b5321fb thread T0
    #0 0xdf0f5c in vp8_sixtap_predict16x16_neon
       vp8/common/arm/neon/sixtappredict_neon.c:1507:13
    #1 0x87d4c0 in (anonymous
       namespace)::PredictTestBase::TestWithRandomData(void (*)(unsigned
       char*, int, int, int, unsigned char*, int))
       test/predict_test.cc:170:9
    ...

0xffff8b5321fb is located 2 bytes to the right of 441-byte region
[0xffff8b532040,0xffff8b5321f9)
allocated by thread T0 here:
    #0 0x5fd4f0 in operator new[](unsigned long) (test_libvpx+0x5fd4f0)
    #1 0x87c2e0 in (anonymous namespace)::PredictTestBase::SetUp()
       test/predict_test.cc:47:12
    #2 0x87d074 in non-virtual thunk to (anonymous
       namespace)::PredictTestBase::SetUp() test/predict_test.cc
    ...

Bug: webm:1795
Change-Id: I32213a381eef91547d00f88acf90f1cf2ec2ea75
2023-03-03 15:33:16 -08:00
James Zern f5dfa780ce disable vpx_get4x4sse_cs_neon
This function causes a heap overflow in the tests:
[ RUN      ] NEON/VpxSseTest.RefSse/0
=================================================================
==876922==ERROR: AddressSanitizer: heap-buffer-overflow on address
0xffff8949d903 at pc 0x000000dd95d4 bp 0xfffffdd7f260 sp 0xfffffdd7f258
READ of size 8 at 0xffff8949d903 thread T0
    #0 0xdd95d0 in vpx_get4x4sse_cs_neon
       vpx_dsp/arm/variance_neon.c:556:10
    #1 0x9d4894 in (anonymous namespace)::MainTestClass<unsigned int
       (*)(unsigned char const*, int, unsigned char const*,
           int)>::RefTestSse() test/variance_test.cc:531:5
    #2 0x9d4894 in (anonymous
       namespace)::VpxSseTest_RefSse_Test::TestBody()
           test/variance_test.cc:772:30
    ...

0xffff8949d903 is located 3 bytes to the right of 16-byte region
[0xffff8949d8f0,0xffff8949d900)
allocated by thread T0 here:
    #0 0x5fd050 in operator new[](unsigned long) (test_libvpx+0x5fd050)
    #1 0x9d3e04 in (anonymous namespace)::MainTestClass<unsigned int
       (*)(unsigned char const*, int, unsigned char const*,
           int)>::SetUp() test/variance_test.cc:299:12

Bug: webm:1794
Change-Id: I4bc681eb9a436743ef8bfe2a2abae59ce754309c
2023-03-03 13:24:02 -08:00
James Zern 394de691a0 Revert "Implement d117_predictor using Neon"
This reverts commit 360e9069b6.

This causes ASan errors:
[ RUN      ] VP9/TestVectorTest.MD5Match/1
=================================================================
==837858==ERROR: AddressSanitizer: stack-buffer-overflow on address
0xffff82ecad40 at pc 0x000000c494d4 bp 0xffffe1695800 sp 0xffffe16957f8
READ of size 16 at 0xffff82ecad40 thread T0
    #0 0xc494d0 in vpx_d117_predictor_32x32_neon (test_libvpx+0xc494d0)
    #1 0x1040b34 in vp9_predict_intra_block (test_libvpx+0x1040b34)
    #2 0xf8feec in decode_block (test_libvpx+0xf8feec)
    #3 0xf8f588 in decode_partition (test_libvpx+0xf8f588)
    #4 0xf7be5c in vp9_decode_frame (test_libvpx+0xf7be5c)
    ...
Address 0xffff82ecad40 is located in stack of thread T0 at offset 64 in
frame
    #0 0x103fd3c in vp9_predict_intra_block (test_libvpx+0x103fd3c)

  This frame has 2 object(s):
    [32, 64) 'left_col.i' <== Memory access at offset 64 overflows this
                              variable
    [96, 176) 'above_data.i'

Change-Id: I058213364617dfe1036126c33a3307f8288d9ae0
2023-03-03 12:34:36 -08:00
Johann ca0c51f05f Revert "Allow macroblock_plane to have its own rounding buffer"
This reverts commit 5359ae810c.

Reason for revert: Blocks quantize cleanups

Original change's description:
> Allow macroblock_plane to have its own rounding buffer
>
> Add 8 bytes buffer to macroblock_plane to support rounding factor.
>
> Change-Id: I3751689e4449c0caea28d3acf6cd17d7f39508ed

Change-Id: Ia2424d2114207370f0b45350313a5ff8521d25a8
2023-03-03 06:24:41 +00:00
Konstantinos Margaritis 817248e1be [SSE4_1] Fix overflow in highbd temporal_filter
While porting this function to NEON, using the SSE4_1 implementation
as a base, I noticed that both produced files with different
checksums from the C reference implementation. After investigating
further I found that this saturating pack was the culprit. Doing
the multiplication on the 32-bit values instead produces results
that match the C implementation.
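
A scalar sketch of this class of bug. The commit message does not show
the actual intrinsic sequence, so the function names and the exact
arithmetic below are assumptions; the model only shows how saturating
to 16 bits before a multiplication can change the result:

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one lane of an unsigned saturating 32->16 bit pack. */
static uint16_t sat_pack_u16(int32_t v) {
  if (v < 0) return 0;
  if (v > UINT16_MAX) return UINT16_MAX;
  return (uint16_t)v;
}

/* Hypothetical "narrow first" path: the value is clamped to 16 bits
 * before it is multiplied, so large inputs lose information. */
static uint32_t scale_after_pack(int32_t sum, uint16_t weight) {
  return (uint32_t)sat_pack_u16(sum) * weight;
}

/* "Wide" path: multiply on the full 32-bit value, as the C code would. */
static uint32_t scale_in_32bit(int32_t sum, uint16_t weight) {
  assert(sum >= 0);
  return (uint32_t)sum * weight;
}
```

The two paths agree whenever the intermediate fits in 16 bits and
diverge once it exceeds 65535, which is consistent with the checksum
mismatch described above.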

Change-Id: I40c2a36551b2db363a58ea9aa19ef327f2676de3
2023-03-02 00:02:16 +00:00
James Zern 508bfc1ff4 Revert "quantize: simplify 32x32_b args"
This reverts commit 848f6e7337.

This has alignment issues, causing crashes in the tests:
SSSE3/VP9QuantizeTest.EOBCheck/*

Change-Id: Ic12014ab0a78ed3cde02d642509061552cdc8fc9
2023-03-01 15:54:49 -08:00
James Zern e4b423e140 Revert "quantize: simplify highbd 32x32_b args"
This reverts commit 573f5e662b.

This has alignment issues, causing crashes in the tests:
SSSE3/VP9QuantizeTest.EOBCheck/*

Change-Id: Ibf05e6b116c46f6e2c11187b3e3578bbd2d2c227
2023-03-01 15:54:48 -08:00
James Zern d98a7b8bd9 Revert "quantize: use scan_order instead of passing scan/iscan"
This reverts commit 14fc40040f.

This has alignment issues, causing crashes in the tests:
SSSE3/VP9QuantizeTest.EOBCheck/*

Change-Id: I934f9a4c3ce3db33058a65180fa645c8649c3670
2023-03-01 15:54:46 -08:00
James Zern 0e7804ca30 Merge "Optimize Neon implementation of high bitdepth MSE functions" into main 2023-03-01 23:13:34 +00:00
James Zern 7478b7e4e4 Revert "Implement highbd_d63_predictor using Neon"
This reverts commit 7cdf139e3d.

This causes failures in the VP9/ExternalFrameBufferMD5Test and
VP9/TestVectorTest.MD5Match tests in both armv7 and aarch64 builds.

Change-Id: I7ac4ba0ddc70e7e7860df9f962e6658defe1cdd5
2023-03-01 12:17:00 -08:00
Salome Thirot 096cd0ba8a Optimize Neon implementation of high bitdepth MSE functions
Currently MSE functions just call the variance helpers but don't
actually use the computed sum. This patch adds dedicated helpers to
perform the computation of sse.
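
A scalar sketch of the idea, not the Neon code itself. The function
names sum_and_sse and sse_only are hypothetical; the point is that an
MSE caller only needs the sum of squared errors, so accumulating the
signed sum is wasted work:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Variance-style helper: accumulates both the signed sum of differences
 * and the sum of squared differences. */
static void sum_and_sse(const uint16_t *src, const uint16_t *ref, size_t n,
                        int64_t *sum, uint64_t *sse) {
  assert(sum != NULL && sse != NULL);
  *sum = 0;
  *sse = 0;
  for (size_t i = 0; i < n; i++) {
    const int32_t d = (int32_t)src[i] - (int32_t)ref[i];
    *sum += d;
    *sse += (uint64_t)((int64_t)d * d);
  }
}

/* Dedicated helper: only the squared error is accumulated. */
static uint64_t sse_only(const uint16_t *src, const uint16_t *ref, size_t n) {
  uint64_t sse = 0;
  for (size_t i = 0; i < n; i++) {
    const int32_t d = (int32_t)src[i] - (int32_t)ref[i];
    sse += (uint64_t)((int64_t)d * d);
  }
  return sse;
}
```

Both helpers return the same sse; the dedicated one simply drops the
unused accumulator.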

Add the corresponding tests as well.

Change-Id: I96a8590e3410e84d77f7187344688e02efe03902
2023-03-01 13:35:03 +00:00
Johann 14fc40040f quantize: use scan_order instead of passing scan/iscan
This further reduces the number of arguments for the 32x32 version. The
same change will be applied to the base version as well.

Change-Id: I25a162b5248b14af53d9e20c6a7fa2a77028a6d1
2023-03-01 07:48:01 +09:00
Johann 573f5e662b quantize: simplify highbd 32x32_b args
Change-Id: I431a41279c4c4193bc70cfe819da6ea7e1d2fba1
2023-03-01 07:35:15 +09:00
James Zern 1ad49b2878 Merge changes I892fbd2c,Ic59df16c,I7228327b,Ib4a1a2cb into main
* changes:
  Implement highbd_d117_predictor using Neon
  Implement highbd_d63_predictor using Neon
  Implement d117_predictor using Neon
  Implement d63_predictor using Neon
2023-02-28 21:50:11 +00:00
James Zern 002ca3fc72 Merge "quantize: simplify 32x32_b args" into main 2023-02-28 21:40:26 +00:00
George Steed 74e4587c89 Implement highbd_d117_predictor using Neon
Add Neon implementations of the highbd d117 predictor for 4x4, 8x8,
16x16 and 32x32 block sizes. Also update tests to add new corresponding
cases.

An explanation of the general implementation strategy is given in the
8x8 implementation body, and is mostly identical to the non-highbd
version.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    1.99
Neoverse N1 |  LLVM 15 |   8x8 |    4.37
Neoverse N1 |  LLVM 15 | 16x16 |    6.81
Neoverse N1 |  LLVM 15 | 32x32 |    6.49
Neoverse N1 |   GCC 12 |   4x4 |    2.49
Neoverse N1 |   GCC 12 |   8x8 |    4.10
Neoverse N1 |   GCC 12 | 16x16 |    5.58
Neoverse N1 |   GCC 12 | 32x32 |    2.16
Neoverse V1 |  LLVM 15 |   4x4 |    1.99
Neoverse V1 |  LLVM 15 |   8x8 |    5.03
Neoverse V1 |  LLVM 15 | 16x16 |    6.61
Neoverse V1 |  LLVM 15 | 32x32 |    6.01
Neoverse V1 |   GCC 12 |   4x4 |    2.09
Neoverse V1 |   GCC 12 |   8x8 |    4.52
Neoverse V1 |   GCC 12 | 16x16 |    4.23
Neoverse V1 |   GCC 12 | 32x32 |    2.70

Change-Id: I892fbd2c17ac527ddc22b91acca907ffc84c5cd2
2023-02-28 11:46:40 +00:00
George Steed 7cdf139e3d Implement highbd_d63_predictor using Neon
Add Neon implementations of the highbd d63 predictor for 4x4, 8x8, 16x16
and 32x32 block sizes. Also update tests to add new corresponding cases.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    2.43
Neoverse N1 |  LLVM 15 |   8x8 |    4.03
Neoverse N1 |  LLVM 15 | 16x16 |    3.07
Neoverse N1 |  LLVM 15 | 32x32 |    4.11
Neoverse N1 |   GCC 12 |   4x4 |    2.92
Neoverse N1 |   GCC 12 |   8x8 |    7.20
Neoverse N1 |   GCC 12 | 16x16 |    4.43
Neoverse N1 |   GCC 12 | 32x32 |    3.18
Neoverse V1 |  LLVM 15 |   4x4 |    1.99
Neoverse V1 |  LLVM 15 |   8x8 |    3.66
Neoverse V1 |  LLVM 15 | 16x16 |    3.60
Neoverse V1 |  LLVM 15 | 32x32 |    3.29
Neoverse V1 |   GCC 12 |   4x4 |    2.39
Neoverse V1 |   GCC 12 |   8x8 |    4.76
Neoverse V1 |   GCC 12 | 16x16 |    3.29
Neoverse V1 |   GCC 12 | 32x32 |    2.43

Change-Id: Ic59df16ceeb468003754b4374be2f4d9af6589e4
2023-02-28 11:46:34 +00:00
George Steed 360e9069b6 Implement d117_predictor using Neon
Add Neon implementations of the d117 predictor for 4x4, 8x8, 16x16 and
32x32 block sizes. Also update tests to add new corresponding cases.

An explanation of the general implementation strategy is given in the
8x8 implementation body.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    1.73
Neoverse N1 |  LLVM 15 |   8x8 |    5.24
Neoverse N1 |  LLVM 15 | 16x16 |    9.77
Neoverse N1 |  LLVM 15 | 32x32 |   14.13
Neoverse N1 |   GCC 12 |   4x4 |    2.04
Neoverse N1 |   GCC 12 |   8x8 |    4.70
Neoverse N1 |   GCC 12 | 16x16 |    8.64
Neoverse N1 |   GCC 12 | 32x32 |    4.57
Neoverse V1 |  LLVM 15 |   4x4 |    1.75
Neoverse V1 |  LLVM 15 |   8x8 |    6.79
Neoverse V1 |  LLVM 15 | 16x16 |    9.16
Neoverse V1 |  LLVM 15 | 32x32 |   14.47
Neoverse V1 |   GCC 12 |   4x4 |    1.75
Neoverse V1 |   GCC 12 |   8x8 |    6.00
Neoverse V1 |   GCC 12 | 16x16 |    7.63
Neoverse V1 |   GCC 12 | 32x32 |    4.32

Change-Id: I7228327b5be27ee7a68deecafa05be0bd2a40ff4
2023-02-28 11:33:21 +00:00
George Steed a7ab16aed1 Implement d63_predictor using Neon
Add Neon implementations of the d63 predictor for 4x4, 8x8, 16x16 and
32x32 block sizes. Also update tests to add new corresponding cases.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    2.10
Neoverse N1 |  LLVM 15 |   8x8 |    4.45
Neoverse N1 |  LLVM 15 | 16x16 |    4.74
Neoverse N1 |  LLVM 15 | 32x32 |    2.27
Neoverse N1 |   GCC 12 |   4x4 |    2.46
Neoverse N1 |   GCC 12 |   8x8 |   10.37
Neoverse N1 |   GCC 12 | 16x16 |   11.46
Neoverse N1 |   GCC 12 | 32x32 |    6.57
Neoverse V1 |  LLVM 15 |   4x4 |    2.24
Neoverse V1 |  LLVM 15 |   8x8 |    3.53
Neoverse V1 |  LLVM 15 | 16x16 |    4.44
Neoverse V1 |  LLVM 15 | 32x32 |    2.17
Neoverse V1 |   GCC 12 |   4x4 |    2.25
Neoverse V1 |   GCC 12 |   8x8 |    7.67
Neoverse V1 |   GCC 12 | 16x16 |    8.97
Neoverse V1 |   GCC 12 | 32x32 |    4.77

Change-Id: Ib4a1a2cb5a5c4495ae329529f8847664cbd0dfe0
2023-02-28 11:32:32 +00:00
Johann 848f6e7337 quantize: simplify 32x32_b args
Now that all the implementations of the 32x32 quantize are in
intrinsics we can reference struct members directly. Saves
pushing them to the stack.

n_coeffs is not used at all for this function.

Change-Id: I2104fea3fa20c455087e21b347d6abd7ea1f3e1e
2023-02-28 18:46:16 +09:00
James Zern 372989240d Merge "Add Neon implementations of standard bitdepth MSE functions" into main 2023-02-28 02:44:28 +00:00
James Zern c70d57c71a Merge "Optimize transpose_neon.h helper functions" into main 2023-02-28 02:36:41 +00:00
James Zern 112945ac7b tools_common,VpxInterface: remove unneeded const
Change-Id: Ic309aab2ff1750bdbcc36e8aafe05d52930ba694
2023-02-27 13:48:47 -08:00
James Zern 0824c8c556 Merge "tools_common,VpxInterface: fix interface fn ptr proto" into main 2023-02-27 19:52:18 +00:00
Salome Thirot ccc101e6bb Add Neon implementations of standard bitdepth MSE functions
Currently only vpx_mse16x16 has a Neon implementation. This patch adds
optimized Armv8.0 and Armv8.4 dot-product paths for all block sizes:
8x8, 8x16, 16x8 and 16x16.

Add the corresponding tests as well.

Change-Id: Ib0357fdcdeb05860385fec89633386e34395e260
2023-02-27 18:03:22 +00:00
Jonathan Wright b25cca8c2e Optimize transpose_neon.h helper functions
1) Use vtrn[12]q_[su]64 in vpx_vtrnq_[su]64* helpers on AArch64
   targets. This produces half as many TRN1/2 instructions compared to
   the number of MOVs that result from vcombine.

2) Use vpx_vtrnq_[su]64* helpers wherever applicable.

3) Refactor transpose_4x8_s16 to operate on 128-bit vectors.

Change-Id: I9a8b1c1fe2a98a429e0c5f39def5eb2f65759127
2023-02-27 09:49:02 +00:00
James Zern 5b2d3d5e42 tools_common,VpxInterface: fix interface fn ptr proto
Use (void) to indicate an empty parameter list and match the declaration
of vpx_codec_vp[89]_[cd]x. This fixes a cfi sanitizer error.
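
A minimal sketch of the pattern, using hypothetical stand-in types
(codec_iface_t, interface_entry_t) rather than the real libvpx ones.
Before C23, a function pointer declared with () has unspecified
parameters and is a different type from one declared with (void);
indirect-call checkers such as CFI compare these types, so the pointer
type should match the function declaration exactly:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a codec interface descriptor. */
typedef struct codec_iface {
  const char *name;
} codec_iface_t;

static codec_iface_t vp9_iface = { "vp9" };

/* Declared with (void): the function takes no arguments. */
static codec_iface_t *get_vp9_iface(void) { return &vp9_iface; }

typedef struct {
  const char *name;
  /* (void) here matches the declaration of get_vp9_iface exactly. */
  codec_iface_t *(*codec_interface)(void);
} interface_entry_t;

static const interface_entry_t kEntry = { "vp9", &get_vp9_iface };

/* Calls through the function pointer and returns the codec name. */
static const char *lookup_name(const interface_entry_t *e) {
  assert(e != NULL && e->codec_interface != NULL);
  return e->codec_interface()->name;
}
```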

Change-Id: I190f432eea4d1765afffd84c7458ec44d863f90c
2023-02-24 19:25:39 -08:00
James Zern 45dc0d34d2 Merge changes I65d86038,If3299fe5,I3ef1ff19 into main
* changes:
  Add Neon implementation of high bitdepth 32x32 hadamard transform
  Add Neon implementation of high bitdepth 16x16 hadamard transform
  Add Neon implementation of high bitdepth 8x8 hadamard transform
2023-02-24 17:58:15 +00:00
James Zern 3cf0568ace Merge changes Ia64d175a,Ie4ea8f0a into main
* changes:
  vp9_loop_filter_alloc: clear -Wshadow warnings
  vp9_adapt_mode_probs: clear -Wshadow warning
2023-02-24 17:49:25 +00:00
Salome Thirot 111068923b Add Neon implementation of high bitdepth 32x32 hadamard transform
Add Neon implementation of vpx_highbd_hadamard_32x32 as well as the
corresponding tests.

Change-Id: I65d8603896649de1996b353aa79eee54824b4708
2023-02-24 11:10:14 +00:00
Salome Thirot 6ec45f933c Add Neon implementation of high bitdepth 16x16 hadamard transform
Add Neon implementation of vpx_highbd_hadamard_16x16 as well as the
corresponding tests.

Change-Id: If3299fe556351dfe3db994ac171d83a95ea1504b
2023-02-24 11:09:57 +00:00
Jerome Jiang 1614895e06 Merge "vp9 rc test: change param type to bool" into main 2023-02-24 01:45:54 +00:00
Jerome Jiang 221d76ab9c vp9 rc test: change param type to bool
Change-Id: Ib45522e32d9137678da9062830044e9dd87537e5
2023-02-23 14:28:30 -05:00
Chi Yo Tsai 49807f88d6 Merge "Disable some intra modes for TX_32X32" into main 2023-02-23 18:01:05 +00:00
Salome Thirot aab93ee6b6 Add Neon implementation of high bitdepth 8x8 hadamard transform
Add Neon implementation of vpx_highbd_hadamard_8x8 as well as the
corresponding tests.

Change-Id: I3ef1ff199d76b6b010591ef15a81b0f36c9ded03
2023-02-23 17:09:52 +00:00
James Zern 76389886ee vp9_loop_filter_alloc: clear -Wshadow warnings
Bug: webm:1793
Change-Id: Ia64d175aa69dc2ecde2babf64bde04f02b32795b
2023-02-22 22:13:02 -08:00
James Zern f569a4d68c vp9_adapt_mode_probs: clear -Wshadow warning
Bug: webm:1793
Change-Id: Ie4ea8f0a3295e6f58dc6f7d5c61d46700c539d40
2023-02-22 22:08:36 -08:00
James Zern 03b97add02 Merge "vp9_block.h: rename diff struct to Diff" into main 2023-02-23 06:07:25 +00:00
chiyotsai 4ba3be9324 Disable some intra modes for TX_32X32
Performance:
| SPD_SET | TESTSET | AVG_PSNR | OVR_PSNR |  SSIM   | ENC_T |
|---------|---------|----------|----------|---------|-------|
|    0    | hdres2  | +0.036%  | +0.032%  | +0.014% | -3.9% |
|    0    | lowres2 | -0.002%  | -0.011%  | +0.020% | -3.6% |
|    0    | midres2 | +0.045%  | +0.025%  | -0.007% | -4.0% |

STATS_CHANGED

Change-Id: I75a927333d26f2a37f0dda57a641b455b845f5b9
2023-02-22 14:36:21 -08:00
James Zern 3712a5869c vpx_subpixel_8t_intrin_avx2: clear -Wshadow warnings
no changes to assembly

Bug: webm:1793
Change-Id: I6a82290cafee7f4a7909d497ccfdefd5a78fb8ed
2023-02-22 12:54:54 -08:00
James Zern 46add73f7e vp9_block.h: rename diff struct to Diff
This matches the style guide and fixes some -Wshadow warnings related to
variables with the same name. Something similar was done in libaom in:
863b04994b Fix warnings reported by -Wshadow: Part2: av1 directory

Bug: webm:1793
Change-Id: I4df1bbc8d079a3174d75f0d35d54c200ffdbb677
2023-02-22 11:59:02 -08:00
Yunqing Wang 910245f1fe Merge "Skip redundant iterations in joint motion search " into main 2023-02-22 19:28:17 +00:00
Jerome Jiang f7ca33c46c Merge "vp9 rc: Make it work for SVC parallel encoding" into main 2023-02-22 14:59:49 +00:00
Salome Thirot 6ed9639e43 Optimize Neon implementation of high bitdepth variance functions
Specialize implementation of high bitdepth variance functions such that
we only widen data processing element types when absolutely necessary.

Change-Id: If4cc3fea7b5ab0821e3129ebd79ff63706a512bf
2023-02-21 20:03:56 +00:00
Deepa K G c4ee2b2f03 Skip redundant iterations in joint motion search
In joint_motion_search, there are four iterations.
Even iterations search in the first reference frame
and odd iterations search in the second. The last two
iterations use the search result of the first two
iterations as the start point. If the search result does
not change, the last two iterations are not necessary and can
be skipped.
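
A rough sketch of the early exit, based only on the description above
and not on the actual encoder code. The types and names (mv_t,
refine_fn, joint_search_sketch and the two example callbacks) are
hypothetical:

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int row, col; } mv_t;

/* Hypothetical refinement callback: searches reference `ref_idx`
 * starting from `start`, given the other reference's current vector. */
typedef mv_t (*refine_fn)(int ref_idx, mv_t start, mv_t other);

static int mv_equal(mv_t a, mv_t b) { return a.row == b.row && a.col == b.col; }

/* Runs up to four alternating iterations and returns how many ran.
 * Iterations 2 and 3 are skipped when iterations 0 and 1 left both
 * vectors unchanged, since they would restart from the same point. */
static int joint_search_sketch(refine_fn refine, mv_t mv[2]) {
  assert(refine != NULL);
  const mv_t start[2] = { mv[0], mv[1] };
  int ite;
  for (ite = 0; ite < 4; ite++) {
    const int id = ite % 2; /* even: first reference, odd: second */
    if (ite == 2 && mv_equal(mv[0], start[0]) && mv_equal(mv[1], start[1])) {
      break;
    }
    mv[id] = refine(id, mv[id], mv[!id]);
  }
  return ite;
}

/* Example callbacks for illustration only. */
static mv_t refine_identity(int ref_idx, mv_t start, mv_t other) {
  (void)ref_idx;
  (void)other;
  return start; /* search finds nothing better */
}

static mv_t refine_step(int ref_idx, mv_t start, mv_t other) {
  (void)ref_idx;
  (void)other;
  start.row += 1; /* search always moves by one row */
  return start;
}
```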

          Instruction Count
cpu-used   Reduction(%)
  0          1.411

Change-Id: Ie583c9f75dd0a22bbdfb432ccdd62eea6ec4fce8
2023-02-21 18:05:23 +05:30
Jerome Jiang 0f888815c5 vp9 rc: Make it work for SVC parallel encoding
Added unit test.

Keep track of spatial layer id and frame type in the case where spatial
layers are encoded in parallel by the hardware encoder.

ComputeQP() / PostEncodeUpdate() don't need to be called sequentially
when there is no inter layer prediction.

Bug: b/257368998
Change-Id: I50beaefcfc205d3f9a9d3dbe11fead5bfdc71489
2023-02-17 20:44:22 -05:00
Jerome Jiang 32c1a4bf3f Merge "vp9 rc: Verify QP for all spatial layers" into main 2023-02-17 02:11:31 +00:00
Jerome Jiang be2fd0c740 vp9 rc: Verify QP for all spatial layers
Change-Id: Ic669c96d25d7c039d370e9acd00dc45e09054552
2023-02-16 19:23:42 -05:00
chiyotsai b737865480 Relax frame recode tolerance on speed 0 to 1 above 480p
Performance:
| SPD_SET | TESTSET | AVG_PSNR | OVR_PSNR |  SSIM   | ENC_T |
|---------|---------|----------|----------|---------|-------|
|    0    | hdres2  | -0.028%  | +0.030%  | -0.408% | -2.0% |
|    0    | lowres2 | +0.000%  | +0.000%  | +0.000% | +0.0% |
|    0    | midres2 | -0.138%  | +0.042%  | -0.427% | -2.5% |
|---------|---------|----------|----------|---------|-------|
|    1    | hdres2  | -0.032%  | +0.018%  | -0.342% | -1.1% |
|    1    | lowres2 | +0.000%  | +0.000%  | +0.000% | +0.0% |
|    1    | midres2 | +0.050%  | +0.060%  | -0.257% | -1.6% |

Rate Error:
|         |         |     AVG_RC_ERROR    |     MAX_RC_ERROR    |
|         |         |---------------------|---------------------|
| SPD_SET | TESTSET |   BASE   |   TEST   |   BASE   |   TEST   |
|---------|---------|----------|----------|----------|----------|
|    0    | hdres2  |  33.044% |  33.065% | 149.903% | 149.903% |
|    0    | midres2 |  59.632% |  59.566% |  79.091% |  79.249% |
|---------|---------|----------|----------|----------|----------|
|    1    | hdres2  |  33.050% |  33.057% | 151.278% | 151.278% |
|    1    | midres2 |  59.640% |  59.614% |  78.707% |  78.842% |

STATS_CHANGED

Change-Id: I5d09601fede3912d5173717ce9dd070df3a97ec8
2023-02-16 13:25:06 -08:00
chiyotsai 660031ccf3 Enable some more speed features on speed 0 to 2
Performance:
| SPD_SET | TESTSET | AVG_PSNR | OVR_PSNR |  SSIM   | ENC_T |
|---------|---------|----------|----------|---------|-------|
|    0    | hdres2  | +0.034%  | +0.030%  | +0.033% | -3.7% |
|    0    | lowres2 | +0.012%  | +0.017%  | +0.044% | -2.1% |
|    0    | midres2 | +0.030%  | +0.035%  | +0.060% | -1.9% |
|---------|---------|----------|----------|---------|-------|
|    1    | hdres2  | +0.027%  | +0.036%  | +0.030% | -2.7% |
|    1    | lowres2 | -0.006%  | -0.002%  | +0.006% | -1.0% |
|    1    | midres2 | -0.006%  | -0.012%  | -0.010% | -1.0% |
|---------|---------|----------|----------|---------|-------|
|    2    | hdres2  | -0.006%  | -0.001%  | -0.020% | -2.4% |
|    2    | lowres2 | -0.010%  | -0.015%  | -0.001% | -0.9% |
|    2    | midres2 | +0.006%  | -0.005%  | +0.009% | -1.0% |

STATS_CHANGED

Change-Id: I1431ac07215bb844739a410697387b9aead82792
2023-02-14 10:11:58 -08:00
James Zern bc2965ff72 Merge changes Id74a6d9c,I5c31e0e9,Id5a2b2d9,I73182c97,I2f5916d5, ... into main
* changes:
  Optimize vpx_highbd_comp_avg_pred_neon
  Add Neon AvgPredTestHBD test suite
  Specialize Neon high bitdepth avg subpel variance by filter value
  Specialize Neon high bitdepth subpel variance by filter value
  Refactor Neon high bitdepth avg subpel variance functions
  Optimize Neon high bitdepth subpel variance functions
2023-02-14 02:46:51 +00:00
Salome Thirot ed68c267cf Optimize vpx_highbd_comp_avg_pred_neon
Optimize the implementation of vpx_highbd_comp_avg_pred_neon by making
use of the URHADD instruction to compute the average.
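
A scalar model of what URHADD computes per lane, as a sketch rather than
the Neon implementation. It assumes the compound average is the rounding
average (pred + ref + 1) >> 1; the function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One URHADD lane: unsigned rounding halving add. The sum is formed in
 * a wider type, so it cannot overflow even for the largest inputs. */
static uint16_t rounding_halving_add_u16(uint16_t a, uint16_t b) {
  return (uint16_t)(((uint32_t)a + (uint32_t)b + 1u) >> 1);
}

/* Hypothetical scalar counterpart of a compound-average predictor. */
static void comp_avg_pred_scalar(uint16_t *comp, const uint16_t *pred,
                                 const uint16_t *ref, size_t n) {
  assert(comp != NULL && pred != NULL && ref != NULL);
  for (size_t i = 0; i < n; i++) {
    comp[i] = rounding_halving_add_u16(pred[i], ref[i]);
  }
}
```

Because the instruction performs the add, round and shift in one step,
no separate widening, adding and narrowing sequence is required.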

Change-Id: Id74a6d9c33e89bc548c3c7ecace59af69051b4a7
2023-02-13 20:23:14 +00:00
Salome Thirot b17993ca67 Add Neon AvgPredTestHBD test suite
Add test suite for vpx_highbd_comp_avg_pred_neon.

Change-Id: I5c31e0e990661ee3b8030bb517829c088fceae4d
2023-02-13 20:23:09 +00:00
Salome Thirot e03217c9d5 Specialize Neon high bitdepth avg subpel variance by filter value
Use the same specialization as for standard bitdepth. The rationale for
the specialization is as follows:

The optimal implementation of the bilinear interpolation depends on the
filter values being used. For both horizontal and vertical interpolation
this can simplify to just taking the source values, or averaging the
source and reference values - which can be computed more easily than a
bilinear interpolation with arbitrary filter values.

This patch introduces tests to find the most optimal bilinear
interpolation implementation based on the filter values being used.
This new specialization is only used for larger block sizes.
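
A scalar sketch of the specialization, assuming a two-tap bilinear
filter with taps (128 - 16 * offset, 16 * offset) and a 7-bit rounding
shift. The function names are hypothetical and the real code operates
on Neon vectors:

```c
#include <assert.h>
#include <stdint.h>

/* General path: two multiplies, a rounding add and a shift per pixel. */
static uint16_t bilinear_general(uint16_t a, uint16_t b, int offset) {
  const uint32_t f1 = 16u * (uint32_t)offset;
  const uint32_t f0 = 128u - f1;
  return (uint16_t)((a * f0 + b * f1 + 64u) >> 7);
}

/* Specialized path: pick a cheaper computation for the two offsets
 * where the filter degenerates. */
static uint16_t bilinear_specialized(uint16_t a, uint16_t b, int offset) {
  assert(offset >= 0 && offset < 8);
  if (offset == 0) return a; /* taps (128, 0): plain copy */
  if (offset == 4) {         /* taps (64, 64): rounding average */
    return (uint16_t)(((uint32_t)a + (uint32_t)b + 1u) >> 1);
  }
  return bilinear_general(a, b, offset);
}
```

Under these assumptions both paths give identical results for every
offset, so the specialization changes only the cost, not the output.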

Change-Id: Id5a2b2d9fac6f878795a6ed9de2bc27d9e62d661
2023-02-13 20:23:02 +00:00
Salome Thirot c113d6b027 Specialize Neon high bitdepth subpel variance by filter value
Use the same specialization as for standard bitdepth. The rationale for
the specialization is as follows:

The optimal implementation of the bilinear interpolation depends on the
filter values being used. For both horizontal and vertical interpolation
this can simplify to just taking the source values, or averaging the
source and reference values - which can be computed more easily than a
bilinear interpolation with arbitrary filter values.

This patch introduces tests to find the most optimal bilinear
interpolation implementation based on the filter values being used.
This new specialization is only used for larger block sizes.

Change-Id: I73182c979255f0332a274f2e5907df7f38c9eeb3
2023-02-13 20:22:56 +00:00
Salome Thirot 7343d56c1b Refactor Neon high bitdepth avg subpel variance functions
Use the same general code style as in the standard bitdepth Neon
implementation - merging the computation of vpx_highbd_comp_avg_pred
with the second pass of the bilinear filter to avoid storing and loading
the block again.

Also move vpx_highbd_comp_avg_pred_neon to its own file (like the
standard bitdepth implementation) since we're no longer using it for
averaging sub-pixel variance.

Change-Id: I2f5916d5b397db44b3247b478ef57046797dae6c
2023-02-13 20:22:50 +00:00
Salome Thirot 42cb3dbf94 Optimize Neon high bitdepth subpel variance functions
Use the same general code style as in the standard bitdepth Neon
implementation. Additionally, do not unnecessarily widen to 32-bit data
types when doing bilinear filtering - allowing us to process twice as
many elements per instruction.

Change-Id: I1e178991d2aa71f5f77a376e145d19257481e90f
2023-02-13 20:19:30 +00:00
James Zern b5e1945af0 README: update release version to 1.13.0
this was missed in the v1.13.0 tag

Bug: webm:1780
Change-Id: I3044534123bf67861174970e6241f6586055358e
(cherry picked from commit 184a886917)
2023-02-13 18:35:10 +00:00
James Zern 184a886917 README: update release version to 1.13.0
this was missed in the v1.13.0 tag

Bug: webm:1780
Change-Id: I3044534123bf67861174970e6241f6586055358e
2023-02-10 19:04:41 -08:00
Chi Yo Tsai 5595e18870 Merge "Remove CONFIG_CONSISTENT_RECODE flag" into main 2023-02-10 22:13:50 +00:00
chiyotsai 086f0e6538 Remove CONFIG_CONSISTENT_RECODE flag
Currently, libvpx does not properly clear and re-initialize the memories
when it re-encodes a frame. As a result, out-of-date values are used in
the encoding process, and re-encoding a frame with the same parameter
will give different outputs.

This commit enables the code under CONFIG_CONSISTENT_RECODE to correct
this behavior. This change has minor effect on the coding performance,
but it ensures valid values are used in the encoding process.

Furthermore, the flag is removed as it is now always turned on.

Performance:
| SPD_SET | TESTSET | AVG_PSNR | OVR_PSNR |  SSIM   | ENC_T |
|---------|---------|----------|----------|---------|-------|
|    0    | hdres2  | -0.012%  | -0.021%  | -0.030% | +0.1% |
|    0    | lowres2 | +0.029%  | +0.019%  | +0.047% | +0.1% |
|    0    | midres2 | -0.004%  | +0.009%  | +0.026% | +0.1% |
|---------|---------|----------|----------|---------|-------|
|    1    | hdres2  | +0.032%  | +0.032%  | -0.000% | -0.0% |
|    1    | lowres2 | -0.005%  | -0.011%  | -0.014% | +0.0% |
|    1    | midres2 | +0.004%  | +0.020%  | +0.027% | +0.2% |
|---------|---------|----------|----------|---------|-------|
|    2    | hdres2  | +0.048%  | +0.056%  | +0.057% | +0.1% |
|    2    | lowres2 | +0.007%  | +0.002%  | -0.016% | -0.0% |
|    2    | midres2 | -0.015%  | -0.008%  | -0.002% | +0.1% |
|---------|---------|----------|----------|---------|-------|
|    3    | hdres2  | +0.010%  | +0.014%  | +0.004% | -0.0% |
|    3    | lowres2 | +0.000%  | -0.021%  | -0.001% | +0.0% |
|    3    | midres2 | +0.007%  | -0.038%  | +0.012% | -0.2% |
|---------|---------|----------|----------|---------|-------|
|    4    | hdres2  | +0.107%  | +0.136%  | +0.124% | -0.0% |
|    4    | lowres2 | -0.012%  | -0.024%  | -0.020% | -0.0% |
|    4    | midres2 | +0.055%  | -0.004%  | +0.048% | -0.1% |
|---------|---------|----------|----------|---------|-------|
|    5    | hdres2  | +0.026%  | +0.027%  | +0.020% | -0.0% |
|    5    | lowres2 | +0.009%  | -0.008%  | +0.028% | +0.1% |
|    5    | midres2 | -0.025%  | +0.021%  | -0.020% | -0.1% |

STATS_CHANGED

Change-Id: I3967aee8c8e4d0608a492e07f99ab8de9744ba57
2023-02-10 13:06:51 -08:00
James Zern 924716523e Merge "Optimize Neon high bitdepth convolve copy" into main 2023-02-10 03:35:22 +00:00
Jerome Jiang f903d99650 Merge "Merge tag 'v1.13.0'" into main 2023-02-09 22:07:28 +00:00
Jerome Jiang e8bd0842c5 Merge "Remove onyx_int.h from vp8 rc header" into main 2023-02-09 21:27:59 +00:00
Jerome Jiang 5edaa583e1 Remove onyx_int.h from vp8 rc header
Also move the FRAME_TYPE declaration to common.h

Bug: webm:1766

Change-Id: Ic3016bd16548a5d2e0ae828a7fd7ad8adda8b8f6
2023-02-09 15:15:59 -05:00
Jerome Jiang 121dc7513f Merge tag 'v1.13.0'
Release v1.13.0 Ugly Duckling

2023-01-31 v1.13.0 "Ugly Duckling"

  This release includes more Neon and AVX2 optimizations, adds a new codec
  control to set per frame QP, upgrades GoogleTest to v1.12.1, and includes
  numerous bug fixes.

- Upgrading:
    This release is ABI incompatible with the previous release.

    New codec control VP9E_SET_QUANTIZER_ONE_PASS to set per frame QP.

    GoogleTest is upgraded to v1.12.1.

    .clang-format is upgraded to clang-format-11.

    VPX_EXT_RATECTRL_ABI_VERSION was bumped due to incompatible changes to the
    feature of using external rate control models for vp9.

- Enhancement:
    Numerous improvements on Neon optimizations.
    Numerous improvements on AVX2 optimizations.
    Additional ARM targets added for Visual Studio.

- Bug fixes:
    Fix to calculating internal stats when frame dropped.
    Fix to segfault for external resize test in vp9.
    Fix to build system with replacing egrep with grep -E.
    Fix to a few bugs with external RTC rate control library.
    Fix to make SVC work with VBR.
    Fix to key frame setting in VP9 external RC.
    Fix to -Wimplicit-int (Clang 16).
    Fix to VP8 external RC for buffer levels.
    Fix to VP8 external RC for dynamic update of layers.
    Fix to VP9 auto level.
    Fix to off-by-one error of max w/h in validate_config.
    Fix to make SVC work for Profile 1.

Bug: webm:1780

Change-Id: I371fc1444ead56f8d7fc510e05582b6415c3ddb1
2023-02-09 20:00:46 +00:00
Jonathan Wright 459cfc8bae Optimize Neon high bitdepth convolve copy
Use standard loads and stores instead of the significantly slower
interleaving/de-interleaving variants. Also move all loads in loop
bodies above all stores as a mitigation against the compiler thinking
that the src and dst pointers alias (since we can't use restrict in
C89.)
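
A scalar sketch of the "loads before stores" ordering, with a
hypothetical function name. Without restrict the compiler must assume a
store through dst could change what src points to, so issuing every load
of an iteration before any store removes that dependency from the loop
body:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copies a w x h block of 16-bit pixels; w must be a multiple of 4. */
static void copy_block_loads_first(const uint16_t *src, ptrdiff_t src_stride,
                                   uint16_t *dst, ptrdiff_t dst_stride,
                                   int w, int h) {
  assert(w > 0 && w % 4 == 0 && h > 0);
  for (int y = 0; y < h; y++) {
    for (int x = 0; x < w; x += 4) {
      /* All loads for this group happen first... */
      const uint16_t s0 = src[x + 0];
      const uint16_t s1 = src[x + 1];
      const uint16_t s2 = src[x + 2];
      const uint16_t s3 = src[x + 3];
      /* ...and only then the stores. */
      dst[x + 0] = s0;
      dst[x + 1] = s1;
      dst[x + 2] = s2;
      dst[x + 3] = s3;
    }
    src += src_stride;
    dst += dst_stride;
  }
}
```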

Change-Id: Idd59dca51387f553f8db27144a2b8f2377c937d3
2023-02-09 12:14:18 +00:00
Chi Yo Tsai d3275163c1 Merge "Copy BLOCK_8X8's mi to PICK_MODE_CONTEXT::mic" into main 2023-02-08 23:16:48 +00:00
chiyotsai b6951d2b0f Copy BLOCK_8X8's mi to PICK_MODE_CONTEXT::mic
STATS_CHANGED

BUG=webm:1789

Change-Id: I74efe28bdf90a179c59fe3d1f5a15d497f57080d
2023-02-08 14:01:19 -08:00
Salome Thirot bb065c6c6d Add missing high bitdepth Neon subpel variance tests
Add missing 4x4 and 4x8 tests for both high bitdepth sub-pixel variance
and high bitdepth averaging sub-pixel variance.

Change-Id: I042752c5b7ccc14f58075694d0bb1d36f144ad06
2023-02-08 19:28:09 +00:00
Cheng Chen d6eb9696aa Fix unsigned integer overflow in sse computation
Basically port the fix from libaom:
https://aomedia-review.googlesource.com/c/aom/+/169361

Change-Id: Id06a5db91372037832399200ded75d514e096726
(cherry picked from commit a94cdd57ff)
2023-02-08 01:33:51 +00:00
Chi Yo Tsai 73cdc9fd1e Merge "Enable some speed features on speed 0" into main 2023-02-08 00:44:46 +00:00
chiyotsai 03ddac40df Enable some speed features on speed 0
Performance:
| SPD_SET | TESTSET | AVG_PSNR | OVR_PSNR |  SSIM   | ENC_T |
|---------|---------|----------|----------|---------|-------|
|    0    | hdres2  | +0.069%  | +0.067%  | +0.100% | -8.6% |
|    0    | midres2 | +0.116%  | +0.103%  | +0.062% | -9.6% |
|    0    | lowres2 | +0.276%  | +0.283%  | +0.214% |-11.9% |

STATS_CHANGED

Change-Id: I8b26c0be2312fcd0f8c9e889367682e80ea8de4b
2023-02-07 15:06:06 -08:00
Salome Thirot 25a6b2b181 Use 4D reduction Neon helper for standard bitdepth SAD4D
Move the 4D reduction helper function to sum_neon.h and use this for
both standard and high bitdepth SAD4D paths. This also removes the
AArch64 requirement for using the UDOT Neon SAD4D paths.

Change-Id: I207f76b3d42aa541809b0672c3b3d86e54d133ff
2023-02-07 17:08:48 +00:00
Yunqing Wang 9b910a65ed Merge "Move TPL to a new file" into main 2023-02-07 04:22:40 +00:00
James Zern 9a26870002 Merge changes Ica45c44f,I75c5f099,I9e626d7f into main
* changes:
  Optimize Neon implementation of high bitdepth SAD4D functions
  Optimize Neon implementation of high bitdepth avg SAD functions
  Optimize Neon implementation of high bitdepth SAD functions
2023-02-07 01:32:03 +00:00
Yunqing Wang ec8e2fe1cf Move TPL to a new file
This is a refactoring CL.

Change-Id: Ic8c1575601d27f14ecd1b1bf0a038e447eaae458
2023-02-06 16:34:08 -08:00
Jerome Jiang d2557313d2 Merge "Remove duplicated VPX_SCALING declaration" into main 2023-02-06 22:16:41 +00:00
Salome Thirot 6b8e9e1f3e Optimize Neon implementation of high bitdepth SAD4D functions
Optimizations take a similar form to those implemented for Armv8.0
standard bitdepth SAD4D:

- Use ABD, UADALP instead of ABAL, ABAL2 (double the throughput on
  modern out-of-order Arm-designed cores.)
- Use more accumulator registers to make better use of Neon pipeline
  resources on Arm CPUs that have four Neon pipes.
- Compute the four SAD sums in parallel so that we only load the source
  block once - instead of four times.
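
As a rough scalar illustration of the last point (this is not the Neon code
from this change; the function name and signature are made up for the
sketch), each source sample is read once and reused for all four reference
blocks:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar model of a high bitdepth SAD4D: one pass over the
 * source block feeds four independent SAD accumulators. */
static void highbd_sad4d_sketch(const uint16_t *src, int src_stride,
                                const uint16_t *const ref[4], int ref_stride,
                                int width, int height, uint32_t sad[4]) {
  int r, c, k;
  assert(width > 0 && height > 0);
  for (k = 0; k < 4; ++k) sad[k] = 0;
  for (r = 0; r < height; ++r) {
    for (c = 0; c < width; ++c) {
      /* Single load of the source sample, shared by all four references. */
      const int s = src[r * src_stride + c];
      for (k = 0; k < 4; ++k) {
        sad[k] += (uint32_t)abs(s - (int)ref[k][r * ref_stride + c]);
      }
    }
  }
}
```

A vectorized version would keep the four sums in separate accumulator
registers; the scalar loop only shows the data reuse.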

Change-Id: Ica45c44fd167e5fcc83871d8c138fc72ed3a9723
2023-02-06 21:04:52 +00:00
Jerome Jiang 5eea5c7666 Remove duplicated VPX_SCALING declaration
Use VPX_SCALING_MODE instead

Change-Id: Iab9d29f20838703e00bd9f7641035d8ebd69af53
2023-02-06 13:32:37 -05:00
Salome Thirot 9a5cbfbc08 Optimize Neon implementation of high bitdepth avg SAD functions
Optimizations take a similar form to those implemented for standard
bitdepth averaging SAD:

- Use ABD, UADALP instead of ABAL, ABAL2 (double the throughput on
  modern out-of-order Arm-designed cores.)
- Use more accumulator registers to make better use of Neon pipeline
  resources on Arm CPUs that have four Neon pipes.

Change-Id: I75c5f09948f6bf17200f82e00e7a827a80451108
2023-02-06 15:54:57 +00:00
Salome Thirot e3028ddbb4 Optimize Neon implementation of high bitdepth SAD functions
Optimizations take a similar form to those implemented for standard
bitdepth SAD:

- Use ABD, UADALP instead of ABAL, ABAL2 (double the throughput on
  modern out-of-order Arm-designed cores.)
- Use more accumulator registers to make better use of Neon pipeline
  resources on Arm CPUs that have four Neon pipes.

Change-Id: I9e626d7fa0e271908dc43448405a7985b80e6230
2023-02-06 15:51:43 +00:00
Yunqing Wang a77c7a78ae Merge "Fix uninitialized mesh feature for BEST mode" into main 2023-02-03 23:22:58 +00:00
Wan-Teh Chang 18a3421b7d Set _img->bit_depth in y4m_input_fetch_frame()
This is a port of
https://aomedia-review.googlesource.com/c/aom/+/169961.

Change-Id: I2aa0d12cafde0c73448bf8c57eab0cd92e846468
2023-02-03 14:07:09 -08:00
Yunqing Wang d6382e4469 Fix uninitialized mesh feature for BEST mode
At BEST encoding mode, the mesh search range wasn't initialized for
non FC_GRAPHICS_ANIMATION content type, which actually/mistakenly
used speed 0's setting. Fixed it by adding the initialization.

There were 2 ways to fix this. Patchset 1 set to use speed 0's setting
for non FC_GRAPHICS_ANIMATION type. This didn't change BEST mode's
encoding results much, and only a couple of clips' results were changed.

Borg result for BEST mode:
         avg_psnr:  ovr_psnr:  ssim:  encoding_spdup:
lowres2:  -0.004     -0.003   -0.000    0.030
midres2:  -0.006     -0.009   -0.012    0.033
hdres2:    0.002      0.002    0.004    0.015

Patchset 2 set to use BEST's setting for non FC_GRAPHICS_ANIMATION type.
However, the majority of test clips' BDrate got changed up to
~0.5% (gain or loss), and overall it didn't give better performance
than patchset 1. So, we chose to use patchset 1.

Change-Id: Ibbf578dad04420e6ba22cb9a3ddec137a7e4deef
2023-02-03 12:02:11 -08:00
James Zern 858a8c611f vp9_diamond_search_sad_neon: use DECLARE_ALIGNED
rather than the gcc specific __attribute__((aligned())); fixes build
targeting ARM64 windows.
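
A minimal sketch of why a wrapper macro helps here, assuming a macro shaped
like DECLARE_ALIGNED(n, typ, val). The SKETCH_ name and the helper function
are hypothetical and are not the library's definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable alignment macro: pick the compiler-specific syntax
 * in one place so call sites stay identical on GCC/Clang and MSVC. */
#if defined(_MSC_VER)
#define SKETCH_DECLARE_ALIGNED(n, typ, val) __declspec(align(n)) typ val
#elif defined(__GNUC__)
#define SKETCH_DECLARE_ALIGNED(n, typ, val) typ val __attribute__((aligned(n)))
#else
/* No known alignment syntax: falls back to natural alignment only. */
#define SKETCH_DECLARE_ALIGNED(n, typ, val) typ val
#endif

/* Declares a 16-byte aligned local table and returns the sum of its
 * entries. The assert holds on the two supported compiler families. */
static uint32_t sum_aligned_table(void) {
  SKETCH_DECLARE_ALIGNED(16, uint32_t, table[4]) = { 1, 2, 3, 4 };
  assert(((uintptr_t)table % 16) == 0);
  return table[0] + table[1] + table[2] + table[3];
}
```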

Bug: webm:1788
Change-Id: I2210fc215f44d90c1ce9dee9b54888eb1b78c99e
2023-02-01 14:50:01 -08:00
Jerome Jiang b5a2b3a929 Update AUTHORS .mailmap and version
Bug: webm:1780
Change-Id: I75a24bdd076dc1746b23bababfaafccbce3b4214
2023-02-01 16:30:35 -05:00
Jerome Jiang aa5b62236a Fix per frame qp for temporal layers
Also add tests with fixed temporal layering mode.

Change-Id: If516fe94e3fb7f5a745821d1788bfe6cf90edaac
(cherry picked from commit db69ce6aea)
2023-01-31 18:58:26 -05:00
Jerome Jiang 3f109f786a Update CHANGELOG
Bug: webm:1780
Change-Id: I3ab4729bff1d27ef7127ef26e780a469e9278c21
2023-01-31 18:58:21 -05:00
James Zern 50e1b76e32 Merge "Use load_unaligned mem_neon.h helpers in SAD and SAD4D" into main 2023-01-31 21:20:16 +00:00
Jonathan Wright 472c839c9f Use load_unaligned mem_neon.h helpers in SAD and SAD4D
Use the load_unaligned helper functions in mem_neon.h to load strided
sequences of 4 bytes where alignment is not guaranteed in the Neon
SAD and SAD4D paths.

Change-Id: I941d226ef94fd7a633b09fc92165a00ba68a1501
2023-01-31 15:39:21 +00:00
Cheng Chen a94cdd57ff Fix unsigned integer overflow in sse computation
Basically port the fix from libaom:
https://aomedia-review.googlesource.com/c/aom/+/169361

Change-Id: Id06a5db91372037832399200ded75d514e096726
2023-01-30 23:02:22 +00:00
James Zern 7a8ba7ea02 Merge "Refactor 8x8 16-bit Neon transpose functions" into main 2023-01-30 19:30:45 +00:00
Salome Thirot 8047e6f2b3 Refactor Neon implementation of SAD4D functions
Refactor and optimize the Neon implementation of SAD4D functions -
effectively backporting these libaom changes[1,2].

[1] https://aomedia-review.googlesource.com/c/aom/+/162181
[2] https://aomedia-review.googlesource.com/c/aom/+/162183

Change-Id: Icb04bd841d86f2d0e2596aa7ba86b74f8d2d360b
2023-01-30 13:14:54 +00:00
Yunqing Wang 698392d7fe Merge "Add encoder component timing information" into main 2023-01-28 00:27:57 +00:00
Yunqing Wang 5dd3d70a4f Add encoder component timing information
Change-Id: Iaa5b73a9593ecfd74b6426ed47d2b529ec7ae2b5
2023-01-27 11:33:29 -08:00
Gerda Zsejke More 5e92d6d103 Refactor 8x8 16-bit Neon transpose functions
Refactor the Neon implementation of transpose_s16_8x8(q) and
transpose_u16_8x8 so that the final step compiles to 8 ZIP1/ZIP2
instructions as opposed to 8 EXT, MOV pairs. This change removes 8
instructions per call to transpose_s16_8x8(q), transpose_u16_8x8
where the result stays in registers for further processing - rather
than being stored to memory - like in vpx_hadamard_8x8_neon, for
example.

This is a backport of this libaom patch[1].
[1] https://aomedia-review.googlesource.com/c/aom/+/169426

Change-Id: Icef3e51d40efeca7008e1c4fc701bf39bd319c88
2023-01-27 17:13:30 +01:00
Jerome Jiang 5c38ffbfa3 Merge "Fix per frame qp for temporal layers" into main 2023-01-26 21:31:14 +00:00
Jerome Jiang db69ce6aea Fix per frame qp for temporal layers
Also add tests with fixed temporal layering mode.

Change-Id: If516fe94e3fb7f5a745821d1788bfe6cf90edaac
2023-01-26 14:53:40 -05:00
James Zern ade7b131cc Merge "Refactor Neon implementation of SAD functions" into main 2023-01-26 03:26:38 +00:00
James Zern 2f24e444dd Merge "[NEON] Add Highbd FHT 8x8/16x16 functions" into main 2023-01-26 03:23:31 +00:00
Salome Thirot 7fed9187c4 Refactor Neon implementation of SAD functions
Refactor and optimize the Neon implementation of SAD functions -
effectively backporting these libaom changes[1,2,3].

[1] https://aomedia-review.googlesource.com/c/aom/+/161921
[2] https://aomedia-review.googlesource.com/c/aom/+/161923
[3] https://aomedia-review.googlesource.com/c/aom/+/166963

Change-Id: I2d72fd0f27d61a3e31a78acd33172e2afb044cb8
2023-01-25 15:35:51 +00:00
Konstantinos Margaritis 3384b83da0 [NEON] Add Highbd FHT 8x8/16x16 functions
In total this gives about 9% extra performance for both rt/best
profiles.
Furthermore, add transpose_s32 16x16 function

Change-Id: Ib6f368bbb9af7f03c9ce0deba1664cef77632fe2
2023-01-24 20:56:02 +00:00
Jerome Jiang 72cfcdd95a Skip calculating internal stats when frame dropped
Bug: webm:1771
Change-Id: I30cd5b7ec0945b521a1cc03999d39ec6a25f1696
2023-01-24 14:08:17 -05:00
Salome Thirot 67abc67389 Specialize Neon averaging subpel variance by filter value
Use the same specialization for averaging subpel variance functions
as used for the non-averaging variants. The rationale for the
specialization is as follows:

The optimal implementation of the bilinear interpolation depends on
the filter values being used. For both horizontal and vertical
interpolation this can simplify to just taking the source values, or
averaging the source and reference values - which can be computed
more easily than a bilinear interpolation with arbitrary filter
values.

This patch introduces tests to find the optimal bilinear
interpolation implementation based on the filter values being used.
This new specialization is only used for larger block sizes.

This is a backport of this libaom change[1].

After this change, the only differences between the code in libvpx and
libaom are due to libvpx being compiled with ISO C90, which forbids
mixing declarations and code [-Wdeclaration-after-statement].

[1] https://aomedia-review.googlesource.com/c/aom/+/166962

Change-Id: I7860c852db94a7c9c3d72ae4411316685f3800a4
2023-01-23 15:06:28 +00:00
Salome Thirot b7f6c64139 Refactor Neon averaging subpel variance functions
Merge the computation of vpx_comp_avg_pred into the second pass of the
bilinear filter - avoiding the overhead of loading and storing the
entire block again.

This is a backport of this libaom change[1].

[1] https://aomedia-review.googlesource.com/c/aom/+/166961

Change-Id: I9327ff7382a46d50c42a5213a11379b957146372
2023-01-23 15:06:20 +00:00
Salome Thirot ae5b60cb47 Specialize Neon subpel variance by filter value for large blocks
The optimal implementation of the bilinear interpolation depends on
the filter values being used. For both horizontal and vertical
interpolation this can simplify to just taking the source values, or
averaging the source and reference values - which can be computed
more easily than a bilinear interpolation with arbitrary filter
values.

This patch introduces tests to find the optimal bilinear
interpolation implementation based on the filter values being used.
This new specialization is only used for larger block sizes
(>= 16x16) as we need to be doing enough work to make the cost of
finding the optimal implementation worth it.
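
The idea can be sketched in scalar C as follows. This is an illustration
only, not the Neon code: the function names are hypothetical, and the sketch
assumes 2-tap filters whose taps sum to 128 in steps of 16 per eighth-pel
offset, with a rounding shift of 7.

```c
#include <assert.h>
#include <stdint.h>

/* General 2-tap bilinear filter for an eighth-pel offset in [0, 7].
 * Assumed taps: (128 - 16 * offset, 16 * offset). */
static uint8_t bilinear_general(uint8_t a, uint8_t b, int offset) {
  const int f1 = offset * 16;
  const int f0 = 128 - f1;
  assert(offset >= 0 && offset < 8);
  return (uint8_t)((a * f0 + b * f1 + 64) >> 7);
}

/* Specialized by filter value: offset 0 is a plain copy and offset 4 is a
 * rounding average; only the remaining offsets need the multiplies. */
static uint8_t bilinear_specialized(uint8_t a, uint8_t b, int offset) {
  if (offset == 0) return a;
  if (offset == 4) return (uint8_t)((a + b + 1) >> 1);
  return bilinear_general(a, b, offset);
}
```

Both functions return the same value for every input; the specialized one
just avoids the multiplies in the two common cases, which is why the check
on the filter value only pays off when the block is large enough.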

This is a backport of this libaom change[1].

After this change, the only differences between the code in libvpx and
libaom are due to libvpx being compiled with ISO C90, which forbids
mixing declarations and code [-Wdeclaration-after-statement].

[1] https://aomedia-review.googlesource.com/c/aom/+/162463

Change-Id: Ia818e148f6fd126656e8411d59c184b55dd43094
2023-01-23 13:11:59 +00:00
Salome Thirot fcfb471ce2 Refactor Neon subpel variance functions
Refactor the Neon implementation of the sub-pixel variance bilinear
filter helper functions - effectively backporting this libaom patch[1].

[1] https://aomedia-review.googlesource.com/c/aom/+/162462

Change-Id: I3dee32e8125250bbeffeb63d1fef5da559bacbf1
2023-01-23 12:03:20 +00:00
Jerome Jiang b7c22b3a95 Merge "Add codec control to set per frame QP" into main 2023-01-20 17:14:04 +00:00
Jerome Jiang ae4240edc7 Add codec control to set per frame QP
Use case is for 1 pass encoding.
Forces max_quantizer = min_quantizer and aq-mode = 0.
Applicable to spatial layers, where the user may set
the QP per spatial layer.

Change-Id: Idfcb7daefde94c475ed1bc0eb8af47c9f309110b
2023-01-19 20:38:44 -05:00
James Zern 308d8638aa Merge "Refactor Neon implementation of variance functions" into main 2023-01-19 19:44:43 +00:00
James Zern 5e86179533 */Android.mk: add a check for NDK_ROOT
This simplifies integration with the Android platform and avoids the
files from being used when a non-NDK build is performed. In that case
Android.bp is preferred.

Change-Id: I803912146dac788b7f0af27199c7613cabbc9fa0
2023-01-18 19:19:01 -08:00
Salome Thirot 0ce866562f Refactor Neon implementation of variance functions
Refactor and optimize the Neon implementation of variance functions -
effectively backporting these libaom changes[1,2].

After this change, the only differences between the code in libvpx and
libaom are due to libvpx being compiled with ISO C90, which forbids
mixing declarations and code [-Wdeclaration-after-statement].

[1] https://aomedia-review.googlesource.com/c/aom/+/162241
[2] https://aomedia-review.googlesource.com/c/aom/+/162262

Change-Id: Ia4e8fff4d53297511d1a1e43bca8053bf811e551
2023-01-18 21:35:33 +00:00
Marco Paniconi 7a8052ccda Merge "Fix to segfault for external resize test in vp9" into main 2023-01-18 02:04:18 +00:00
Marco Paniconi 71d01660cc Fix to segfault for external resize test in vp9
Failure occurs for 1 pass non-realtime mode at speed 0.
Due to speed feature rd_ml_partition.var_pruning, which
doesn't check for scaled reference in simple_motion_search().

Bug: webm:1768

Change-Id: Iddcb56033bac042faebb5196eed788317590b23f
2023-01-13 20:21:12 -08:00
Scott LaVarnway 59d4a68616 variance_test.cc: Enable HBDMse speed test.
Change-Id: If0226307a6efd704f8a35cb986f570304d698b95
2023-01-13 07:39:41 -08:00
Scott LaVarnway 35008f5e13 Merge "variance_test.cc: Enable VpxHBDMseTest for C and SSE2." into main 2023-01-13 13:36:15 +00:00
Scott LaVarnway 32878bb1f3 variance_test.cc: Enable VpxHBDMseTest for C and SSE2.
Change-Id: I66c0db6c605876d6757684fd715614881ca261e7
2023-01-12 13:29:49 -08:00
James Zern e9cc25c51c Merge changes Ifbf46768,If19f5872 into main
* changes:
  Implement vertical convolutions using Neon USDOT instruction
  Implement horizontal convolutions using Neon USDOT instruction
2023-01-12 18:41:27 +00:00
Jonathan Wright 5645938c36 Implement vertical convolutions using Neon USDOT instruction
Add additional AArch64 paths for vpx_convolve8_vert_neon and
vpx_convolve8_avg_vert_neon that use the Armv8.6-A USDOT (mixed-sign
dot-product) instruction. The USDOT instruction takes an 8-bit
unsigned operand vector and a signed 8-bit operand vector to produce
a signed 32-bit result. This is helpful because convolution filters
often have both positive and negative values, while the 8-bit pixel
channel data being filtered is all unsigned. As a result, the USDOT
convolution paths added here do not have to do the "transform the
pixel channel data to [-128, 128) and correct for it later" dance
that we have to do with the SDOT paths.

The USDOT instruction is optional from Armv8.2 to Armv8.5 but
mandatory from Armv8.6 onwards. The availability of the USDOT
instruction is indicated by the feature macro
__ARM_FEATURE_MATMUL_INT8. The SDOT paths are retained for use on
target CPUs that do not implement the USDOT instructions.
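
A scalar model of one 32-bit lane may make the difference between the two
paths concrete. This is a sketch with hypothetical helper names, not the
Neon code: it models USDOT and SDOT as four-element dot products and shows
the range shift and correction needed on the SDOT path.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of USDOT: unsigned 8-bit pixels times signed 8-bit taps. */
static int32_t usdot_lane(int32_t acc, const uint8_t px[4],
                          const int8_t f[4]) {
  int i;
  for (i = 0; i < 4; ++i) acc += (int32_t)px[i] * (int32_t)f[i];
  return acc;
}

/* One lane of SDOT: both operands are signed 8-bit. */
static int32_t sdot_lane(int32_t acc, const int8_t a[4], const int8_t f[4]) {
  int i;
  for (i = 0; i < 4; ++i) acc += (int32_t)a[i] * (int32_t)f[i];
  return acc;
}

/* SDOT path: shift pixels into [-128, 128), then add back 128 * sum(f). */
static int32_t sdot_with_correction(const uint8_t px[4], const int8_t f[4]) {
  int8_t shifted[4];
  int32_t correction = 0;
  int i;
  for (i = 0; i < 4; ++i) {
    shifted[i] = (int8_t)((int)px[i] - 128);
    correction += 128 * (int32_t)f[i];
  }
  return sdot_lane(correction, shifted, f);
}

/* Both paths must agree for any pixel and filter values. */
static void check_lane_paths_agree(const uint8_t px[4], const int8_t f[4]) {
  assert(usdot_lane(0, px, f) == sdot_with_correction(px, f));
}
```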

Change-Id: Ifbf467681dd53bb1d26e22359885e6edde3c5c72
2023-01-12 10:43:13 +00:00
Jonathan Wright f952068691 Implement horizontal convolutions using Neon USDOT instruction
Add additional AArch64 paths for vpx_convolve8_horiz_neon and
vpx_convolve8_avg_horiz_neon that use the Armv8.6-A USDOT (mixed-sign
dot-product) instruction. The USDOT instruction takes an 8-bit
unsigned operand vector and a signed 8-bit operand vector to produce
a signed 32-bit result. This is helpful because convolution filters
often have both positive and negative values, while the 8-bit pixel
channel data being filtered is all unsigned. As a result, the USDOT
convolution paths added here do not have to do the "transform the
pixel channel data to [-128, 128) and correct for it later" dance
that we have to do with the SDOT paths.

The USDOT instruction is optional from Armv8.2 to Armv8.5 but
mandatory from Armv8.6 onwards. The availability of the USDOT
instruction is indicated by the feature macro
__ARM_FEATURE_MATMUL_INT8. The SDOT paths are retained for use on
target CPUs that do not implement the USDOT instructions.

Change-Id: If19f5872c3453458a8cfb7c7d2be82a2c0eab46a
2023-01-11 12:18:45 +00:00
James Zern e067469e77 build: replace egrep with grep -E
avoids a warning on some platforms:
egrep: warning: egrep is obsolescent; using grep -E

Bug: webm:1786
Change-Id: Ia434297731303aacb0b02cf3dcbfd8e03936485d
Fixed: webm:1786
2023-01-10 13:49:15 -08:00
Jonathan Wright 708c4aa854 Use Neon load/store helper functions consistently
Define all Neon load/store helper functions in mem_neon.h and use
them consistently in Neon convolution functions.

Change-Id: I57905bc0a3574c77999cf4f4a73442c3420fa2be
2023-01-05 17:34:56 +00:00
Jonathan Wright ab1192c290 Use lane-referencing intrinsics in Neon convolution kernels
The Neon convolution helper functions take a pointer to a filter and
load the 8 values into a single Neon register. For some reason,
filter values 3 and 4 are then duplicated into their own separate
registers.

This patch modifies these helper functions so that they access filter
values 3 and 4 via the lane-referencing versions of the various Neon
multiply instructions. This reduces register pressure and tidies up
the source code quite a bit.

Change-Id: Ia4aeee8b46fe218658fb8577dc07ff04a9324b3e
2023-01-05 12:20:03 +00:00
Jerome Jiang 11151943b1 Remove references to deprecated NumPy type aliases
This change replaces references to a number of deprecated NumPy type
aliases (np.bool, np.int, np.float, np.complex, np.object, np.str)
with their recommended replacement
(bool, int, float, complex, object, str).

NumPy 1.24 drops the deprecated aliases
so we must remove uses before updating NumPy.

Change-Id: I9f5dfcbb11fe6534fce358054f210c7653f278c3
2022-12-21 11:17:04 -05:00
Scott LaVarnway e022d5b71f [x86]: Add vpx_highbd_comp_avg_pred_sse2().
C vs SSE2

4x4: 3.38x
8x8: 3.45x
16x16: 2.06x
32x32: 2.19x
64x64: 1.39x

Change-Id: I46638fe187b49a78fee554114fac51c485d74474
2022-12-20 15:59:20 -08:00
Scott LaVarnway 8838630016 Add vpx_highbd_comp_avg_pred_c() test.
Change-Id: I6b2c3379c49a62e56e5ac56fd4782a50b3c4e12a
2022-12-16 13:49:38 -08:00
Marco Paniconi 6eb4e9fcb3 Merge "rc-svc: Add tests for dynamic svc in external RC" into main 2022-12-14 17:08:21 +00:00
Marco Paniconi 55d3184503 rc-svc: Add tests for dynamic svc in external RC
Test to verify RC for going down and back up in
spatial layers. Going back up has an issue so added
a TODO.

Make the test more flexible to handle dynamic layers.
Test for dynamic change in temporal layers to follow.

Change-Id: Ic5542f7b274135277429e116f56ba54e682e96a0
2022-12-14 00:19:47 -08:00
Anton Venema 89b8032ff5 Add additional ARM targets for Visual Studio.
configure: Add an armv7-win32-vs16 target
configure: Add an armv7-win32-vs17 target
configure: Add an arm64-win64-vs16 target
configure: Add an arm64-win64-vs17 target

Change-Id: I11d6cd6e51f7703939d6fd3fc6a7469591e3b09d
2022-12-13 18:09:56 -08:00
Cheng Chen 58e69d6c6a Merge "L2E: Add a new interface to control rdmult" into main 2022-12-13 01:24:00 +00:00
Scott LaVarnway a7bb04b435 [x86]: Add vpx_highbd_subtract_block_avx2().
Up to 4x faster than "sse2 vectorized C".

Change-Id: Ie9b3c12a437c5cddf92c4d5349c4f659ca6b82ea
2022-12-08 12:04:53 -08:00
Scott LaVarnway 1450ec46e2 Add vpx highbd subtract test.
Change-Id: I069ae0fe22bfc82ad5083df85a7fdf9058a285eb
2022-12-07 15:56:55 -08:00
Cheng Chen 5887bd234e L2E: Add a new interface to control rdmult
Allow external model to control frame rdmult.

A function is called per frame to get the value of rdmult from
the external model.

The external rdmult will overwrite libvpx's default rdmult unless
a reserved value is selected.
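
The override rule can be sketched as below. The macro and function names
are hypothetical, and the reserved value of -1 is an assumption made for the
sketch rather than something stated in this message:

```c
#include <assert.h>

/* Hypothetical reserved value meaning "keep the encoder's own rdmult". */
#define SKETCH_DEFAULT_RDMULT (-1)

/* Picks the rdmult for a frame: the external model's value wins unless it
 * returned the reserved value. */
static int select_frame_rdmult(int encoder_rdmult, int external_rdmult) {
  assert(encoder_rdmult >= 0);
  if (external_rdmult == SKETCH_DEFAULT_RDMULT) return encoder_rdmult;
  return external_rdmult;
}
```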

A unit test is added to test when the default rdmult value is set.

Change-Id: I2f17a036c188de66dc00709beef4bf2ed86a919a
2022-12-07 14:13:06 -08:00
Marco Paniconi cbb780ab0b rc-rtc: Test for periodic key in SVC external RC
This test catches the fix merged in here:
https://chromium-review.googlesource.com/c/webm/libvpx/+/4022904

Change-Id: Ib68fbcba694b5d465a9faf3ca7d6880bfe8eabb3
2022-12-05 14:34:58 -08:00
Marco Paniconi 2a8a25cf44 rc-rtc: Remove frame_flags_ change in svc ratectrl rtc test
SVC test is only in CBR and the frame_flags are
set by the SVC pattern, so we shouldn't undo them
for svc mode.

Change-Id: I5ffa65dd58a7b47f287d124d9e71ba1dc7c5a549
2022-12-05 11:57:35 -08:00
Marco Paniconi d998bd8237 Merge "vp9/rate_ctrl_rtc: Improve get cyclic refresh data" into main 2022-11-18 04:16:26 +00:00
Hirokazu Honda 3fa698a6e8 vp9/rate_ctrl_rtc: Improve get cyclic refresh data
A client of the vp9 rate controller needs to know whether the
segmentation is enabled and the size of delta_q. It is also nicer to
know the size of map. This CL changes the interface to achieve these.

Bug: b:259487065
Test: Build

Change-Id: If05854530f97e1430a7b97788910f277ab673a87
2022-11-18 11:43:34 +09:00
Marco Paniconi 605350bd5b Merge "vp9-svc: Fixes to make SVC work with VBR" into main 2022-11-15 21:45:07 +00:00
Marco Paniconi 76e9bf7a18 vp9-svc: Fixes to make SVC work with VBR
Prior to this CL SVC with VBR mode was broken.
Fixes made here to make VBR rate control work for SVC.
Rename is_one_pass_cbr_svc() --> is_one_pass_svc(),
as it can be used now for both CBR and VBR.

Added rate targeting unittest for (2SL, 3TL).

Bug: chromium:1375111
Change-Id: I5a62ffe7fbea29dc5949c88a284768386b1907a9
2022-11-15 11:48:21 -08:00
James Zern c1406fc267 Merge "[NEON] Optimize FHT functions, add highbd FHT 4x4" into main 2022-11-15 19:19:43 +00:00
Johann aeb6ae7393 quantize: remove vp9_regular_quantize_b_4x4
This was just a helper function which called vpx_quantize_b or
vpx_highbd_quantize_b. It also checked for skip_block, which was
necessary when webm:1439 was filed but does not appear to be
necessary now.

Removes a quantize variant and makes subsequent cleanups easier.

Change-Id: Ibe545eccd19370f07ff26c8e151f290c642efd2a
2022-11-14 17:59:45 +09:00
Konstantinos Margaritis f951514a40 [NEON] Optimize FHT functions, add highbd FHT 4x4
Refactor & optimize FHT functions further, use new butterfly functions
4x4 5% faster, 8x8 & 16x16 10% faster than previous versions.
Highbd 4x4 FHT version 2.27x faster than C version for --rt.

Change-Id: I3ebcd26010f6c5c067026aa9353cde46669c5d94
2022-11-11 13:53:54 +00:00
Marco Paniconi 78ac7af95c vp9-rc: Fix key frame setting in external RC
Bug: b/257368998

Change-Id: I03e35915ac99b50cb6bdf7bce8b8f9ec5aef75b7
2022-11-10 22:10:07 -08:00
James Zern fb2d1616f6 Merge "Add Neon implementation of vpx_hadamard_32x32" into main 2022-11-07 21:48:50 +00:00
Sam James 62dee8012e build: fix -Wimplicit-int (Clang 16)
Clang 16 will make -Wimplicit-int error by default which can, in addition to
other things, lead to some configure tests silently failing/returning the wrong result.

Fixes this error:
```
+/var/tmp/portage/media-libs/libvpx-1.12.0/temp/vpx-conf-1802-30624.c:1:15: error: type specifier missing, defaults to 'int'; ISO C99 and later do not support implicit int [-Wimplicit-int]
```

For more information, see LWN.net [0] or LLVM's Discourse [1], gentoo-dev@ [2],
or the (new) c-std-porting mailing list [3].

[0] https://lwn.net/Articles/913505/
[1] https://discourse.llvm.org/t/configure-script-breakage-with-the-new-werror-implicit-function-declaration/65213
[2] https://archives.gentoo.org/gentoo-dev/message/dd9f2d3082b8b6f8dfbccb0639e6e240
[3] hosted at lists.linux.dev.

Bug: https://bugs.gentoo.org/879705
Change-Id: Id73a98944ab3c99a368b9da7a5e902ddff9d937f
Signed-off-by: Sam James <sam@gentoo.org>
2022-11-06 04:17:20 +00:00
Andrew Salkeld 5d26626e7a Add Neon implementation of vpx_hadamard_32x32
Add an Arm Neon implementation of vpx_hadamard_32x32 and use it
instead of the scalar C implementation.

Also add test coverage for the new Neon implementation.

Change-Id: Iccc018eec4dbbe629fb0c6f8ad6ea8554e7a0b13
2022-11-04 23:05:36 +00:00
Konstantinos Margaritis 3f08aa0d0b [NEON] Optimize highbd 32x32 DCT
For --best quality, resulting function
vpx_highbd_fdct32x32_rd_neon takes 0.27% of cpu time in
profiling, vs 6.27% for the sum of scalar functions:
vpx_fdct32, vpx_fdct32.constprop.0, vpx_fdct32x32_rd_c for rd.
For --rt quality, the function takes 0.19% vs 4.57% for the scalar
version.
Overall, this improves highbd encoding time by ~6% for --best
and ~9% for --rt.

Change-Id: I1ce4bbef6e364bbadc76264056aa3f86b1a8edc5
2022-11-03 17:55:13 +00:00
James Zern f02a119100 Merge "[NEON] Optimize and homogenize Butterfly DCT functions" into main 2022-11-02 02:21:18 +00:00
Konstantinos Margaritis 3121783fec [NEON] Optimize and homogenize Butterfly DCT functions
Provide a set of commonly used Butterfly DCT functions for use in
DCT 4x4, 8x8, 16x16, 32x32 functions. These are provided in various
forms, using vqrdmulh_s16/vqrdmulh_s32 for _fast variants, which
unfortunately are only usable in pass1 of most DCTs, as they do not
provide the necessary precision in pass2.
This gave a performance gain ranging from 5% to 15% in the 16x16 case.
Also, for 32x32, the loads were rearranged; along with the butterfly
optimizations, this gave a 10% gain in the 32x32_rd function.
This refactoring was necessary to allow easier porting of the highbd
32x32 functions, which follow this patchset.

Change-Id: I6282e640b95a95938faff76c3b2bace3dc298bc3
2022-11-01 23:07:27 +00:00
Johann Koenig ddca3dec36 Merge "MacOS 13 is darwin22" into main 2022-10-27 08:38:48 +00:00
Johann Koenig 17fbc6cfa1 Merge "rtcd: allow disabling neon on armv8" into main 2022-10-27 08:38:18 +00:00
Johann ebf22e2e8d MacOS 13 is darwin22
Bug: webm:1783
Change-Id: I97d94ab8c8aebe13aedb58e280dc37474814ad5d
2022-10-27 05:14:35 +00:00
Johann 9e1bdd12c7 rtcd: allow disabling neon on armv8
Change-Id: Idef943775456eb95b46be5c92c114c1d215f38d7
2022-10-27 09:19:45 +09:00
Johann 4b659f3c34 mailmap: add johann@duck.com
Change-Id: I3b48951e69ba1f4a9fafdbb81fac48f79587a342
2022-10-26 17:14:21 +09:00
James Zern dcb566e69f Merge changes I36545ff4,Id1aa29da into main
* changes:
  vp9_highbd_quantize_fp*_neon: normalize fn param name
  highbd_sad_avx2: normalize function param names
2022-10-25 19:16:46 +00:00
James Zern 2b99ea1b43 Merge "SAD*Test: mark virtual Run() as overridden" into main 2022-10-25 19:16:08 +00:00
Johann Koenig fc6d0d74ba Merge "quantize: consolidate sse2 conditionals" into main 2022-10-25 13:26:37 +00:00
Johann Koenig b14bb4c87a Merge "vp9 quantize: rewrite ssse3 in intrinsics" into main 2022-10-25 13:26:22 +00:00
James Zern ee12bc390d SAD*Test: mark virtual Run() as overridden
this comes from AbstractBench

Change-Id: Ie0b5a26a68bfbffd80f132125d15a1bdfc990c22
2022-10-24 15:37:26 -07:00
James Zern d667193e6a vp9_highbd_quantize_fp*_neon: normalize fn param name
count -> n_coeffs. aligns the name with the rtcd header; clears a
clang-tidy warning

Change-Id: I36545ff479df92b117c95e494f16002e6990f433
2022-10-24 15:32:10 -07:00
James Zern 228d8a4fed highbd_sad_avx2: normalize function param names
(src|ref)8_ptr -> (src|ref)_ptr. aligns the names with the rtcd header;
clears some clang-tidy warnings

Change-Id: Id1aa29da8c0fa5860b46ac902f5b2620c0d3ff54
2022-10-24 15:30:10 -07:00
Marco Paniconi 5245f6e9cb Fix to VP8 external RC for buffer levels
On a dynamic change of temporal layers:
starting/maximum/optimal were being set twice,
causing incorrect large values.

Bug: b/253927937
Change-Id: I204e885cff92530336a9ed9a4363c486c5bf80ae
2022-10-18 00:12:20 -07:00
Johann e8fc52ada4 quantize: consolidate sse2 conditionals
Change-Id: I43de579e30f2967b97064063e29676e0af1a498f
2022-10-17 16:22:23 +09:00
Johann 828d05d4a4 vp9 quantize: rewrite ssse3 in intrinsics
Change-Id: I3177251a5935453a23a23c39ea5f6fd41254775e
2022-10-17 12:23:24 +09:00
Marco Paniconi 79b718abdb Merge "Fix to VP8 external RC for dynamic update of layers" into main 2022-10-15 01:56:46 +00:00
Marco Paniconi 4007a057fc Fix to VP8 external RC for dynamic update of layers
On change/update of rc_cfg: when the number of temporal
layers changes, call vp8_reset_temporal_layer_change(),
which in turn will call vp8_init_temporal_layer_context()
only for the new layers.

Bug: b/249644737

Change-Id: Ib20d746c7eacd10b78806ca6a5362c750d9ca0b3
2022-10-14 12:21:10 -07:00
Konstantinos Margaritis 124e57be95 [NEON] fix clang compile warnings
Change-Id: Ib7ce7a774ec89ba51169ea64d24c878109ef07d1
2022-10-13 16:49:15 +00:00
Scott LaVarnway 06a9d0e5dc Merge "Add vpx_highbd_sad64x{64,32}_avg_avx2." into main 2022-10-13 11:31:51 +00:00
Konstantinos Margaritis 45b280eb0f [NEON] Add highbd FDCT 16x16 function
90-95% faster than C version in best/rt profiles

Change-Id: I41d5e9acdc348b57153637ec736498a25ed84c25
2022-10-12 21:10:46 +00:00
James Zern e36c0a9495 Merge "[NEON] Add highbd FDCT 8x8 function" into main 2022-10-12 20:07:51 +00:00
Scott LaVarnway 5145c09c94 Merge "Add vpx_highbd_sad32x{64,32,16}_avg_avx2." into main 2022-10-12 19:50:55 +00:00
Scott LaVarnway 8e2ba7750e Merge "Add vpx_highbd_sad16x{32,16,8}_avg_avx2." into main 2022-10-12 19:44:44 +00:00
Konstantinos Margaritis a49f896352 [NEON] Add highbd FDCT 8x8 function
50% faster than C version in best/rt profiles

Change-Id: I0f9504ed52b5d5f7722407e91108ed4056d66bc2
2022-10-12 18:59:52 +00:00
Scott LaVarnway 7142689f00 Add vpx_highbd_sad64x{64,32}_avg_avx2.
~2.8x faster than the sse2 version.

Bug: b/245917257

Change-Id: Ib727ba8a8c8fa4df450bafdde30ed99fd283f06d
2022-10-12 11:43:39 -07:00
Konstantinos Margaritis 165935a1b6 [NEON] Add highbd FDCT 4x4 function
~80% faster than C version for both best/rt profiles.

Change-Id: Ibb3c8e1862131d2a020922420d53c66b31d5c2c3
2022-10-12 17:43:33 +00:00
Scott LaVarnway 50d5093a4f Add vpx_highbd_sad32x{64,32,16}_avg_avx2.
2.1x to 2.8x faster than the sse2 version.

Bug: b/245917257

Change-Id: I1aaffa4a1debbe5559784e854b8fc6fba07e5000
2022-10-12 07:00:31 -07:00
Scott LaVarnway 85484d5960 Add vpx_highbd_sad16x{32,16,8}_avg_avx2.
1.6x to 2.1x faster than the sse2 version.

Bug: b/245917257

Change-Id: I56c467a850297ae3abcca4b4843302bb8d5d0ac1
2022-10-12 03:28:53 -07:00
Konstantinos Margaritis f538a02244 [NEON] Move helper functions for reuse
Move all butterfly functions to fdct_neon.h
Slightly optimize load/scale/cross functions
in fdct 16x16.
These will be reused in highbd variants.

Change-Id: I28b6e0cc240304bab6b94d9c3f33cca77b8cb073
2022-10-12 08:11:53 +00:00
Scott LaVarnway 5c9d20cf44 Merge "SADavgTest: Add speed test." into main 2022-10-10 20:34:02 +00:00
Scott LaVarnway af274914f2 SADavgTest: Add speed test.
Change-Id: Ie14c0f6d15f410adf749f7ab74cf9f2bf35f3d5f
2022-10-10 12:20:37 -07:00
Konstantinos Margaritis 6f8537c4c8 [NEON] move transpose_8x8 to reuse
Change-Id: I3915b6c9971aedaac9c23f21fdb88bc271216208
2022-10-10 18:43:27 +00:00
James Zern 46bd6574aa Merge "[NEON] highbd partial DCT functions" into main 2022-10-10 18:37:05 +00:00
Konstantinos Margaritis 2d87b886a3 [NEON] highbd partial DCT functions
Change-Id: I7dd4e698469562f5b1f948cc36f8403b490dcb6a
2022-10-10 11:47:39 +00:00
Scott LaVarnway 06b09ebd35 Add vpx_highbd_sad64x{64,32}_avx2.
~2.8x faster than the sse2 version.

Bug: b/245917257

Change-Id: Ibc8e5d030ec145c9a9b742fff98fbd9131c9ede4
2022-10-07 09:47:01 -07:00
Johann Koenig 4cca8b1c8c Merge "vp9 quantize: change index" into main 2022-10-07 08:17:03 +00:00
Scott LaVarnway 4955b945d8 Add vpx_highbd_sad32x{64,32,16}_avx2.
2.7x to 3.1x faster than the sse2 version.

Bug: b/245917257

Change-Id: Idff3284932f7ee89d036f38893205bf622a159a3
2022-10-06 05:33:40 -07:00
Scott LaVarnway c03c882785 Add vpx_highbd_sad16x{32,16,8}_avx2.
1.9x to 2.4x faster than the sse2 version.

Bug: b/245917257

Change-Id: I686452772f9b72233930de2207af36a0cd72e0bb
2022-10-05 10:04:30 -07:00
Cheng Chen dca6dcef0a Merge "L2E: Rework recode decisions for external max frame size and q" into main 2022-10-04 16:15:49 +00:00
Johann eeea3daacb vp9 quantize: change index
In assembly it made sense to iterate using n_coeffs.
In intrinsics it's just as fast to use index and
easier to read.

Change-Id: I403c959709309dad68123d0a3d0efe183874543d
2022-10-01 11:50:46 +09:00
Scott LaVarnway 87c7da21c2 vpx_subpixel_8t_intrin_avx2.c: quiet -Wuninitialized
warning: ‘s2[3]’ may be used uninitialized
and
warning: ‘s1[3]’ may be used uninitialized

The warnings exposed unused code.

Change-Id: I75cf1f9db75e811cb42e2f143be1ad76f3e4dee9
2022-09-30 07:32:48 -07:00
Scott LaVarnway 381a8c9e01 Merge "vp9_rd.c quiet -Wstringop-overflow" into main 2022-09-26 23:18:04 +00:00
Johann f74ce37a3a quantize: standardize vp9_quantize_fp_sse2
Match style for vpx_quantize_b_sse2 and prepare to rewrite
ssse3 version in intrinsics.

Need to evaluate the value of threshold breakout before
going further.

Change-Id: I9cfceb1bb0dc237cd6b73fc8d41d78bba444a15b
2022-09-26 22:10:35 +00:00
Scott LaVarnway a1ba7188a8 vp9_rd.c quiet -Wstringop-overflow
../libvpx/vp9/encoder/vp9_rd.c:594:20: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
  594 |         t_above[i] = !!*(const uint32_t *)&above[i];
      |         ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../libvpx/vp9/encoder/vp9_rd.c:572:47: note: at offset [64, 254] into destination object ‘t_above’ of size [0, 16]
  572 |                               ENTROPY_CONTEXT t_above[16],
      |                               ~~~~~~~~~~~~~~~~^~~~~~~~~~~

Change-Id: Ie9ef24e685af417cdd35f6aa7284805e422b6ae2
2022-09-26 14:24:40 -07:00
Johann 00608eb1de quantize: add untested function
vp9_quantize_fp_sse2 was only tested in non-hbd
configuration. Missed when fixing this for
vpx_quantize_b_sse2.

Change-Id: Ide346e5727d74281c774f605c90d280050e0bf62
2022-09-24 10:56:17 +09:00
Johann c8874f74a7 quantize: increase iscan by 1
All of the assembly adds 1 to iscan to convert from
a 0 based array to the EOB value.

Add 1 to all iscan values and remove the extra
instructions from the assembly.

Change-Id: I219dd7f2bd10533ab24b206289565703176dc5e9
2022-09-23 21:17:34 +09:00
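The iscan change above can be illustrated with a small C sketch. It contrasts an end-of-block (EOB) computation over a 0-based inverse scan table with one over a table whose entries are already incremented by 1. All function and array names here are hypothetical and not taken from libvpx; the actual commit changed the scan tables and the assembly.

```c
#include <stdint.h>

/* Hypothetical: EOB from a 0-based inverse scan table.
 * Every nonzero coefficient needs an extra "+ 1" to turn its
 * scan index into a count. */
static int eob_from_zero_based(const int16_t *qcoeff, const int16_t *iscan,
                               int n) {
  int eob = 0;
  for (int i = 0; i < n; ++i) {
    const int pos = iscan[i] + 1;
    if (qcoeff[i] != 0 && pos > eob) eob = pos;
  }
  return eob;
}

/* Hypothetical: same result when the table already stores index + 1,
 * so the per-coefficient increment disappears. */
static int eob_from_one_based(const int16_t *qcoeff,
                              const int16_t *iscan_plus1, int n) {
  int eob = 0;
  for (int i = 0; i < n; ++i) {
    if (qcoeff[i] != 0 && iscan_plus1[i] > eob) eob = iscan_plus1[i];
  }
  return eob;
}

/* Returns 1 when both variants agree on a small made-up block. */
static int eob_variants_agree(void) {
  const int16_t qcoeff[4] = { 7, 0, -3, 0 };
  const int16_t iscan[4] = { 0, 2, 1, 3 };
  const int16_t iscan_plus1[4] = { 1, 3, 2, 4 };
  return eob_from_zero_based(qcoeff, iscan, 4) == 2 &&
         eob_from_one_based(qcoeff, iscan_plus1, 4) == 2;
}
```

In this sketch the last nonzero coefficient sits at scan position 1, so both variants report an EOB of 2.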
Scott LaVarnway 5820823ef8 Merge "resize_test.cc: quiet -Wmaybe-uninitialized" into main 2022-09-21 23:41:42 +00:00
Scott LaVarnway 8b0c92ebdf resize_test.cc: quiet -Wmaybe-uninitialized
warning: ‘expected_w’ may be used uninitialized
Change-Id: I915efd82d3263250cea90391345f7683c1330fc8
2022-09-21 14:16:56 -07:00
Scott LaVarnway 729739b36b Merge "post_proc_sse2.c: quiet -Wuninitialized" into main 2022-09-21 20:53:07 +00:00
Scott LaVarnway f6939699b6 post_proc_sse2.c: quiet -Wuninitialized
In file included from ../libvpx/vpx_dsp/x86/post_proc_sse2.c:12:
In function ‘_mm_add_epi16’,
    inlined from ‘vpx_mbpost_proc_down_sse2’ at ../libvpx/vpx_dsp/x86/post_proc_sse2.c:88:13:
/usr/lib/gcc/x86_64-linux-gnu/12/include/emmintrin.h:1060:35: warning: ‘below_context’ may be used uninitialized [-Wmaybe-uninitialized]
 1060 |   return (__m128i) ((__v8hu)__A + (__v8hu)__B);
      |                                   ^~~~~~~~~~~
../libvpx/vpx_dsp/x86/post_proc_sse2.c: In function ‘vpx_mbpost_proc_down_sse2’:
../libvpx/vpx_dsp/x86/post_proc_sse2.c:39:13: note: ‘below_context’ was declared here
   39 |     __m128i below_context;

Change-Id: I2fc592f121c4e85d0aff1640014c3444f5eb09fd
2022-09-21 11:37:04 -07:00
James Zern 417adb3aca Merge "CHECK_MEM_ERROR: add an assert for a valid jmp target" into main 2022-09-20 23:24:44 +00:00
Johann Koenig d678c1a170 Merge "quantize: test lowbd in highbd builds" into main 2022-09-20 00:12:13 +00:00
Johann 884837a580 quantize: test lowbd in highbd builds
Change-Id: I7af273e979415a8b8cafb7494728d2736862f4a5
2022-09-18 10:26:00 +09:00
Johann 3cd417b6d2 fwd_txfm: remove avx2 file from non-hbd
Resolves warning on OS X:
file: libvpx_g.a(fwd_txfm_avx2.c.o) has no symbols

Change-Id: Ie8b290bb3ed329656beb883d552c98353f1ed5e5
2022-09-17 07:54:40 +09:00
Cheng Chen 7ed6b47c60 L2E: Rework recode decisions for external max frame size and q
Allow external q and external max frame size to be handled separately.
Rely on libvpx's decision to catch overshoot/undershoot and recode frames.

Previously, when external max frame size is set, we didn't handle
undershoot cases, and now we fall back to libvpx's decision to
recode a frame if overshoot/undershoot is seen.

Change-Id: Ic3eee042cfe104b528c5f2c6c82b98dd5d8fa8ca
2022-09-14 14:42:07 -07:00
Scott LaVarnway 34284e930a Add vpx_highbd_sad64x{64,32}x4d_avx2.
~2x faster than the sse2 version.

Bug: b/245917257

Change-Id: I4742950ab7b90d7f09e8d4687e1e967138acee39
2022-09-14 14:08:16 -07:00
Scott LaVarnway b39722f851 Add vpx_highbd_sad32x{64,32,16}x4d_avx2.
~2.4x faster than the sse2 version.

Bug: b/245917257

Change-Id: I6df2bd62b46e5e175c8ad80daa6de3a1c313db0f
2022-09-13 04:24:43 -07:00
James Zern fd615b4348 CHECK_MEM_ERROR: add an assert for a valid jmp target
callers of CHECK_MEM_ERROR() expect failures to not return

tested with:
configure --enable-debug --enable-vp9-postproc --enable-postproc \
  --enable-multi-res-encoding --enable-vp9-temporal-denoising \
  --enable-error-concealment

--enable-internal-stats has unrelated assertion failures currently

Change-Id: Ic12073b1ae80a6f434f14d24f652e64d30f63eea
2022-09-12 19:00:47 -07:00
Scott LaVarnway 6f1fa67010 Merge "Add vpx_highbd_sad16x{32,16,8}x4d_avx2." into main 2022-09-12 12:18:19 +00:00
Wan-Teh Chang 33c43c14ee Update third_party/googletest to v1.12.1
See https://github.com/google/googletest/releases/tag/release-1.12.1.

Modeled after https://aomedia-review.googlesource.com/c/aom/+/162601.

Change-Id: If0ced3097b4c8490985e3381aaac9b3266d52ae7
2022-09-09 14:38:15 -07:00
Scott LaVarnway 0d734728f6 Add vpx_highbd_sad16x{32,16,8}x4d_avx2.
1.98x to 2.3x faster than the sse2 version.

Bug: b/245917257

Change-Id: Ie4f9bb942ffaf4af7d395fb5a5978b41aabfc93c
2022-09-09 09:47:50 -07:00
James Zern a46ca4b6bd vp8_decode: declare 2 variables volatile
fixes -Wclobbered warnings with gcc 12.1.0:
vp8/vp8_dx_iface.c|278 col 16| warning: variable 'w' might be clobbered
by 'longjmp' or 'vfork' [-Wclobbered]
vp8/vp8_dx_iface.c|278 col 19| warning: variable 'h' might be clobbered
by 'longjmp' or 'vfork' [-Wclobbered]

Change-Id: Ib2c606a3450188d7869c066cacaf5615d9746181
2022-09-07 18:43:22 -07:00
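The -Wclobbered fix above relies on a rule of standard C: a non-volatile local that is modified between setjmp() and longjmp() has an indeterminate value after the jump, while a volatile one keeps its last stored value. A minimal sketch, assuming a hypothetical error path (none of these names come from vp8_dx_iface.c):

```c
#include <setjmp.h>

static jmp_buf demo_env; /* hypothetical jump target */

/* Hypothetical failure path that never returns to its caller. */
static void demo_fail(void) { longjmp(demo_env, 1); }

/* Returns the width that was stored before the simulated failure. */
static int demo_width_after_error(void) {
  /* volatile: the value written below is still valid after longjmp(). */
  volatile int w = 0;
  if (setjmp(demo_env)) {
    return w; /* reached via longjmp(); w holds 640 */
  }
  w = 640;
  demo_fail();
  return -1; /* not reached */
}
```

Without the volatile qualifier the compiler may keep w in a register that longjmp() restores to its older contents, which is the situation the warning describes.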
James Zern a3abfc9874 Merge "x86,cosmetics: prefer _mm_setzero_si128/_mm256_setzero_si256" into main 2022-09-06 22:23:30 +00:00
James Zern a7527a26e8 sad_neon: enable UDOT implementation w/aarch32
Change-Id: Ia28305ec5c61518b732cbacbd102acd2cb7f9d82
2022-09-02 18:34:59 -07:00
James Zern b3317970e7 variance_neon.cc: simplify __ARM_FEATURE_DOTPROD check
missed in
447e27588 vpx_dsp,neon: simplify __ARM_FEATURE_DOTPROD check

+ fix #if comments

only check that the macro is defined, the value doesn't have any effect.

from https://arm-software.github.io/acle/main/acle.html:

5.5.7.7.  Dot Product extension
  __ARM_FEATURE_DOTPROD is defined if the dot product data manipulation
  instructions are supported and the vector intrinsics are available.
  Note that this implies:
    - __ARM_NEON == 1

Change-Id: I098b96421b7de5928bb3b11612ca1f32e7b6cbc4
2022-09-02 16:44:14 -07:00
James Zern 2faa4bfc5c x86,cosmetics: prefer _mm_setzero_si128/_mm256_setzero_si256
over *_set1_*(0)

Change-Id: I136e1798a2ce286480ebb9418db67a2f1e92b9a2
2022-09-02 16:17:52 -07:00
James Zern 447e275880 vpx_dsp,neon: simplify __ARM_FEATURE_DOTPROD check
only check that the macro is defined, the value doesn't have any effect.

from https://arm-software.github.io/acle/main/acle.html:

5.5.7.7.  Dot Product extension
  __ARM_FEATURE_DOTPROD is defined if the dot product data manipulation
  instructions are supported and the vector intrinsics are available.
  Note that this implies:
    - __ARM_NEON == 1

Change-Id: I164fe121ccefda99050a9b6a99738a2b518520f3
2022-09-02 12:21:17 -07:00
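The simplified check described above amounts to testing only whether the macro is defined. A minimal sketch, with a hypothetical helper name that is not part of libvpx:

```c
/* Hypothetical helper: reports whether the dot product path may be used.
 * Per the ACLE text quoted above, only the definedness of
 * __ARM_FEATURE_DOTPROD matters, so its value is not compared. */
static int demo_have_dotprod(void) {
#if defined(__ARM_FEATURE_DOTPROD)
  return 1; /* dot product instructions and intrinsics are available */
#else
  return 0; /* fall back to another implementation */
#endif
}
```

On non-Arm targets, or Arm targets built without the dot product extension, the sketch returns 0.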
Matt Oliver 4df285b2c6 project: Add TargetPlatformMinVersion. 2022-09-03 00:53:21 +10:00
James Zern 281dfae835 neon,load_unaligned_*: use dup for lane 0
this produces better assembly with gcc (11.3.0-3); no change in assembly
using clang from the r24 android sdk (Android (8075178, based on
r437112b) clang version 14.0.1
(https://android.googlesource.com/toolchain/llvm-project
8671348b81b95fc603505dfc881b45103bee1731))

Change-Id: Ifec252d4f499f23be1cd94aa8516caf6b3fbbc11
2022-09-01 18:47:50 -07:00
James Zern 028fc1b50f test/*,cosmetics: normalize void parameter lists
replace (void) with (); use of this synonym is more common in C++ code.

Change-Id: I9813e82234dc9caa7115918a0491b0040f6afaf4
2022-08-31 16:35:08 -07:00
Yaowu Xu 9d6d0624d7 Remove const for pass-by-value parameters
This also fixes MSVC compiler warnings.

Change-Id: I20dc9ac821275ba95598f3016fc6b23e884e13b7
2022-08-30 09:09:18 -07:00
Cheng Chen ac76d3ccda Merge "L2E: Add gop size and ARF existence to frame info" into main 2022-08-30 04:25:33 +00:00
James Zern 27fd546079 highbd_variance_neon,cosmetics: reorder a few lines
Change-Id: Ia6fa54652d7f94687e64108482bb0f28ca06cf49
2022-08-26 22:12:44 -07:00
Cheng Chen fd45d11380 L2E: Add gop size and ARF existence to frame info
Pass the encode frame info to external ml model, with the information
of gop size and whether alt ref is used.

Change-Id: I55be2d3de83d7182c1a1a174e44ead7e19045c9d
2022-08-26 14:32:17 -07:00
James Zern 4bfab03e81 Merge "[NEON] Add highbd *variance* functions" into main 2022-08-26 02:07:34 +00:00
James Zern df1979e245 Merge "vpx_encoder.h: note VPX_ERROR_RESILIENT_PARTITIONS is VP8-only" into main 2022-08-26 02:01:55 +00:00
Konstantinos Margaritis 13970b7eca [NEON] Add highbd *variance* functions
Total gain for 12-bit encoding:
        * ~7.2% for best profile
        * ~5.8% for rt profile

Change-Id: I5b70415fb89d1bbb02a0c139eb317ba6b08adede
2022-08-25 21:58:34 +00:00
James Zern b0b66556dc Merge "vp9: fix ubsan sub-overflows" into main 2022-08-25 20:44:41 +00:00
James Zern 722d4daf35 vpx_encoder.h: note VPX_ERROR_RESILIENT_PARTITIONS is VP8-only
Change-Id: If71b2ec766f9f41253ce5a34987ffd208f9c8381
2022-08-25 10:50:16 -07:00
James Zern 4d24a5bca9 Merge "vp8_ratectrl_rtc_test.cc: ensure frame_type is initialized" into main 2022-08-25 16:55:42 +00:00
James Zern 7663fcb467 libs.doxy_template: remove obsolete CLASS_DIAGRAMS
This was reported with doxygen 1.9.4.

Also update the comment for CLASS_GRAPH by running "doxygen -u" because
the original comment for CLASS_GRAPH mentions the obsolete tag
'CLASS_DIAGRAMS'.

Change-Id: I3bca547201f794d363bd814b7c7f7c9d7088797a
2022-08-24 18:52:10 -07:00
James Zern 2c7657202e vp8_ratectrl_rtc_test.cc: ensure frame_type is initialized
this fixes a valgrind failure:
==1095597== Conditional jump or move depends on uninitialised value(s)
==1095597==    at 0x12E0CC: (anonymous
namespace)::Vp8RcInterfaceTest::PreEncodeFrameHook(libvpx_test::VideoSource*,
libvpx_test::Encoder*) (vp8_ratectrl_rtc_test.cc:131)
==1095597==    by 0x1255A9:
libvpx_test::EncoderTest::RunLoop(libvpx_test::VideoSource*)
(encode_test_driver.cc:205)

Bug: webm:1776
Change-Id: Id3b40f62573ee513e79c74b6315c71b6ecd22c9a
Fixed: webm:1776
2022-08-24 15:57:02 -07:00
James Zern cef289f4cd Merge "[NEON] Improve vpx_quantize_b* functions" into main 2022-08-24 19:18:25 +00:00
clang-format a3c9b9126d .clang-format: update to clang-format-11
only store the deltas from --style Google in the file and reapply using
Debian clang-format version 11.1.0-6+build1

Bug: b/229626362
Change-Id: I3e18a2e7c17a90a48405b3cf1b37ebc652aba0db
2022-08-23 15:39:57 -07:00
Konstantinos Margaritis daae445b2a [NEON] Improve vpx_quantize_b* functions
Slight optimization; prefetch gives a 1% improvement in the first pass.

Change-Id: Iba4664964664234666406ab53893e02d481fbe61
2022-08-23 10:29:01 +00:00
James Zern a689fe68a3 vp9_ratectrl_rtc_test: initialize loopfilter_ctrl[]
this was added in:
  7beafefd1 vp9: Allow for disabling loopfilter per spatial layer
but the test doesn't zero initialize its svc_params_ member.

fixes the use of an uninitialized value, reported by valgrind and
integer sanitizer:
[ RUN      ] VP9/RcInterfaceSvcTest.Svc/0
==1064682== Conditional jump or move depends on uninitialised value(s)
==1064682==    at 0x1C5624: loopfilter_frame (vp9_encoder.c:3285)
==1064682==    by 0x1C9B54: encode_frame_to_data_rate (vp9_encoder.c:5595)
==1064682==    by 0x1CA2EE: SvcEncode (vp9_encoder.c:5789)
==1064682==    by 0x1CEA01: vp9_get_compressed_data (vp9_encoder.c:7891)
==1064682==    by 0x185F0E: encoder_encode (vp9_cx_iface.c:1437)
==1064682==    by 0x1503BB: vpx_codec_encode (vpx_encoder.c:208)

vp9/encoder/vp9_svc_layercontext.c:362:26: runtime error: implicit
conversion from type 'int' of value -1 (32-bit, signed) to type
'LOOPFILTER_CONTROL' changed the value to 4294967295 (32-bit, unsigned)
    #0 0x558925f45377 in vp9_restore_layer_context vp9/encoder/vp9_svc_layercontext.c:362:26
    #1 0x558925ef89fd in vp9_get_compressed_data vp9/encoder/vp9_encoder.c:7781:5
    #2 0x558925e3ef3e in encoder_encode vp9/vp9_cx_iface.c:1437:20

Bug: b/229626362
Change-Id: I33d244be7752c68b71efa9c62ca45d6b202ec761
2022-08-22 19:33:26 -07:00
James Zern f88dae639c Merge "vp9.read_inter_block_mode_info: return on corruption" into main 2022-08-22 22:36:09 +00:00
James Zern 7ab8fa2ac0 Merge "highbd_quantize_neon.c: remove unneeded assert.h" into main 2022-08-22 22:21:46 +00:00
James Zern 431ab4e626 Merge "vp9,search_new_mv: descale rather than scale sse" into main 2022-08-22 22:21:28 +00:00
James Zern 249d93e147 Merge changes Iabed118b,I60a384b2 into main
* changes:
  use VPX_NO_UNSIGNED_SHIFT_CHECK with entropy functions
  compiler_attributes.h: add VPX_NO_UNSIGNED_SHIFT_CHECK
2022-08-22 22:21:00 +00:00
Konstantinos Margaritis a6d95698fe [NEON] Add vpx_highbd_subtract_block function
Total gain for 12-bit encoding:
    * ~1% for best and rt profile

Change-Id: I4039120dc570baab1ae519a5e38b1acff38d81f0
2022-08-22 19:54:43 +00:00
Konstantinos Margaritis d050161f0d [NEON] Added vpx_highbd_sad* functions
Total gain for 12-bit encoding:
    * ~7.8% for best profile
    * ~10% for rt profile

Change-Id: I89eda5c4372a5b628c9df84cdeb4c8486fc44789
2022-08-22 18:09:35 +00:00
James Zern a8980078a1 highbd_quantize_neon.c: remove unneeded assert.h
Change-Id: I041f5fb23b856a2b519669b5bf8a40d3772b4a6e
2022-08-22 10:48:40 -07:00
James Zern b652b1da51 Merge "[NEON] Added vpx_highbd_quantize_b* functions" into main 2022-08-22 17:45:52 +00:00
Scott LaVarnway 0154634923 Merge "Fix TEST_P(SADx4Test, DISABLED_Speed)" into main 2022-08-22 10:36:09 +00:00
Konstantinos Margaritis ebf4caa857 [NEON] Added vpx_highbd_quantize_b* functions
Total gain for 12-bit encoding:
    * ~4.8% for best profile
    * ~6.2% for rt profile

Change-Id: I61e646ab7aedf06a25db1365d6d1cf7b05101c21
2022-08-20 19:37:58 +00:00
James Zern f66f2b9312 Merge "loopfilter.c: normalize flat func param type" into main 2022-08-20 00:00:06 +00:00
James Zern 595bf7022a vp9.read_inter_block_mode_info: return on corruption
with block sizes < 8x8 previously only the inner loop was aborted. this
could cause propagation of invalid motion vectors to scale_mv().

this quiets integer sanitizer warnings of the form:
vp9/common/vp9_mvref_common.h:239:18: runtime error: implicit conversion
from type 'int' of value 32768 (32-bit, signed) to type 'int16_t' (aka
'short') changed the value to -32768 (16-bit, signed)

Bug: b/229626362
Change-Id: I58b5a425adf21542cbf4cc4dd5ab3cc5ed008264
2022-08-19 09:58:39 -07:00
James Zern b55ef982b0 use VPX_NO_UNSIGNED_SHIFT_CHECK with entropy functions
these shift values off the most significant bit as part of the process;
vp8_regular_quantize_b_sse4_1 is included here for a special case of
mask creation

quiets warnings of the form:
vp8/decoder/dboolhuff.h:81:11: runtime error: left shift of
2373679303235599696 by 3 places cannot be represented in type
'VP8_BD_VALUE' (aka 'unsigned long')

vp8/encoder/bitstream.c:257:18: runtime error: left shift of 2147493041
by 1 places cannot be represented in type 'unsigned int'

vp8/encoder/x86/quantize_sse4.c:114:18: runtime error: left shift of
4294967294 by 1 places cannot be represented in type 'unsigned int'

vp9/encoder/vp9_pickmode.c:1632:41: runtime error: left shift of
4294967295 by 1 places cannot be represented in type 'unsigned int'

Bug: b/229626362
Change-Id: Iabed118b2a094232783e5ad0e586596d874103ca
2022-08-18 19:12:59 -07:00
James Zern 002b6b1ce0 compiler_attributes.h: add VPX_NO_UNSIGNED_SHIFT_CHECK
and use it on MD5Transform(); this behavior is well defined and is only
a warning with -fsanitize=integer, not -fsanitize=undefined.

quiets warnings of the form:
md5_utils.c:163:3: runtime error: left shift of 143704723 by 7 places
cannot be represented in type 'unsigned int'

Bug: b/229626362
Change-Id: I60a384b2c2556f5ce71ad8ebce050329aba0b4e4
2022-08-18 19:12:51 -07:00
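A sketch of how such an opt-out macro could look follows. The macro name DEMO_NO_UNSIGNED_SHIFT_CHECK and the rotate helper are hypothetical, and the attribute string "unsigned-shift-base" together with the clang 12 version guard are assumptions about how the check is named in clang; the actual definition in compiler_attributes.h may differ.

```c
#include <stdint.h>

/* Assumption: clang 12+ exposes the check as "unsigned-shift-base".
 * On other compilers the macro expands to nothing. */
#if defined(__clang__) && __clang_major__ >= 12
#define DEMO_NO_UNSIGNED_SHIFT_CHECK \
  __attribute__((no_sanitize("unsigned-shift-base")))
#else
#define DEMO_NO_UNSIGNED_SHIFT_CHECK
#endif

/* Hypothetical helper: a 32-bit rotate, as used in MD5-style code.
 * Shifting bits off the top of an unsigned value is well defined in C,
 * but -fsanitize=integer reports it unless the function opts out. */
DEMO_NO_UNSIGNED_SHIFT_CHECK
static uint32_t demo_rotl32(uint32_t x, unsigned n) {
  n &= 31;
  return n ? (uint32_t)((x << n) | (x >> (32 - n))) : x;
}
```

The attribute only silences the diagnostic; the generated code and the result of the shift are unchanged.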
James Zern c7358d8016 vp9,search_new_mv: descale rather than scale sse
this changes from scaling best sse to downscaling base sse in
comparisons.

this quiets an integer sanitizer warning of the form:
vp9/encoder/vp9_pickmode.c:1632:41: runtime error: left shift of
4294967295 by 1 places cannot be represented in type 'unsigned int'

Bug: b/229626362
Change-Id: Iee2920474ba700a46177d4514ba6ef7691958069
2022-08-18 18:23:41 -07:00
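The idea of descaling instead of scaling can be sketched as below. The function names and the direction of the comparison are hypothetical, since the commit message does not show the code in vp9_pickmode.c; the sketch only shows why shifting the other operand right avoids losing the top bit.

```c
#include <stdint.h>

/* Hypothetical "scaled" form: best << 1 drops the top bit when
 * best >= 0x80000000, so the comparison can give the wrong answer. */
static int demo_below_twice_scaled(uint32_t sse, uint32_t best) {
  return sse < (uint32_t)(best << 1);
}

/* Hypothetical "descaled" form: shift the other side right instead.
 * No bits are lost off the top; the low bit of sse is discarded, so
 * the two forms can differ by rounding. */
static int demo_below_twice_descaled(uint32_t sse, uint32_t best) {
  return (sse >> 1) < best;
}
```

With sse = 100 and best = 0x80000000, the scaled form compares against 0 and returns 0, while the descaled form returns 1.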
James Zern 2694c7bc92 update_thresh_freq_fact_row_mt: normalize param types
make source_variance unsigned; this matches update_thresh_freq_fact()
and the type of the MACROBLOCK member.

quiets integer sanitizer warnings of the form:
vp9/encoder/vp9_pickmode.c:2710:58: runtime error: implicit conversion
from type 'unsigned int' of value 4294967295 (32-bit, unsigned) to type
'int' changed the value to -1 (32-bit, signed)

Bug: b/229626362
Change-Id: I812c6ca914507bf25cad323dea3d91a3a2ea4f1d
2022-08-18 18:23:30 -07:00
James Zern df619cb823 loopfilter.c: normalize flat func param type
flat/flat2 are stored as int8_t as returned by the filter_mask*
functions.

this quiets integer sanitizer warnings of the form:
vpx_dsp/loopfilter.c:197:28: runtime error: implicit conversion from
type 'int8_t' (aka 'signed char') of value -1 (8-bit, signed) to type
'uint8_t' (aka 'unsigned char') changed the value to 255 (8-bit,
unsigned)

Bug: b/229626362
Change-Id: Iacb6ae052d4cb2b6e0ebccbacf59ece9501d3b5f
2022-08-18 18:23:22 -07:00
James Zern cf5ef2b985 Merge changes Icfc59932,I3d1ca618,Id3966912,I56f74981,Ia9a5dc5e, ... into main
* changes:
  vpx_encoder.h: make flag constants unsigned
  vp8,VP8_COMP: normalize segment_encode_breakout type
  webmdec,WebmInputContext: make timestamp_ns signed
  highbd_quantize_intrin_sse2: quiet int sanitizer warnings
  load_unaligned_u32: use an int w/_mm_cvtsi32_si128
  variance_sse2.c: add some missing casts
2022-08-18 23:16:33 +00:00
Scott LaVarnway 7a0e2bf1bc Fix TEST_P(SADx4Test, DISABLED_Speed)
The reference code was being timed instead of the optimized code.

Change-Id: I67eb08dcda80e20eaa075dc2c91b7e8ef5c0cdfb
2022-08-18 12:19:08 -07:00
Scott LaVarnway 37dcf75bb9 Merge "Add vp9_highbd_quantize_fp_32x32_neon()." into main 2022-08-17 10:59:57 +00:00
James Zern 9db0ec67e3 vpx_encoder.h: make flag constants unsigned
this matches the type for vpx_codec_frame_flags_t and
vpx_codec_er_flags_t and quiets int sanitizer warnings of the form:

implicit conversion from type 'int' of value -9 (32-bit, signed) to type
'unsigned int' changed the value to 4294967287 (32-bit, unsigned)

Bug: b/229626362
Change-Id: Icfc5993250f37cedb300c7032cab28ce4bec1f86
2022-08-16 21:59:20 -07:00
James Zern a76a022835 vp8,VP8_COMP: normalize segment_encode_breakout type
use unsigned int as the API value is of this type; this quiets some
integer sanitizer warnings of the form:
implicit conversion from type 'unsigned int' of value 2147483648
(32-bit, unsigned) to type 'int' changed the value to -2147483648
(32-bit, signed)

Bug: b/229626362
Change-Id: I3d1ca618bf1b3cd57a5dca65a3067f351c1473f8
2022-08-16 18:25:14 -07:00
James Zern cb18d72c30 webmdec,WebmInputContext: make timestamp_ns signed
this matches the type returned from libwebm, which uses -1 as an error;
quiets integer sanitizer warnings of the form:
implicit conversion from type 'long long' of value -1 (64-bit, signed)
to type 'uint64_t' (aka 'unsigned long') changed the value to
18446744073709551615 (64-bit, unsigned)

Bug: b/229626362
Change-Id: Id3966912f802aee3c0f7852225b55f3057c3e76a
2022-08-16 18:25:14 -07:00
James Zern b77b6b68d3 highbd_quantize_intrin_sse2: quiet int sanitizer warnings
add a missing cast in ^ operations; quiets warnings of the form:
implicit conversion from type 'int' of value -1 (32-bit, signed) to type
'unsigned int' changed the value to 4294967295 (32-bit, unsigned)

Bug: b/229626362
Change-Id: I56f74981050b2c9d00bad20e68f1b73ce7454729
2022-08-16 18:25:14 -07:00
James Zern d939886809 load_unaligned_u32: use an int w/_mm_cvtsi32_si128
this matches the type of the function parameter; quiets integer
sanitizer warnings of the form:
implicit conversion from type 'uint32_t' (aka 'unsigned int') of value
3215646151 (32-bit, unsigned) to type 'int' changed the value to
-1079321145 (32-bit, signed)

Bug: b/229626362
Change-Id: Ia9a5dc5e1f57cbf4f8f8fa457bb674ef43369d37
2022-08-16 18:25:08 -07:00
James Zern c1fb6c6624 variance_sse2.c: add some missing casts
quiets integer sanitizer warnings of the form:
../vpx_dsp/x86/variance_sse2.c:100:10: runtime error: implicit
conversion from type 'unsigned int' of value 4294966272 (32-bit,
unsigned) to type 'int' changed the value to -1024 (32-bit, signed)

Bug: b/229626362
Change-Id: I150cc0a6a6b85143c3bf96886686fe3a40897db5
2022-08-16 18:25:02 -07:00
James Zern d22e5a49e3 configure: add -Wno-pass-failed for libyuv
with certain optimization flags or sanitizers enabled some code may fail
to vectorize:
third_party/libyuv/source/row_common.cc:3178:7: warning: loop not
vectorized: the optimizer was unable to perform the requested
transformation; the transformation might be disabled or specified as
part of an unsupported transformation ordering
[-Wpass-failed=transform-warning]

this was observed with integer/undefined sanitizers using clang 11/13

Bug: b/229626362
Change-Id: I01595c641763c4cd4242e02f2cc5cbabfe69d03e
2022-08-16 13:58:46 -07:00
James Zern b6f289e90a Merge "configure: add -Wc++{14,17,20}-extensions" into main 2022-08-16 20:04:09 +00:00
Scott LaVarnway 37a3999f5a Add vp9_highbd_quantize_fp_32x32_neon().
Up to 2.6x faster than vp9_highbd_quantize_fp_32x32_c() for full
calculations.

Bug: b/237714063

Change-Id: Icfeff2ad4dcd57d0ceb47fe04789710807b9cbad
2022-08-16 08:36:04 -07:00
James Zern fecec6293f simple_encode.cc: clear -Wextra-semi-stmt warnings
fixes warnings of the form:
../vp9/simple_encode.cc:755:48: warning: empty expression statement has
no effect; remove unnecessary ';' to silence this warning
[-Wextra-semi-stmt]
  SET_STRUCT_VALUE(config, oxcf, ret, key_freq);

Bug: b/229626362
Change-Id: I1c9b0ae9927cdd7c31da000633bcb6e2b8242cd4
2022-08-15 17:27:52 -07:00
James Zern 2f2ede692e Merge "examples/svc_encodeframe.c: rm empty {}s in switch" into main 2022-08-15 23:21:53 +00:00
Scott LaVarnway f73d07dfd6 Merge "VPX: Add vp9_highbd_quantize_fp_neon()." into main 2022-08-15 21:34:42 +00:00
Scott LaVarnway c13788bae9 vp9_quantize_fp_32x32_neon() cleanup.
No change in performance.

Bug: b/237714063

Change-Id: If6ad5fc27de4babe0bfff3fdbb4b7fd99a0544ab
2022-08-15 12:01:41 -07:00
Scott LaVarnway 763167aac7 VPX: Add vp9_highbd_quantize_fp_neon().
Up to 4.1x faster than vp9_highbd_quantize_fp_c() for full
calculations.

~1.3% overall encoder improvement for the test clip used.

Bug: b/237714063

Change-Id: I8c6466bdbcf1c398b1d8b03cab4165c1d8556b0c
2022-08-15 10:45:43 -07:00
James Zern 14b8eaf7da examples/svc_encodeframe.c: rm empty {}s in switch
these have been unnecessary since:
0e97e7049 remove spatial svc experiment

Bug: b/229626362
Change-Id: I57528af4dcb9092b752161c8eaba2e2808c29c5f
2022-08-14 15:58:09 -07:00
James Zern 33b385ec4e configure: add -Wc++{14,17,20}-extensions
the snapshot of googletest and the test files themselves are targeting
c++11 currently; these warnings are supported by recent versions of
clang

Change-Id: I5d36b3bd4058ba1610f0c8b27cad27aadee85939
2022-08-12 22:24:39 -07:00
Scott LaVarnway c082f0ca15 Merge "VPX: vp9_quantize_fp_neon() cleanup." into main 2022-08-12 00:27:02 +00:00
Scott LaVarnway 1e07619a0a VPX: vp9_quantize_fp_neon() cleanup.
No change in performance.

Bug: b/237714063

Change-Id: I868cda7acb0de840fbc85b23f3e36c50b39c331b
2022-08-11 04:33:04 -07:00
James Zern dcb2a300ff Merge "vp9_cx_iface,encoder_encode: only calc ts when img!=NULL" into main 2022-08-10 22:38:45 +00:00
Scott LaVarnway 8786aee582 Merge "VPX: Fix vp9_quantize_fp_avx2() VS build error." into main 2022-08-09 20:20:52 +00:00
Scott LaVarnway 3c2b21c22e VPX: Fix vp9_quantize_fp_avx2() VS build error.
Add build fix for _mm256_extract_epi16() being undefined.

Bug: b/237714063

Change-Id: I855b1828ce1b6b2b2f063fe097999481881bf074
2022-08-09 10:00:33 -07:00
Cheng Chen ec4aa6d191 Use level defined min gf interval
Assume the level definition of min_gf_interval is the minimum allowed
gf_interval. We take this level-conformant min_gf_interval instead of
+1.

Change-Id: I9c7e62f210c95b356e9716579ee4c19638de8e35
2022-08-08 17:35:52 -07:00
Cheng Chen 3cf0a24156 L2E: Add target level in GOP unit tests
Change-Id: Icecc3031e1052bb5a94f6c5957ec5190aae990ba
2022-08-08 17:35:52 -07:00
Cheng Chen 1560454474 Merge "Fix VP9 auto level" into main 2022-08-09 00:33:51 +00:00
James Zern 4355a392e6 vp9_cx_iface,encoder_encode: only calc ts when img!=NULL
avoid calculating the end timestamp when performing a flush to prevent
an implicit conversion warning when applying a non-zero offset to a 0
pts used in that case:
vp9/vp9_cx_iface.c:1361:50: runtime error: implicit conversion from type
'vpx_codec_pts_t' (aka 'long') of value -15 (64-bit, signed) to type
'unsigned long' changed the value to 18446744073709551601 (64-bit,
unsigned)

Bug: b/229626362
Change-Id: I68ba19b7d6de35cc185707dfb6b43406b7165035
2022-08-08 11:28:27 -07:00
Cheng Chen eaf0b5b47e Fix VP9 auto level
The iteration index was wrong, causing "LEVEL_5_2", which is intended for
large-resolution videos, to be chosen as the starting level.

Change-Id: Id88836981bdcbd7494bd06193d6a433ac75a6d2e
2022-08-08 17:46:03 +00:00
Scott LaVarnway c9f049fd91 VPX: Add vpx_subtract_block_avx2().
~1.3x faster than vpx_subtract_block_sse2().

Based on aom_subtract_block_avx2().

Bug: b/241580104

Change-Id: I17da036363f213d53c6546c3e858e4c3cba44a5b
2022-08-05 16:02:38 -07:00
James Zern aa2dc0cc72 Merge "vp9_active_[hv]_edge: add missing vpx_clear_system_state" into main 2022-08-04 18:09:02 +00:00
Scott LaVarnway 2e61a623d4 VPX: Add vp9_highbd_quantize_fp_32x32_avx2().
~4x faster than vp9_highbd_quantize_fp_32x32_c() for full
calculations.

Bug: b/237714063

Change-Id: Iff2182b8e7b1ac79811e33080d1f6cac6679382d
2022-08-03 16:45:42 -07:00
Scott LaVarnway a55e248349 VPX: Add vp9_highbd_quantize_fp_avx2().
Up to 5.37x faster than vp9_highbd_quantize_fp_c() for full
calculations.

~1.6% overall encoder improvement for the test clip used.

Bug: b/237714063

Change-Id: I584fd1f60a3e02f1ded092de98970725fc66c5b8
2022-08-03 05:39:49 -07:00
Scott LaVarnway 34ee3843e6 Merge "VPX: Add vp9_quantize_fp_32x32_avx2()." into main 2022-08-02 15:09:40 +00:00
James Zern 822bfca8e5 Merge "Provide Arm SDOT optimizations for SAD functions" into main 2022-08-01 18:21:32 +00:00
Scott LaVarnway 29db7fe975 VPX: Add vp9_quantize_fp_32x32_avx2().
Up to 1.80x faster than vp9_quantize_fp_32x32_ssse3() for full
calculations.

Bug: b/237714063

Change-Id: Ic4ae4724fce7ac85c7a089535b16a999e02f0a10
2022-08-01 11:12:38 -07:00
Wan-Teh Chang 0b3d411446 Fix off-by-one error of max w/h in validate_config
Fix the off-by-one errors of maximum g_w and g_h in validate_config().

Bug: webm:1774
Change-Id: I343783d06c1f53222be2366be79171b214486201
2022-07-29 15:46:42 -07:00
Konstantinos Margaritis b3536cfafe Provide Arm SDOT optimizations for SAD functions
Change-Id: I497ee1c45d1fc4d643cefad7d87e5aaacd77869c
2022-07-29 12:49:39 +00:00
James Zern 59acf6739c Merge changes I0c6604ef,Id7e13b3d,I7291d9bd,Ic7c0a2e7,Ic7ce0fd9, ... into main
* changes:
  x86: normalize type with _mm_cvtsi128_si32
  vp9_filter_block_plane_non420: fix implicit conversion warnings
  variance_avx2.c: fix implicit conversion warnings
  vp8,read_mb_modes_mv: fix implicit conversion warnings
  vp8_find_near_mvs: fix implicit conversion warnings
  encode_test_driver: normalize frame_flags type
  vp9,decoder_decode: fix ubsan null/zero offset warning
  y4m_input_fetch_frame: fix ubsan null/zero offset warning
2022-07-29 02:05:22 +00:00
James Zern 1ce49998f7 vp9_active_[hv]_edge: add missing vpx_clear_system_state
this fixes runtime errors with clang -fsanitize=integer in x86 builds:

../vp9/encoder/vp9_rdopt.c:3250:17: runtime error: signed integer
  overflow: 18 - -2147483648 cannot be represented in type 'int'
../vp9/encoder/vp9_rdopt.c:3277:16: runtime error: signed integer
  overflow: 26 - -2147483648 cannot be represented in type 'int'

Bug: b/229626362
Change-Id: Ic9a5063c840b4fce7056f61362234721add056a6
2022-07-27 18:56:22 -07:00
James Zern 4667992d8b x86: normalize type with _mm_cvtsi128_si32
prefer int in most cases

w/clang -fsanitize=integer fixes warnings of the form:
implicit conversion from type 'int' of value -809931979 (32-bit, signed)
to type 'uint32_t' (aka 'unsigned int') changed the value to 3485035317
(32-bit, unsigned)

Bug: b/229626362
Change-Id: I0c6604efc188f2660c531eddfc7aa10060637813
2022-07-27 16:59:21 -07:00
James Zern e533d989ea vp9_filter_block_plane_non420: fix implicit conversion warnings
w/clang -fsanitize=integer fixes warnings of the form:
implicit conversion from type 'int' of value -2 (32-bit, signed) to type
'unsigned int' changed the value to 4294967294 (32-bit, unsigned)

Bug: b/229626362
Change-Id: Id7e13b3d494ccd1a2351db8fab6fdb6a9a771d51
2022-07-27 16:59:21 -07:00
James Zern aecf7ba51a variance_avx2.c: fix implicit conversion warnings
w/clang -fsanitize=integer fixes warnings of the form:
implicit conversion from type 'int' of value -1323 (32-bit, signed) to
type 'unsigned int' changed the value to 4294965973 (32-bit, unsigned)

Bug: b/229626362
Change-Id: I7291d9bd5cacea0d88d9f4c4624c096764f4a472
2022-07-27 16:59:21 -07:00
James Zern b6d06a6e26 vp8,read_mb_modes_mv: fix implicit conversion warnings
w/clang -fsanitize=integer fixes warnings of the form:
implicit conversion from type 'uint32_t' (aka 'unsigned int') of value
4294443008 (32-bit, unsigned) to type 'int' changed the value to -524288
(32-bit, signed)

Bug: b/229626362
Change-Id: Ic7c0a2e7b64a1dd6fd5cc64adcd5765318c2a956
2022-07-27 16:59:21 -07:00
James Zern 9763f3c549 vp8_find_near_mvs: fix implicit conversion warnings
unsigned -> int and vice versa

reported by clang -fsanitize=integer

vp8/common/findnearmv.c:108:11: runtime error: implicit conversion from
type 'uint32_t' (aka 'unsigned int') of value 4294443008 (32-bit,
unsigned) to type 'int' changed the value to -524288 (32-bit, signed)
vp8/common/findnearmv.c:110:33: runtime error: implicit conversion from
type 'int' of value -524288 (32-bit, signed) to type 'uint32_t' (aka
'unsigned int') changed the value to 4294443008 (32-bit, unsigned)

Bug: b/229626362
Change-Id: Ic7ce0fd98255ccf9307ac73e9fb6a8189b268214
2022-07-27 15:37:11 -07:00
James Zern 7dab508cd9 encode_test_driver: normalize frame_flags type
use vpx_enc_frame_flags_t; this avoids int -> unsigned conversion
warnings; reported w/clang -fsanitize=integer:

test/error_resilience_test.cc:95:9: runtime error: implicit conversion
from type 'int' of value -12845057 (32-bit, signed) to type 'unsigned
long' changed the value to 4282122239 (32-bit, unsigned)

Bug: b/229626362
Change-Id: I0fc1dbe44a258f397cf1a05347d8cb86ee70b1b8
2022-07-27 15:37:11 -07:00
James Zern ed78231aa5 vp9,decoder_decode: fix ubsan null/zero offset warning
reported under clang-13. null data may be passed as a flush; move
data_end after that check

vp9/vp9_dx_iface.c:337:40: runtime error: applying zero offset to null
pointer

Bug: b/229626362
Change-Id: I845726fd6eb6ac7a776e49272c6477a5ad30ffdf
2022-07-27 15:37:11 -07:00
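The ordering issue described above can be sketched as follows: an end pointer is computed only after the null/flush case has been handled, so that no offset is ever applied to a null pointer. The function name and return convention are assumptions for illustration, not the code in vp9_dx_iface.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a decode entry point. Returns 0 for a flush
 * (null data, zero size), -1 for invalid input, else the payload size. */
static long payload_size(const uint8_t *data, size_t data_sz) {
  /* Handle the null case first; computing data + data_sz before this check
   * is what UBSan flags, even when data_sz is 0. */
  if (data == NULL) return data_sz == 0 ? 0 : -1;
  const uint8_t *const data_end = data + data_sz;
  return (long)(data_end - data);
}

/* Self-check covering flush, invalid input, and a normal buffer. */
static void payload_size_selfcheck(void) {
  const uint8_t buf[3] = { 1, 2, 3 };
  assert(payload_size(NULL, 0) == 0);
  assert(payload_size(NULL, 4) == -1);
  assert(payload_size(buf, sizeof(buf)) == 3);
}
```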
James Zern 1c0c4d51b4 y4m_input_fetch_frame: fix ubsan null/zero offset warning
reported under clang-13; use a while loop in file_read() to force a size
check before attempting to read. buf (aux_buf) may be null when
no conversion is necessary.

y4minput.c:29:43: runtime error: applying zero offset to null pointer

Bug: b/229626362
Change-Id: Ia3250d6ff9c325faf48eaa31f4399e20837f8f7b
2022-07-27 15:37:06 -07:00
Scott LaVarnway ce484db211 VPX: vp9_quantize_fp_avx2() cleanup.
No change in performance.

Bug: b/237714063

Change-Id: I8ea42759cc4dc57be6a29c23784997cb90ad4090
2022-07-27 10:59:49 -07:00
Scott LaVarnway 9c09d36ee7 Merge "VPX: Add vpx_highbd_quantize_b_32x32_avx2()." into main 2022-07-27 12:54:09 +00:00
James Zern ea13f315c9 highbd_temporal_filter_sse4: remove unused function params
this clears warnings under clang-13 of the form:
vp9/encoder/x86/highbd_temporal_filter_sse4.c|196 col 63| warning:
parameter 'v_pre' set but not used [-Wunused-but-set-parameter]

this is the high-bitdepth version of:
73b8aade8 temporal_filter_sse4: remove unused function params

Change-Id: I9b2c9bf27c16975e4855df6a2c967da4c8c63a3a
2022-07-26 18:17:39 -07:00
Scott LaVarnway 90ef3906a2 VPX: Add vpx_highbd_quantize_b_32x32_avx2().
Up to 11.78x faster than vpx_quantize_b_32x32_sse2() for full
calculations.

~1.7% overall encoder improvement for the test clip used.

Bug: b/237714063

Change-Id: Ib759056db94d3487239cb2748ffef1184a89ae18
2022-07-26 04:41:17 -07:00
Scott LaVarnway 90c5493ff5 VPX: Add vpx_highbd_quantize_b_avx2().
Up to 3.61x faster than vpx_highbd_quantize_b_sse2() for full
calculations.

~2.3% overall encoder improvement for the test clip used.

Bug: b/237714063
Change-Id: I23f88d2a7f96aaa4103778372f4f552207f73cee
2022-07-25 17:16:44 +00:00
Scott LaVarnway 78c9a61f79 Merge "VPX: Add vpx_quantize_b_32x32_avx2()." into main 2022-07-25 11:38:13 +00:00
Cheng Chen 6c007885b4 Merge "L2E: Add more unit tests for GOP API" into main 2022-07-22 20:27:51 +00:00
Cheng Chen 4e504233f8 L2E: Add more unit tests for GOP API
Add unit tests for a 4 frame video, which could be considered as a
corner case.

Three different GOP settings are tested and verified as valid.
(1). The first GOP has 3 coding frames, no alt ref.
     The second GOP has 1 coding frame, no alt ref.
     The number of coding frames is 4.
     Their frame types are: keyframe, inter_frame, inter_frame,
     golden_frame.

(2). The first GOP has 4 coding frames, use alt ref.
     The second GOP has 1 coding frame, which is the overlay of
     the first GOP's alt ref frame.
     The number of coding frames is 5.
     Their types are: keyframe, alt_ref, inter_frame, inter_frame,
     overlay_frame.

(3). Only one GOP with 4 coding frames, do not use alt ref.
     The number of coding frames is 4.
     Their types are: keyframe, inter_frame, inter_frame, inter_frame.

Change-Id: I4079ff5065da79834b363b1e1976f65efed3f91f
2022-07-21 22:46:08 -07:00
James Zern 59b27f758c avg_intrin_avx2: rm dead store in highbd_hadamard_8x8
missed in:
53dd1e8e7 avg_intrin_{sse2,avg2}: rm dead store in hadamard_8x8

Change-Id: I378e4a388ceb193a4cfee4d9d317fc62fcc4b39e
2022-07-20 09:59:43 -07:00
James Zern a36d42f8bd pp_filter_test: quiet static analysis warning
in CheckLowFilterOutput(); use std::unique_ptr to avoid spurious memory
leak warning:

test/pp_filter_test.cc|466 col 3| warning: Potential leak of memory
pointed to by 'expected_output' [cplusplus.NewDeleteLeaks]
  ASSERT_NE(expected_output, nullptr);

Bug: b/229626362
Change-Id: Ie9e06c9b9442ffa134e514d2aee70841d19c8ecb
2022-07-20 09:59:38 -07:00
James Zern cc8610e189 encode_api_test: quiet static analysis warning
in ConfigChangeThreadCount(); initialize cfg as the static analyzer can
assume AlwaysTrue() within EXPECT_NO_FATAL_FAILURE may return false
causing InitCodec() not to be called.

test/encode_api_test.cc|321 col 3| warning: 1st function call argument
is an uninitialized value [core.CallAndMessage]
  video.SetSize(cfg.g_w, cfg.g_h);

Bug: b/229626362
Change-Id: I54899ed0a207ca685416bed3a0e9c9644668e163
2022-07-19 16:36:15 -07:00
Scott LaVarnway 414b4f0512 VPX: Add vpx_quantize_b_32x32_avx2().
Up to 1.36x faster than vpx_quantize_b_32x32_avx() for full
calculations. Up to 1.29x faster for VP9_HIGHBITDEPTH builds.

Bug: b/237714063

Change-Id: I97aa6a18d4dc2f3187b76800f91bbba7be447ef1
2022-07-19 06:13:34 -07:00
James Zern 53dd1e8e78 avg_intrin_{sse2,avg2}: rm dead store in hadamard_8x8
this quiets a couple static analysis warnings with clang 11:

vpx_dsp/x86/avg_intrin_sse2.c:278:45: warning: Although the value stored
to 'src_diff' is used in the enclosing expression, the value is never
actually read from 'src_diff' [deadcode.DeadStores]
  src[7] = _mm_load_si128((const __m128i *)(src_diff += src_stride));
                                            ^           ~~~~~~~~~~
vpx_dsp/x86/avg_intrin_avx2.c:307:49: warning: Although the value stored
to 'src_diff' is used in the enclosing expression, the value is never
actually read from 'src_diff' [deadcode.DeadStores]
  src[7] = _mm256_loadu_si256((const __m256i *)(src_diff += src_stride));
                                                ^           ~~~~~~~~~~

Bug: b/229626362
Change-Id: I4b0201bd39775885df0afc03fa5da70910b9dad6
2022-07-18 21:50:48 -07:00
James Zern a5ead0427c vpx_int_pro_row_c: add an assert for height
this quiets a static analysis warning with clang 11:

vpx_dsp/avg.c:353:15: warning: Assigned value is garbage or undefined
[core.uninitialized.Assign]
    hbuf[idx] /= norm_factor;
              ^  ~~~~~~~~~~~

the same fix was applied in libaom:
1ad0889bc aom_int_pro_row_c: add an assert for height

Bug: b/229626362
Change-Id: Ic8a249f866b33b02ec9f378581e51ac104d97169
2022-07-18 19:11:28 -07:00
Matt Oliver a656a12ab2 project: Update for 1.12.0 merge. 2022-07-16 17:21:29 +10:00
Matt Oliver 5f243f4891 Merge commit '03265cd42b3783532de72f2ded5436652e6f5ce3'
# Conflicts:
#	vpx_dsp/x86/sad_sse3.asm
#	vpx_dsp/x86/sad_sse4.asm
#	vpx_dsp/x86/sad_ssse3.asm
2022-07-16 15:17:59 +10:00
Cheng Chen 68d9e7aa2f L2E: Update the description of allow_alt_ref
It is fixed per each encoding and can not be changed per GOP.

Change-Id: I5905b712437142f2274bfa674ceef6093495457f
2022-07-14 16:29:35 -07:00
James Zern 168b312774 vpxenc: fix --disable-loopfilter help alignment
Change-Id: I34444e6437ca0e735d6db07bf98bfa4741ad2c01
2022-07-13 21:56:29 -07:00
Konstantinos Margaritis cc8236f1d2 Actually include the fix for commit 8f4d1890c.
Change-Id: I6780f610151f2e092da525ff064d4b69f74fa61b
2022-07-13 17:21:31 +00:00
Scott LaVarnway 68c68ae959 Merge "VPX: Add vpx_quantize_b_avx2()." into main 2022-07-11 21:48:55 +00:00
James Zern fefc38a0a5 Merge "vp8_macros_msa.h: avoid shadowing variables in defines" into main 2022-07-11 20:23:28 +00:00
Scott LaVarnway e2603ead67 VPX: Add vpx_quantize_b_avx2().
Up to 1.58x faster than vpx_quantize_b_avx() depending
on the size.

Bug: b/237714063

Change-Id: I595a6bb32ebee63f69f27b5a15322fdeae1bf70e
2022-07-11 13:12:05 -07:00
James Zern 5b3cdae38c Merge "Revert "Revert "[NEON] Optimize vp9_diamond_search_sad() for NEON""" into main 2022-07-11 19:58:21 +00:00
James Zern 873aab02ad vp8_macros_msa.h: avoid shadowing variables in defines
this avoids a warning with certain versions of gcc; observed with:
mipsisa32r6el-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110

Change-Id: I8999f487a79a9d53133816d572054b2423330bcf
2022-07-09 14:49:54 -07:00
Konstantinos Margaritis 8f4d1890cb Revert "Revert "[NEON] Optimize vp9_diamond_search_sad() for NEON""
This reverts commit 9f1329f8ac
and fixes a dumb mistake in evaluation of vfcmv. Used vdupq_n_s16,
instead of vdupq_n_s32.

Change-Id: Ie236c878c166405c49bc0f93f6d63a6715534a0a
2022-07-08 21:37:46 +00:00
Scott LaVarnway db2cafa7d9 Merge "VPX: Add quantize speed test for ref vs opt." into main 2022-07-07 22:15:45 +00:00
Scott LaVarnway ba56eafb57 VPX: Add quantize speed test for ref vs opt.
Bug: b/237714063

Change-Id: I4304ba8d976fed3613e28442983b04a9cfc15b79
2022-07-07 13:09:43 -07:00
James Zern 933b6b90a5 Revert "Fix bug with smaller width bigger size"
This reverts commit 5b530fc962.

This fixes memory related fuzzer failures in the decoder.

Bug: webm:1642
Bug: oss-fuzz:48609
Bug: oss-fuzz:48629
Bug: oss-fuzz:48632
Bug: oss-fuzz:48638
Bug: oss-fuzz:48639
Bug: oss-fuzz:48651
Bug: oss-fuzz:48657
Bug: oss-fuzz:48659
Bug: oss-fuzz:48660
Bug: oss-fuzz:48661
Bug: oss-fuzz:48680
Bug: oss-fuzz:48686
Bug: oss-fuzz:48697
Bug: oss-fuzz:48706
Bug: oss-fuzz:48712
Bug: oss-fuzz:48717
Bug: oss-fuzz:48728
Bug: oss-fuzz:48732
Bug: oss-fuzz:48780
Bug: oss-fuzz:48781
Bug: oss-fuzz:48782
Bug: oss-fuzz:48785
Change-Id: I67a8539a3083f00eec1164fef5c6a8bc209f91fc
2022-07-06 15:10:39 -07:00
Jerome Jiang 7b65e46983 Merge "Fix bug with smaller width bigger size" into main 2022-07-01 14:37:32 +00:00
Jerome Jiang 5b530fc962 Fix bug with smaller width bigger size
Bug: webm:1642

Change-Id: I831b7701495eebeeff6bdc0b570f737bb6d536c6
2022-06-30 18:43:38 -04:00
Jerome Jiang dbac8e01e0 ABI compatibility to CHANGELOG for prev releases.
Bug: webm:1757
Change-Id: I19576aa0bc065045dcb0eaf770ae5b0d9ac9d684
2022-06-30 16:22:12 -04:00
Marco Paniconi 711bef6740 rtc: Add svc test for profile 2 10/12 bit
Add TODO to fix the superframe parser
for 10/12 bit.

Change-Id: Ib76c4daa0ff2f516510829ead6a397c89abba2f3
2022-06-29 14:43:17 -07:00
Marco Paniconi 896b59f44d rtc-svc: Fix to make SVC work for Profile 1
Added datarate unittest for 4:4:4 and 4:2:2 input,
for spatial and temporal layers.

Fix is needed in vp9_set_literal_size():
the sampling_x/y should be passed into update_inital_width(),
otherwise sampling_x/y = 1/1 (4:2:0) was forced.
vp9_set_literal_size() is only called by the svc and
on dynamic resize.

Fix issue with the normative optimized scaler:
UV width/height was assumed to be 1/2 of Y, for
the ssse and neon code.

Also fix to assert for the scaled width/height:
in case scaled width/height is odd it should be
incremented by 1 (make it even).

Change-Id: I3a2e40effa53c505f44ef05aaa3132e1b7f57dd5
2022-06-28 19:29:20 -07:00
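The two size rules mentioned above (an odd scaled dimension is rounded up to even, and the UV size depends on the subsampling rather than always being half of Y) can be sketched as below. The helper names are hypothetical, and the chroma formula is the usual rounding-up shift, assumed here rather than quoted from the commit.

```c
#include <assert.h>

/* Hypothetical helper: round a non-negative dimension up to an even value. */
static int round_up_to_even(int v) { return (v + 1) & ~1; }

/* Hypothetical helper: chroma dimension for a luma dimension and a
 * subsampling shift (0 = not subsampled, 1 = halved, rounding up). */
static int chroma_size(int luma_size, int subsampling) {
  return (luma_size + subsampling) >> subsampling;
}

/* Self-check: 4:2:0 halves both dimensions, 4:4:4 halves neither, and
 * 4:2:2 halves only the width. */
static void scaled_size_selfcheck(void) {
  assert(round_up_to_even(359) == 360);
  assert(round_up_to_even(360) == 360);
  assert(chroma_size(360, 1) == 180);
  assert(chroma_size(361, 1) == 181);
  assert(chroma_size(360, 0) == 360);
}
```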
Jerome Jiang f43636a3e0 Merge "Merge tag 'v1.12.0' into main" into main 2022-06-28 22:59:24 +00:00
Jerome Jiang b355ab5046 Add vp8_ prefix for quantize_lsx.c
Duplicate name as vpx_dsp/loongarch/quantize_lsx.c
Chromium update script fails.

Bug: webm:1755
Change-Id: Ifb956c2292d909496eb2b9e1833993f1b021b07e
2022-06-28 21:19:54 +00:00
Jerome Jiang 90a63a0de0 Merge tag 'v1.12.0' into main
Release v1.12.0 Torrent Duck

2022-06-17 v1.12.0 "Torrent Duck"

  This release adds optimizations for Loongarch, adds support for vp8 in the
  real-time rate control library, upgrades GoogleTest to v1.11.0, updates
  libwebm to libwebm-1.0.0.28-20-g206d268, and includes numerous bug fixes.

- Upgrading:
    This release is ABI compatible with the previous release.
    vp8 support in the real-time rate control library.
    New codec control VP8E_SET_RTC_EXTERNAL_RATECTRL is added.
    Configure support for darwin21 is added.
    GoogleTest is upgraded to v1.11.0.
    libwebm is updated to libwebm-1.0.0.28-20-g206d268.

- Enhancement:
    Numerous improvements on checking memory allocations.
    Optimizations for Loongarch.
    Code clean-up.

- Bug fixes:
    Fix to a crash related to {vp8/vp9}_set_roi_map.
    Fix to compiling failure with -Wformat-nonliteral.
    Fix to integer overflow with vp9 with high resolution content.
    Fix to AddNoiseTest failure with ARMv7.
    Fix to libvpx Null-dereference READ in vp8.

Change-Id: I6964e96bccf016f977cc6e83dc0a192d66a19618
2022-06-28 17:07:31 -04:00
Jerome Jiang 03265cd42b Replace date with version and release from README
CHANGELOG has the date.

Bug: webm:1752
Change-Id: I2888ce2afed8619f043eee1e9ca23bdf9d75e607
2022-06-28 15:00:48 -04:00
Cheng Chen ec58d55c3a L2E: Distinguish fixed and active gf_interval
min/max_gf_interval is fixed and can be passed from the command line.
It must satisfy the level constraints.

active_min/max_gf_interval might be changing based on
min/max_gf_interval. It is determined per GOP.

Change-Id: If456c691c97a8b4c946859c05cedd39ca7defa9c
2022-06-27 13:58:54 -07:00
James Zern 10178e6161 vp9_encode_sb_row: remove a branch w/CONFIG_REALTIME_ONLY
replace the check on use_nonrd_pick_mode with an assert. this is only a
start, there are many branches that could be removed that check mode ==
REALTIME, etc. with this configuration.

Bug: webm:1773
Change-Id: I38cf9f83e7c085eb8e87d5cf6db7dc75359b611b
(cherry picked from commit 08b86d7622)
2022-06-21 16:03:10 -04:00
James Zern 1584682025 vp9_cx_iface: set default cpu_used=5 w/CONFIG_REALTIME_ONLY
this avoids a crash if cpu-used is not explicitly set as there are some
(unnecessary) checks against use_nonrd_pick_mode which would cause
encoding to be skipped if the old default of 0 were used

Bug: webm:1773
Change-Id: I62fba5fb51d8afa422689b7de3f03e8f7570e50b
Fixed: webm:1773
(cherry picked from commit 95d196fdf4)
2022-06-21 16:02:50 -04:00
Jerome Jiang 5df4da4026 Update CHANGELOG for L2E
Bug: webm:1752
Change-Id: I5335e0360501503d5c162be4bbdef3ad73151e9f
2022-06-21 14:54:16 -04:00
Jerome Jiang 03d4c6fed9 Update CHANGELOG and version info
A stale codec control was removed, but compatibility was restored.

New codec control was added.

Bump *current* and *age*, and keep *revision* as 0.

Bug: webm:1752
Bug: webm:1757

Change-Id: I76179f129a10c06d897b5c62462808ed9b9c2923
2022-06-17 18:35:21 -04:00
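The *current* / *revision* / *age* bookkeeping referred to above follows the libtool convention, and the "Bump up version major" commit message in this log gives the mapping major = c - a, minor = a, patch = r. A minimal sketch of that arithmetic, with hypothetical type and function names:

```c
#include <assert.h>

/* Hypothetical types for illustration only. */
struct lt_version { int current, revision, age; };
struct so_version { int major, minor, patch; };

/* Interfaces were added: bump current and age, reset revision to 0. */
static struct lt_version add_interfaces(struct lt_version v) {
  v.current += 1;
  v.age += 1;
  v.revision = 0;
  return v;
}

/* Mapping quoted in the log: major = c - a, minor = a, patch = r. */
static struct so_version to_so_version(struct lt_version v) {
  struct so_version s = { v.current - v.age, v.age, v.revision };
  return s;
}

/* Self-check using c=10, a=1, r=0 as the starting point. */
static void version_selfcheck(void) {
  struct lt_version v = { .current = 10, .revision = 0, .age = 1 };
  struct so_version s = to_so_version(v);
  assert(s.major == 9 && s.minor == 1 && s.patch == 0);
  s = to_so_version(add_interfaces(v));
  assert(s.major == 9 && s.minor == 2 && s.patch == 0);
}
```

Adding interfaces leaves the major number unchanged, which is why such a release stays ABI compatible with the previous one.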
James Zern 08b86d7622 vp9_encode_sb_row: remove a branch w/CONFIG_REALTIME_ONLY
replace the check on use_nonrd_pick_mode with an assert. this is only a
start, there are many branches that could be removed that check mode ==
REALTIME, etc. with this configuration.

Bug: webm:1773
Change-Id: I38cf9f83e7c085eb8e87d5cf6db7dc75359b611b
2022-06-17 10:07:28 -07:00
James Zern 95d196fdf4 vp9_cx_iface: set default cpu_used=5 w/CONFIG_REALTIME_ONLY
this avoids a crash if cpu-used is not explicitly set as there are some
(unnecessary) checks against use_nonrd_pick_mode which would cause
encoding to be skipped if the old default of 0 were used

Bug: webm:1773
Change-Id: I62fba5fb51d8afa422689b7de3f03e8f7570e50b
Fixed: webm:1773
2022-06-17 10:07:08 -07:00
James Zern 638e0b7bba Merge "vp9,encoder: fix some integer sanitizer warnings" into main 2022-06-14 01:24:23 +00:00
Jerome Jiang 027d710a6e Restore backward compatibility
This CL breaks the backward compatibility:

1365e7e1a vp9-svc: Remove VP9E_SET_TEMPORAL_LAYERING_MODE

Forcing the value of the next element

Bug: webm:1752
Change-Id: I83c774b3aa6cca25f2f14995590fb20c0a1668d4
(cherry picked from commit 013ec5722c)
2022-06-13 18:24:55 -04:00
Jerome Jiang 878266136b Update AUTHORS
Bug: webm:1752
Change-Id: I08b4100a0e8c003cd9a7bdaf72926c268e02d53c
2022-06-13 16:36:07 -04:00
Jerome Jiang 013ec5722c Restore backward compatibility
This CL breaks the backward compatibility:

1365e7e1a vp9-svc: Remove VP9E_SET_TEMPORAL_LAYERING_MODE

Forcing the value of the next element

Bug: webm:1752
Change-Id: I83c774b3aa6cca25f2f14995590fb20c0a1668d4
2022-06-13 16:29:31 -04:00
Wan-Teh Chang 46bfeed2c9 Convert EncoderTest::last_pts_ to a local variable
Convert the data member EncoderTest::last_pts_ to a local variable in
the EncoderTest::RunLoop() and VP9FrameSizeTestsLarge::RunLoop()
methods. EncoderTest::last_pts_ is only used in these two methods, and
these two methods first set EncoderTest::last_pts_ to 0 before using it.
So EncoderTest::last_pts_ is effectively a local variable in these two
methods.

Note that several subclasses of EncoderTest declare their own last_pts_
data member and use it to calculate the data rate. Apparently their own
last_pts_ data member hides the same-named data member in the base
class. Although this is allowed by C++, this is very confusing.

Change-Id: I55ce1cf8cc62e07333d8a902d65b46343a3d5881
2022-06-10 16:08:20 -07:00
Cheng Chen 7b1b9f7cd2 L2E: Use libvpx's default q in case of invalid external value
If the external model recommends an invalid q value, we use the
default q selected by libvpx's rate control strategy.

We update the test so that when the external model wants to control
GOP decision, it could get per frame information and just recommend
an invalid q.

Change-Id: I69be4b0ee0800e7ab0706d305242bb87f001b1f7
2022-06-07 11:42:25 -07:00
Cheng Chen 7bb4bd3612 L2E: rename 'gop_index' to 'gop_global_index'
'gop_index' has already been used in vpx_rc_encodeframe_info_t,
which represents the frame index inside the current
group of picture (gop).

We therefore use 'gop_global_index' to represent the index of
the current gop to avoid duplicate names.

Change-Id: I3eb8987dd878f650649b013e0036e23d0846b5f0
2022-06-06 22:38:10 -07:00
Cheng Chen b216340736 L2E: send first pass stats before gop decisions
This change lets the encoder send first pass stats before gop
decisions so that external models can make use of them.

Change-Id: Iafc7eddab93aa77ceaf8e1f7663a52b27d94af80
2022-06-06 14:53:46 -07:00
Cheng Chen 3c5529e313 L2E: Use bit mask to represent control type
The bit mask allows us to easily add an additional control mode
in which both the QP and GOP are controlled by an external model.

Change-Id: I49f676f622a6e70feb2a39dc97a4e5050b7f4760
2022-06-06 12:45:25 -07:00
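A minimal sketch of the bit-mask idea described above: each control is one bit, and the combined mode is the bitwise OR of the two. The enumerator and function names below are hypothetical placeholders, not the identifiers used in the libvpx headers.

```c
#include <assert.h>

/* Hypothetical control-type flags; one bit per controllable decision. */
enum rc_control_type {
  RC_CONTROL_QP = 1 << 0,
  RC_CONTROL_GOP = 1 << 1,
  RC_CONTROL_GOP_QP = RC_CONTROL_QP | RC_CONTROL_GOP
};

/* Each query tests a single bit, so the combined mode answers yes to both. */
static int controls_qp(int type) { return (type & RC_CONTROL_QP) != 0; }
static int controls_gop(int type) { return (type & RC_CONTROL_GOP) != 0; }

/* Self-check: the single modes set one bit each, the combined mode both. */
static void control_type_selfcheck(void) {
  assert(controls_qp(RC_CONTROL_QP) && !controls_gop(RC_CONTROL_QP));
  assert(!controls_qp(RC_CONTROL_GOP) && controls_gop(RC_CONTROL_GOP));
  assert(controls_qp(RC_CONTROL_GOP_QP) && controls_gop(RC_CONTROL_GOP_QP));
}
```

With plain sequential enum values, a third "both" mode would need its own case in every check; with one bit per control, existing checks keep working unchanged.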
James Zern 883292030b Merge "libs.mk,build/make/Makefile: make test targets ordinary rules" into main 2022-06-03 17:39:48 +00:00
James Zern db754a4532 vp9_change_config: check vp9_alloc_loop_filter return
Change-Id: I4cba67a5ab192d1cf1dbfb5c039a93a4952b071e
(cherry picked from commit 6549e76307)
2022-06-03 10:18:32 -04:00
James Zern 6e7f636396 vp9e_set_config: setjmp before calling vp9_change_config
vp9_change_config may call functions that perform allocations which
expect failures detected by CHECK_MEM_ERROR to not return.

Change-Id: I1dd1eca9c661ed157d51b4a6a77fc9f88236d794
(cherry picked from commit 3997d9bc62)
2022-06-03 10:18:21 -04:00
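The pattern described above can be sketched as follows: the caller arms a jump buffer with setjmp before invoking a routine whose allocation-failure path does not return but jumps back via longjmp. All names here are hypothetical, and the allocation failure is simulated with a flag; this only illustrates the control flow, not the libvpx implementation.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

/* Hypothetical error context holding the jump target and an error code. */
struct error_ctx {
  jmp_buf jmp;
  int error_code;
};

/* Stand-in for a config routine: on allocation failure it records an error
 * and jumps back to the armed setjmp instead of returning. */
static void change_config(struct error_ctx *ctx, int simulate_failure) {
  void *buf = simulate_failure ? NULL : malloc(16);
  if (buf == NULL) {
    ctx->error_code = 1;
    longjmp(ctx->jmp, 1);
  }
  free(buf);
}

/* Stand-in for the API entry point: setjmp must run before change_config so
 * that the failure path has a valid place to land. */
static int set_config(struct error_ctx *ctx, int simulate_failure) {
  ctx->error_code = 0;
  if (setjmp(ctx->jmp)) return ctx->error_code; /* reached via longjmp */
  change_config(ctx, simulate_failure);
  return 0;
}

/* Self-check: success returns 0, simulated failure returns the error code. */
static void set_config_selfcheck(void) {
  struct error_ctx ctx;
  assert(set_config(&ctx, 0) == 0);
  assert(set_config(&ctx, 1) == 1);
}
```

Without the setjmp call, the longjmp in the failure path would use an uninitialized or stale jump buffer, which is undefined behavior.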
James Zern 386f25be53 vp8e_set_config: setjmp before calling vp8_change_config
vp8_change_config may call vp8_alloc_compressor_data which expects
failures detected by CHECK_MEM_ERROR to not return.

Change-Id: Ib7fbf4af904bd9b539402bb61c8f87855eef2ad6
(cherry picked from commit 365eebc147)
2022-06-03 10:17:58 -04:00
James Zern 9546c699fb libs.mk,build/make/Makefile: make test targets ordinary rules
this fixes a regression in make 4.2 and still present in 4.3 causing
double colon rules to be serialized which breaks sharding done by the
test and test-no-data-check rules. these targets only define one set of
rules so ordinary rules work unlike clean. install may be another
candidate, but that's left for a follow up.

Change-Id: I9f074eca2ad266eeca6e31aae2e9f31eec8680e0
Tested: make 3.81, 4.1, 4.2, 4.2.1, 4.3
2022-06-02 21:51:20 -07:00
James Zern 22610e3bbc Merge ".gitignore: add android studio / vscode folders" into main 2022-06-03 00:49:18 +00:00
James Zern 7bddf81451 Merge changes I4cba67a5,I1dd1eca9,Ib7fbf4af into main
* changes:
  vp9_change_config: check vp9_alloc_loop_filter return
  vp9e_set_config: setjmp before calling vp9_change_config
  vp8e_set_config: setjmp before calling vp8_change_config
2022-06-02 23:44:29 +00:00
James Zern abfca783ed test/*: normalize use of nullptr
this is preferred over NULL in C++11

Change-Id: Ic48ddcc6dfb8975a57f6713549ad04d93db21415
(cherry picked from commit c304ec38d0)
2022-06-02 19:34:59 -04:00
James Zern 36c9b2d690 .gitignore: add android studio / vscode folders
Change-Id: I039a96bc33f55d9ba8bca9f9f6b69135659d2351
2022-06-02 16:34:21 -07:00
Cheng Chen b0a819825e Merge "L2E: Return error when GOP model is not set" into main 2022-06-02 20:38:43 +00:00
Cheng Chen 4e7c56332d L2E: Return error when GOP model is not set
- Return error instead of OK when GOP model is not set.
- Update descriptions for a few variables.

Change-Id: I213f6b7085c487507c3935e7ce615e807f4474cc
2022-06-01 22:18:45 -07:00
James Zern 3dc6aa01ba vp9,encoder: fix some integer sanitizer warnings
the issues fixed in this change are related to implicit conversions
between int / unsigned int:
vp9/encoder/vp9_segmentation.c:42:36: runtime error: implicit conversion
  from type 'int' of value -9 (32-bit, signed) to type 'unsigned int'
  changed the value to 4294967287 (32-bit, unsigned)
vpx_dsp/x86/sum_squares_sse2.c:36:52: runtime error: implicit conversion
  from type 'unsigned int' of value 4294967295 (32-bit, unsigned) to type
  'int' changed the value to -1 (32-bit, signed)
vpx_dsp/x86/sum_squares_sse2.c:36:67: runtime error: implicit conversion
  from type 'unsigned int' of value 4294967295 (32-bit, unsigned) to type
  'int' changed the value to -1 (32-bit, signed)
vp9/encoder/x86/vp9_diamond_search_sad_avx.c:81:45: runtime error:
  implicit conversion from type 'uint32_t' (aka 'unsigned int') of value
  4290576316 (32-bit, unsigned) to type 'int' changed the value to
  -4390980 (32-bit, signed)
vp9/encoder/vp9_rdopt.c:3472:31: runtime error: implicit conversion from
  type 'int' of value -1024 (32-bit, signed) to type 'uint16_t' (aka
  'unsigned short') changed the value to 64512 (16-bit, unsigned)

unsigned is forced for masks and int is used with intel intrinsics

Bug: webm:1767
Change-Id: Icfa4179e13bc98a36ac29586b60d65819d3ce9ee
Fixed: webm:1767
2022-06-01 19:03:44 -07:00
James Zern 6549e76307 vp9_change_config: check vp9_alloc_loop_filter return
Change-Id: I4cba67a5ab192d1cf1dbfb5c039a93a4952b071e
2022-05-31 22:15:55 -07:00
James Zern 3997d9bc62 vp9e_set_config: setjmp before calling vp9_change_config
vp9_change_config may call functions that perform allocations which
expect failures detected by CHECK_MEM_ERROR to not return.

Change-Id: I1dd1eca9c661ed157d51b4a6a77fc9f88236d794
2022-05-31 22:15:55 -07:00
James Zern 365eebc147 vp8e_set_config: setjmp before calling vp8_change_config
vp8_change_config may call vp8_alloc_compressor_data which expects
failures detected by CHECK_MEM_ERROR to not return.

Change-Id: Ib7fbf4af904bd9b539402bb61c8f87855eef2ad6
2022-05-31 22:15:43 -07:00
James Zern 9d279c88c3 resize_test: add TODO for ResizeTest instantiation for VP9
this should match VP8 and use ONE_PASS_TEST_MODES, but currently the
code will produce integer sanitizer warnings and may segfault under
certain conditions

Bug: webm:1767,webm:1768
Change-Id: I6482ff1862f19716fde3d57522591bc61d76a84f
2022-05-31 22:04:54 -07:00
James Zern 8f56e1c074 resize_test: add TODO for test failure
DISABLED_TestExternalResizeSmallerWidthBiggerSize was added for
webm:1642, but never fixed

Bug: webm:1642
Change-Id: I0fa368a44dda550241ea997068c58eaff551233c
2022-05-31 22:52:17 +00:00
James Zern d353916ab5 libs.doxy_template: remove some obsolete variables
- COLS_IN_ALPHA_INDEX
  this was unused given ALPHABETICAL_INDEX = NO
- PERL_PATH / MSCGEN_PATH
  these were unused

quiets warnings with doxygen 1.9.1:
warning: Tag 'COLS_IN_ALPHA_INDEX' at line 1110 of file 'doxyfile' has
become obsolete.
warning: Tag 'PERL_PATH' at line 1105 of file 'doxyfile' has become
obsolete.
warning: Tag 'MSCGEN_PATH' at line 1126 of file 'doxyfile' has become
obsolete

Change-Id: I6229311afaa3318a3f9bcaf40fafcc5ea71ae271
2022-05-31 18:16:02 +00:00
Cheng Chen f05314d634 Merge "L2E: Add vp9 GOP decision helper function" into main 2022-05-31 17:30:48 +00:00
Cheng Chen 305f0eeacf Merge "L2E: Add control type for the external rate control API" into main 2022-05-31 17:30:31 +00:00
James Zern c304ec38d0 test/*: normalize use of nullptr
this is preferred over NULL in C++11

Change-Id: Ic48ddcc6dfb8975a57f6713549ad04d93db21415
2022-05-27 21:57:11 -07:00
Cheng Chen 3e7685cf62 L2E: Add vp9 GOP decision helper function
Add a helper function to call the external rate control model.

The helper function is placed in the function where vp9 determines
GOP decisions.

The helper function passes frame information, including current
frame show index, coding index, etc to the external rate control
model, and then receives GOP decisions.

The received GOP decisions overwrites the default GOP decision, only
when the external rate control model is set to be active via
the codec control.

The decision should satisfy a few constraints, for example, larger
than min_gf_interval; smaller than max_gf_interval. Otherwise,
return error.

Unit tests are added to test the new functionality.

Change-Id: Id129b4e1a91c844ee5c356a7801c862b1130a3d8
2022-05-27 15:02:32 -07:00
Cheng Chen 4832bcff20 L2E: Add control type for the external rate control API
Two control types are defined: QP and GOP control.
Now the API only supports the QP model.

Change-Id: Ib3a712964b9d2282c93993ee56e0558e4795fb46
2022-05-27 14:36:13 -07:00
Jerome Jiang 9f1329f8ac Revert "[NEON] Optimize vp9_diamond_search_sad() for NEON"
This reverts commit 258affdeab.

Reason for revert:

Not bitexact with C version

Original change's description:
> [NEON] Optimize vp9_diamond_search_sad() for NEON
>
> About 50% improvement in comparison to the C function.
> I have followed the AVX version with some simplifications.
>
> Change-Id: I72ddbdb2fbc5ed8a7f0210703fe05523a37db1c9

Change-Id: I5c210b3dfe1f6dec525da857dd8c83946be566fc
2022-05-26 14:20:14 +00:00
James Zern 52f3bef481 Merge changes Iecb26f38,Ib3ee9b59 into main
* changes:
  GetTempOutFile(): use testing::TempDir()
  y4m_test: check temp file ptr
2022-05-26 05:03:17 +00:00
James Zern 58919dd7f1 GetTempOutFile(): use testing::TempDir()
rather than tmpfile(). this allows for setting the path with TEST_TMPDIR
and provides a valid default for android.

Change-Id: Iecb26f381b6a6ec97da62cfa0b7200f427440a2f
2022-05-26 02:36:42 +00:00
yuanhecai 44874ab879 loongarch: Remove redundant code
Simplify architecture support code and remove redundant code
to improve efficiency.

Bug: webm:1755

Change-Id: I03bc251aca115b0379fe19907abd165e0876355b
2022-05-25 11:20:13 +08:00
James Zern c0cee345a3 y4m_test: check temp file ptr
GetTempOutFile() and TempOutFile::file() may return null if the open
fails

Change-Id: Ib3ee9b592140d30d12aecefa7dfc5f569fa28a34
2022-05-24 17:25:51 -07:00
James Zern b163db1a6a tools/*.py: update to python3
only lint-hunks.py is tested as part of the presubmit; the rest may
need further changes as they're used.

Bug: b/229626362
Change-Id: I2fd6e96deab8d892d34527e484ea65e3df86d162
2022-05-23 15:06:17 -07:00
yuanhecai f92c451e6c loongarch: Modify the representation of macros
Some macros have been changed to "#define do {...} while (0)",
change the rest to "static INLINE ..."

Bug: webm:1755

Change-Id: I445ac0c543f12df38f086b479394b111058367d0
2022-05-20 10:58:38 +08:00
yuanhecai 63378a94f9 loongarch: Reduce the number of instructions
Replace some redundant instructions to improve the efficiency
of the program.

1. txfm_macros_lsx.h
2. vpx_convolve8_avg_lsx.c
3. vpx_convolve8_horiz_lsx.c
4. vpx_convolve8_lsx.c
5. vpx_convolve8_vert_lsx.c
6. vpx_convolve_copy_lsx.c
7. vpx_convolve_lsx.h

Bug: webm:1755

Change-Id: I9b7fdf6900338a26f9b1775609ad387648684f3d
2022-05-19 14:35:22 +08:00
yuanhecai 17959f9c94 vp9[loongarch]: Optimize vpx_quantize_b/b_32x32
1. vpx_quantize_b_lsx
2. vpx_quantize_b_32x32_lsx

Bug: webm:1755

Change-Id: I476c8677a2c2aed7248e088e62c3777c9bed2adb
2022-05-18 16:19:48 +08:00
yuanhecai 1c39c62526 vp8[loongarch]: Optimize vp8_sixtap_predict4x4
1. vp8_sixtap_predict4x4

Bug: webm:1755

Change-Id: If7d844496ef2cfe2252f2ef12bb7cded63ad03dd
2022-05-17 20:53:30 +08:00
yuanhecai 508c0aff89 vp8[loongarch]: Optimize fdct8x4/diamond_search_sad
1. vp8_short_fdct8x4_lsx
2. vp8_diamond_search_sad_lsx
3. vpx_sad8x8_lsx

Bug: webm:1755

Change-Id: Ic9df84ead2d4fc07ec58e9730d6a12ac2b2d31c1
2022-05-17 20:53:25 +08:00
yuanhecai bfbb79e252 vp8[loongarch]: Optimize sub_pixel_variance8x8/16x16
1. vpx_sub_pixel_variance8x8_lsx
2. vpx_sub_pixel_variance16x16_lsx
3. vpx_mse16x16_lsx

Bug: webm:1755

Change-Id: Iaedd8393c950c13042a0597d0d47b534a2723317
2022-05-17 20:53:14 +08:00
Hao Chen 8486953e5e vp8[loongarch]: Optimize vp8 encoding partial function
1. vp8_short_fdct4x4
2. vp8_regular_quantize_b
3. vp8_block_error
4. vp8_mbblock_error
5. vpx_subtract_block

Bug: webm:1755

Change-Id: I3dbfc7e3937af74090fc53fb4c9664e6cdda29ef
2022-05-17 20:53:06 +08:00
Marco Paniconi ca89bed50d vp9-rtc: Fix to usage of active_maps when aq_mode=0
If aq_mode=0 the segmentation feature may still be used
for active_maps, so the condition active_maps.enabled
needs to be added in two places regarding segmentation
logic in encodeframe.c. Otherwise the active_maps would
have no effect.

This also resolves why the assert in bug webm:1762 was
not triggered when aq_mode=0.

Change-Id: Ibd68e9b5c3f81728241a168d3fb3567d6845633d
2022-05-15 22:26:55 -07:00
James Zern 40a46f2034 Merge changes I7d68c7f2,If283fc08,I3b1e0a2c into main
* changes:
  vp9[loongarch]: Optimize avg_variance64x64/variance8x8
  vp9[loongarch]: Optimize fdct4x4/8x8_lsx
  vp9[loongarch]: Optimize vpx_hadamard_16x16/8x8
2022-05-13 19:15:59 +00:00
yuanhecai a44e61db29 vp9[loongarch]: Optimize avg_variance64x64/variance8x8
1. vpx_variance8x8_lsx
2. vpx_sub_pixel_avg_variance64x64_lsx

Bug: webm:1755

Change-Id: I7d68c7f2f5c8d27dc31cfd32298aeefb68f5d560
2022-05-13 15:18:14 +08:00
yuanhecai 65d9ac5b5a vp9[loongarch]: Optimize fdct4x4/8x8_lsx
1. vpx_fdct4x4_lsx
2. vpx_fdct8x8_lsx

Bug: webm:1755

Change-Id: If283fc08f9bedcbecd2c4052adb210f8fe00d4f0
2022-05-13 15:18:08 +08:00
yuanhecai 0d51bb2fc5 vp9[loongarch]: Optimize vpx_hadamard_16x16/8x8
1. vpx_hadamard_16x16_lsx
2. vpx_hadamard_8x8_lsx

Bug: webm:1755

Change-Id: I3b1e0a2c026c3806b7bbbd191d0edf0e78912af7
2022-05-13 15:18:03 +08:00
Jerome Jiang 617698706c Add aq mode 0 and 3 to active map test
Bug: webm:1762
Change-Id: Ia827f6686e8d0cdc09f3d07d07dacaa4fcd801ab
2022-05-12 23:22:20 -04:00
Marco Paniconi a6bff83a60 vp9-rtc: Fix to interp_filter for segment skip
For segment skip feature: allow for setting the
mi->interp_filter to BILINEAR, if cm->interp_filter
is set to BILINEAR. This can happen at speed 9 when the
segment skip feature is used (e.g., active_maps)

Without this fix the assert can be triggered with the
active_map_test.cc for speed 9 included.
Updated the test.

Fixes the assert triggered in the issue:
Bug: webm:1762

Change-Id: I462e0bdd966e4f3cb5b7bc746685916ac8808358
2022-05-12 18:28:08 -07:00
James Zern e8579cc3d4 Merge "[NEON] Optimize vp9_diamond_search_sad() for NEON" into main 2022-05-09 17:40:49 +00:00
Konstantinos Margaritis 258affdeab [NEON] Optimize vp9_diamond_search_sad() for NEON
About 50% improvement in comparison to the C function.
I have followed the AVX version with some simplifications.

Change-Id: I72ddbdb2fbc5ed8a7f0210703fe05523a37db1c9
2022-05-07 16:38:36 +03:00
James Zern cb1abee145 add some missing realloc checks
Change-Id: I0fd1e094085c18b1d9a32333e876c2affeb6de23
2022-05-06 11:55:56 -07:00
James Zern f3b4c9a8f6 vp8[cd]x.h: document vpx_codec_vp[89]_[cd]x*
+ mark the _algo variables as deprecated.
this quiets some doxygen warnings

Bug: webm:1752
Change-Id: I53b9b796c3d8fef5c713ee4278641198f95b5864
2022-05-06 11:47:06 -07:00
Jerome Jiang 8ac72859e1 vp9 svc sample: set fps from y4m file
Change-Id: I082c0409910da4cda5bf852b20ffa11ba5c2ebd6
2022-05-03 10:46:46 -04:00
James Zern 872732b2c9 examples: add missing argv_dup alloc checks
Change-Id: Ia3080cbf50071d599c7168a20466392a963f101a
2022-04-28 17:45:47 -07:00
James Zern 2858ef9ec6 Merge changes I99ee0ef3,Ie087e8be,I6b19d016,I6fb7771d,I54f83733, ... into main
* changes:
  y4m_input_open: check allocs
  fastssim,fs_ctx_init: check alloc
  vp9_get_smooth_motion_field: check alloc
  vp9_row_mt_alloc_rd_thresh: check alloc
  simple_encode,init_encoder: check buffer_pool alloc
  VP9RateControlRTC::Create: check segmentation_map alloc
  vp9_speed_features.c: check allocations
  vp9_alloc_motion_field_info: check motion_field_array alloc
  vp9_enc_grp_get_next_job: check job queue alloc
  vp9: check postproc_state.limits allocs
  vp9,encode_tiles_buffer_alloc: fix allocation check
2022-04-28 17:03:32 +00:00
yuanhecai 1b00ad5263 vp9[loongarch]: Optimize sad8x8/32x64/64x32x4d
1. vpx_sad8x8x4d_lsx
2. vpx_sad32x64x4d_lsx
3. vpx_sad64x32x4d_lsx

Bug: webm:1755

Change-Id: I08a2b8717ec8623ffdd4451a04e68fa3a7228668
2022-04-28 09:35:30 +08:00
yuanhecai b1ed8e08a2 vp9[loongarch]: Optimize sad64x64/32x32_avg,comp_avg_pred
1. vpx_sad64x64_avg_lsx
2. vpx_sad32x32_avg_lsx
3. comp_avg_pred_lsx

Bug: webm:1755

Change-Id: I58dabdcdd4265bd6ebd5670db8a132d2e838683f
2022-04-28 09:34:51 +08:00
James Zern 8baaa7b5a3 y4m_input_open: check allocs
Change-Id: I99ee0ef3ab28a22923cb413ccf5935fdc38862be
2022-04-27 15:28:53 -07:00
James Zern c3d2df2f2f fastssim,fs_ctx_init: check alloc
Change-Id: Ie087e8be1e943b94327ed520db447a0e3a927738
2022-04-26 22:22:33 -07:00
James Zern c152584107 vp9_get_smooth_motion_field: check alloc
Change-Id: I6b19d0169d127f622abf97b3b8590eee957bdc51
2022-04-26 22:22:33 -07:00
James Zern e82c5a85c9 vp9_row_mt_alloc_rd_thresh: check alloc
Change-Id: I6fb7771d9fa6ec54d81f24a02a289e8b852e7332
2022-04-26 22:22:33 -07:00
James Zern b2d57a8808 simple_encode,init_encoder: check buffer_pool alloc
Change-Id: I54f83733260abf828166400c5fd0c4c7e3ccec2f
2022-04-26 19:18:15 -07:00
James Zern a5ad89018e VP9RateControlRTC::Create: check segmentation_map alloc
Change-Id: I17b23915c32accf834def5ab26a8e4e188f9993a
2022-04-26 19:18:14 -07:00
James Zern 58fff2f9ef vp9_speed_features.c: check allocations
Change-Id: If3b319c1ce7036c2259440f4eeb2e645bf559f4c
2022-04-26 19:18:14 -07:00
James Zern 72fa1d505e vp9_alloc_motion_field_info: check motion_field_array alloc
Change-Id: I4ae11242e645feb3b85eaea186f14b3676ae40a8
2022-04-26 19:18:14 -07:00
James Zern e93e2ca0e3 vp9_enc_grp_get_next_job: check job queue alloc
+ reverse conditional order; var == constant is more readable

Change-Id: I9f2b4394024c262fd5fe9576a8bf33afe197c050
2022-04-26 19:18:14 -07:00
James Zern 1b70db4be9 vp9: check postproc_state.limits allocs
Change-Id: I9d5df96580074375e4847d2e2f60a6a6d56eeea5
2022-04-26 19:06:29 -07:00
James Zern 19b45a26c6 vp9,encode_tiles_buffer_alloc: fix allocation check
previously vp9_bitstream_worker_data was checked after it was memset();
this change uses CHECK_MEM_ERROR for consistency to ensure the pointer
is checked first

Change-Id: I532d0eb0e746dc6b8d694b616eba693c5c0053ac
2022-04-26 18:44:06 -07:00
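The ordering problem described in the commit above (a pointer checked only after memset() had already used it) can be sketched as follows. This is a minimal illustration, not the libvpx code: `worker_data_t` and `alloc_worker_data` are hypothetical names, and libvpx's CHECK_MEM_ERROR macro is deliberately not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-worker buffer; not a libvpx type. */
typedef struct {
  unsigned char scratch[64];
  int tile_index;
} worker_data_t;

/* Allocates `count` zeroed entries and returns NULL on failure.
 * The pointer is tested before memset() is allowed to touch it. */
static worker_data_t *alloc_worker_data(size_t count) {
  worker_data_t *data;
  assert(count > 0);
  data = (worker_data_t *)malloc(count * sizeof(*data));
  if (data == NULL) return NULL;          /* check first ... */
  memset(data, 0, count * sizeof(*data)); /* ... then clear */
  return data;
}
```

A caller would test the return value for NULL and release the buffer with free() when done.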
yuanhecai f6de5b51b8 vp9[loongarch]: Optimize fdct/get/variance16x16
1. vpx_fdct16x16_lsx
2. vpx_get16x16var_lsx
3. vpx_variance16x16_lsx

Bug: webm:1755

Change-Id: I27090406dc28cfdca64760fea4bc16ae11b74628
2022-04-26 20:54:41 +08:00
James Zern 8a29e27e17 Merge changes Ib8bcefd2,I84f789ca into main
* changes:
  register_state_check.h: add compiler barrier
  add_noise_test.cc: remove stale TODO
2022-04-26 01:15:30 +00:00
James Zern d18407a171 register_state_check.h: add compiler barrier
around ASM_REGISTER_STATE_CHECK() this helps keep the call ordering
consistent avoiding some code reordering which may affect the registers
being checked

fixes issue with armv7 and multiple versions of gcc:
[ RUN      ] C/AddNoiseTest.CheckNoiseAdded/0
test/register_state_check.h:116: Failure
Expected equality of these values:
  pre_store_[i]
    Which is: 0
  post_store[i]
    Which is: 4618441417868443648

Bug: webm:1760
Change-Id: Ib8bcefd2c4d263f9fc4d4b4d4ffb853fe89d1152
Fixed: webm:1760
2022-04-25 15:21:05 -07:00
James Zern 192c85c431 add_noise_test.cc: remove stale TODO
this was completed in:
0dc69c70f postproc : fix function parameters for noise functions.

Change-Id: I84f789ca333e9690e70e696d44475dd59339593b
2022-04-25 11:20:48 -07:00
yuanhecai 76b7350cee vp9[loongarch]: Optimize sub_pixel_variance32x32/sad16x16
1. vpx_sad16x16_lsx
2. vpx_sub_pixel_variance32x32_lsx

Bug: webm:1755

Change-Id: I9926ace710903993ccbb42caef320fa895e90127
2022-04-24 17:32:10 +08:00
yuanhecai 618739f59f vp9[loongarch]: Optimize horizontal/vertical_4/dual
1. vpx_lpf_horizontal_4_lsx
2. vpx_lpf_vertical_4_lsx
3. vpx_lpf_horizontal_4_dual_lsx
4. vpx_lpf_vertical_4_dual_lsx

Bug: webm:1755

Change-Id: I12e9f27cafd9514b24cfbf2354cc66c7d1238687
2022-04-22 15:04:53 +08:00
yuanhecai 608a28e30b vp9[loongarch]: Optimize convolve8_avg_vert/convolve_copy
1. vpx_convolve8_avg_vert_lsx
2. vpx_convolve_copy_lsx
3. vpx_idct32x32_135_add_lsx

Bug: webm:1755

Change-Id: I6bdfe5836a91a5e361ab869b26641e86c5ebb68d
2022-04-22 15:03:34 +08:00
yuanhecai 2651113a64 vp9[loongarch]: Optimize vertical/horizontal_8_dual
1. vpx_lpf_vertical_8_dual_lsx
2. vpx_lpf_horizontal_8_dual_lsx

Bug: webm:1755

Change-Id: I354df02cc215f36b4edf6558af0ff7fd6909deac
2022-04-22 15:00:40 +08:00
James Zern d1cc027ae6 Merge "vp8_decode: free mt buffers early on resolution change" into main 2022-04-22 00:16:28 +00:00
James Zern f2ef29f746 fdct16x16_neon.h,cosmetics: fix include-guard case
Change-Id: I593735bb7f88d63f2ddab57484099479c8759a3d
2022-04-19 19:26:37 -07:00
James Zern 8da05d39b9 vp8_decode: free mt buffers early on resolution change
this avoids a desynchronization of mb_rows if an allocation prior to
vp8mt_alloc_temp_buffers() fails and the decoder is then destroyed

Bug: webm:1759
Change-Id: I75457ef9ceb24c8a8fd213c3690e7c1cf0ec425f
2022-04-19 17:27:39 -07:00
James Zern 665f6a3065 webmdec: fix double free
when no frames were decoded, for example due to a decoder initialization
failure, an orphan buffer pointer from webm_guess_framerate() via
webm_read_frame() would have been freed during cleanup

Change-Id: I6ea3defdd13dd75427f79c516e207b682391e4fa
2022-04-18 19:14:30 -07:00
James Zern 6ea4ef1d24 vp9_dx_iface,init_buffer_callbacks: return on alloc failure
use an error code, as a jmp target is not currently set in init_decoder()

Change-Id: If7798039439f13c739298a8a92a55aaa24e2210c
2022-04-18 19:14:30 -07:00
James Zern f1d42a92bb vp9_encoder: check context buffer allocations
previously the returns for alloc_context_buffers_ext() and
vp9_alloc_context_buffers() were ignored which would result in a NULL
access during encoding should they fail

Change-Id: Icd76576f3d5f8d57697adc9ae926a3a5be731327
2022-04-18 19:14:30 -07:00
James Zern 0ca5af7e24 vp9_alloc_internal_frame_buffers: fix num buffers assignment
avoid setting num_internal_frame_buffers until the allocation is
checked, avoiding an invalid access in vp9_free_internal_frame_buffers()

Change-Id: I28a544a2553d62a6b5cb7c45bf10591caa4ebab6
2022-04-18 19:14:30 -07:00
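The fix described in the commit above, publishing the buffer count only after the allocation has been checked, might look roughly like this. The types and function names are hypothetical and chosen for illustration; they are not the libvpx definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for illustration; not the libvpx definitions. */
typedef struct {
  unsigned char *data;
  size_t size;
} frame_buffer_t;

typedef struct {
  int num_buffers;
  frame_buffer_t *buffers;
} buffer_list_t;

/* Returns 0 on success, 1 on allocation failure. num_buffers is set only
 * once the array exists, so the free routine never walks a NULL array
 * with a nonzero count. */
static int buffer_list_alloc(buffer_list_t *list, int count) {
  assert(list != NULL && count > 0);
  list->num_buffers = 0;
  list->buffers =
      (frame_buffer_t *)calloc((size_t)count, sizeof(*list->buffers));
  if (list->buffers == NULL) return 1;
  list->num_buffers = count;
  return 0;
}

/* Safe to call after a failed allocation or more than once. */
static void buffer_list_free(buffer_list_t *list) {
  int i;
  assert(list != NULL);
  for (i = 0; i < list->num_buffers; ++i) {
    free(list->buffers[i].data);
    list->buffers[i].data = NULL;
  }
  free(list->buffers);
  list->buffers = NULL;
  list->num_buffers = 0;
}
```

With this ordering, a failed buffer_list_alloc() leaves the list in a state that buffer_list_free() can process without dereferencing a NULL array.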
James Zern 45fb0161b0 vp9_alloccommon: add missing pointer checks
in vp9_free_ref_frame_buffers() and vp9_free_context_buffers(); pool and
free_mi may be NULL due to earlier allocation failures

Change-Id: I3bd26ea29b3aea6c58f33d5b7f5a280eb6250ec7
2022-04-18 18:56:49 -07:00
James Zern 8064a69aba Merge changes If8318068,I21519b5b into main
* changes:
  temporal_filter_sse4,cosmetics: fix some typos
  temporal_filter_sse4: remove unused function params
2022-04-19 00:16:29 +00:00
James Zern 90749e8663 temporal_filter_sse4,cosmetics: fix some typos
Change-Id: If8318068a32da52d15c0ba595f80092611f4c847
2022-04-18 12:57:28 -07:00
James Zern 5b6914a72d Merge "Upgrade GoogleTest to v1.11.0" into main 2022-04-18 18:10:30 +00:00
James Zern 946bcdf906 Upgrade GoogleTest to v1.11.0
The release tag is release-1.11.0.

Ref: https://aomedia-review.googlesource.com/c/aom/+/156641
79c98a122 Upgrade GoogleTest to v1.11.0

Note the tree structure differs from libaom, but is left untouched to
avoid breaking test include paths in this commit.

Change-Id: Ia3c6861d45a3befc2decb1da5b1018bcfd38f95a
2022-04-18 10:01:20 -07:00
James Zern 9750257826 vp8,get_sub_mv_ref_prob: change arguments to uint32_t
this matches the call with int_mv::as_int and fixes a warning with
clang-13 -fsanitize=integer:
vp8/decoder/decodemv.c:240:32: runtime error: implicit conversion from
type 'uint32_t' (aka 'unsigned int') of value 4282515456 (32-bit,
unsigned) to type 'int' changed the value to -12451840 (32-bit, signed)

Bug: webm:1759
Change-Id: I7c0aa72baa45421929afac26566e149adc6669d7
2022-04-15 22:33:11 -07:00
James Zern c8b9bf2b28 vp8: fix some implicit unsigned -> int conversions
fixes some warnings with clang-13 -fsanitize=integer:
vp8/decoder/threading.c:77:27: runtime error: implicit conversion
from type 'unsigned int' of value 4294967295 (32-bit, unsigned) to type
'int' changed the value to -1 (32-bit, signed)

these bitmask constants were missed in:
1676cddaa vp8: fix some implicit signed -> unsigned conv warnings

Bug: webm:1759
Change-Id: I5d894d08fd41e32b91b56a4d91276837b3415ee4
2022-04-15 22:32:51 -07:00
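The values in the sanitizer messages quoted in the two commits above follow from two's-complement reinterpretation: an unsigned 32-bit value above INT32_MAX maps to that value minus 2^32. The sketch below reproduces the arithmetic with a hypothetical helper, `reinterpret_as_int32`; the commits themselves changed argument and constant types rather than adding such a function.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterprets a 32-bit unsigned value as a two's-complement signed value
 * without relying on an implementation-defined out-of-range conversion. */
static int32_t reinterpret_as_int32(uint32_t v) {
  const int64_t wide = (int64_t)v;
  const int64_t result =
      (v > (uint32_t)INT32_MAX) ? wide - INT64_C(4294967296) : wide;
  assert(result >= INT32_MIN && result <= INT32_MAX);
  return (int32_t)result;
}
```

For example, 4294967295 maps to -1 and 4282515456 maps to -12451840, matching the two runtime errors reported by clang-13.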
James Zern 36ea80f3d6 Merge changes I9c24f75e,I83bce11c into main
* changes:
  vp9[loongarch]: Optimize idct32x32_1024/1/34_add
  vp9[loongarch]: Optimize vpx_fdct32x32/32x32_rd
2022-04-16 04:51:13 +00:00
Jerome Jiang 3b5de1ee8c Merge "Fix int overflow in intermediate calculation" into main 2022-04-15 14:51:37 +00:00
yuanhecai 81e5841a16 vp9[loongarch]: Optimize idct32x32_1024/1/34_add
1. vpx_idct32x32_1024_add_lsx
2. vpx_idct32x32_34_add_lsx
3. vpx_idct32x32_1_add_lsx

Bug: webm:1755

Change-Id: I9c24f75e0d93613754d8e30da7e007b8d1374e60
2022-04-15 17:23:09 +08:00
yuanhecai a067d8a5bc vp9[loongarch]: Optimize vpx_fdct32x32/32x32_rd
1. vpx_fdct32x32_lsx
2. vpx_fdct32x32_rd_lsx

Bug: webm:1755

Change-Id: I83bce11c0d905cf137545a46cd756aef9cedce47
2022-04-15 17:22:55 +08:00
James Zern 73b8aade83 temporal_filter_sse4: remove unused function params
this clears warnings under clang-13 of the form:
../vp9/encoder/x86/temporal_filter_sse4.c:275:39: warning: parameter
'u_pre' set but not used [-Wunused-but-set-parameter]

Change-Id: I21519b5b0b9c21b04b174327415e0e73b56bdfda
2022-04-14 13:21:07 -07:00
Jerome Jiang 474a50c648 Fix int overflow in intermediate calculation
This is not a complete fix to webm:1751.

Bug: webm:1751
Change-Id: Ieed6c823744f5f0625d529db3746cfe4f549c8c0
2022-04-14 12:13:15 -07:00
James Zern 4b965fba78 Merge "vp9,update_mbgraph_frame_stats: rm unused variables" into main 2022-04-14 17:28:52 +00:00
James Zern a165f4ba64 vp9,update_mbgraph_frame_stats: rm unused variables
this quiets warnings under clang-13 of the form:
../vp9/encoder/vp9_mbgraph.c:222:42: warning: variable 'gld_y_offset'
set but not used [-Wunused-but-set-variable]

Change-Id: I32170b90c07058f780b4e8100ee5217232149db8
2022-04-13 22:16:30 -07:00
James Zern e87f6d0a2c vp8,define_gf_group: remove unused variable
this clears a warning under clang-13:
vp8/encoder/firstpass.c:1634:10: warning: variable
'mod_err_per_mb_accumulator' set but not used
[-Wunused-but-set-variable]

Change-Id: I694a99d56724be89090e01c45559237c0fda147a
2022-04-13 22:14:33 -07:00
yuanhecai d387c89e86 Update loongson_intrinsics.h from v1.0.5 to v1.2.1
Bug: webm:1755

Change-Id: Ib636d2aa521332b76b6aa1b0aa0a9005aafbf32b
2022-04-14 11:17:42 +08:00
yuanhecai caf65c14a8 vp9[loongarch]: Optimize vpx_variance64x64/32x32
1. vpx_variance64x64_lsx
2. vpx_variance32x32_lsx

Bug: webm:1755

Change-Id: I45c5aa94cbbf7128473894a990d931acaa40e102
2022-04-12 10:48:29 +08:00
yuanhecai 3a3645dbdc vp9[loongarch]: Optimize sad64x64/32x32/16x16
1. vpx_sad64x64x4d_lsx
2. vpx_sad32x32x4d_lsx
3. vpx_sad16x16x4d_lsx
4. vpx_sad64x64_lsx
5. vpx_sad32x32_lsx

Bug: webm:1755

Change-Id: Ief71c2216f697b261d7c1fc481c89c9f1a6098e6
2022-04-12 10:48:24 +08:00
James Zern fb2f58e118 Merge "vpxdec: add some allocation checks" into main 2022-04-11 20:28:44 +00:00
James Zern a3cd75e29b vpxdec: add some allocation checks
see also: https://crbug.com/aomedia/3244

Change-Id: I7d151e63a91b8c1a5ee4e861f0b8461eeece6a2f
2022-04-11 11:41:19 -07:00
James Zern d04f78b563 rate_hist,show_histogram: fix crash w/0 buckets
this can occur if 0 frames are encoded, e.g., due to --skip

see also: https://crbug.com/aomedia/3243

Change-Id: I791d5ad6611dbcb60d790e6b705298328ec48126
2022-04-11 11:08:12 -07:00
Cheng Chen 6e1b7c6c14 Merge "L2E: Make SimpleEncode take vp9 level as an input" into main 2022-04-05 18:25:23 +00:00
James Zern 96a509ee03 Merge changes I0b6520be,I1f006daa,I7ee8e367 into main
* changes:
  vp9[loongarch]: Optimize vpx_convolve8_avg_horiz_c
  vp8[loongarch]: Optimize dequant_idct_add_y/uv_block
  loongarch: Fix bugs
2022-04-05 02:52:06 +00:00
Johann Koenig 49808fa296 Merge "remove unused vp8_encode_intra parameter" into main 2022-04-04 00:59:23 +00:00
James Zern a3afed22ad Merge changes I4c63beeb,I28f3b98f into main
* changes:
  Revert "quantize: replace highbd versions"
  Revert "quantize: remove highbd version"
2022-03-31 21:54:24 +00:00
Johann Koenig 1ce16542f4 Merge "subpel variance: add speed test" into main 2022-03-31 21:43:47 +00:00
James Zern d00fd066e8 Revert "quantize: replace highbd versions"
This reverts commit 2200039d33.

This causes failures with VP9/EndToEndTestLarge.EndtoEndPSNRTest/*; it
seems the assembly does not match the C code.

Bug: webm:1586
Change-Id: I4c63beebf88d4c12789d681b0d38014510b147fe
2022-03-31 12:14:16 -07:00
James Zern 6ac395ed77 Revert "quantize: remove highbd version"
This reverts commit 89cfe3835c.

This is a prerequisite for reverting
2200039d33 which causes high bitdepth test
failures

Bug: webm:1586
Change-Id: I28f3b98f3339f3573b1492b88bf733dade133fc0
2022-03-31 12:14:09 -07:00
yuanhecai 8ff9f66b8d vp9[loongarch]: Optimize vpx_convolve8_avg_horiz_c
1. vpx_convolve8_avg_horiz_lsx

Bug: webm:1755

Change-Id: I0b6520be0afa1689da329f56ec6cd95c1730250c
2022-03-31 20:35:22 +08:00
yuanhecai d406064721 vp8[loongarch]: Optimize dequant_idct_add_y/uv_block
1. vp8_dequant_idct_add_uv_block_lsx
2. vp8_dequant_idct_add_y_block_lsx

Bug: webm:1755

Change-Id: I1f006daaefb2075b422bc72a3f69c5abee776e2e
2022-03-31 20:35:13 +08:00
yuanhecai 176acaf9f6 loongarch: Fix bugs
Fix bugs from loopfilter_filters_lsx.c, vpx_convolve8_avg_lsx.c

Bug: webm:1755

Change-Id: I7ee8e367d66a49f3be10d7e417837d3b6ef50bdb
2022-03-31 20:35:04 +08:00
Johann Koenig 81eb99386b Merge "quantize: remove highbd version" into main 2022-03-31 05:45:23 +00:00
Johann 89cfe3835c quantize: remove highbd version
The only difference between the code is the clamp. For
8 bit it is purely an optimization. The values outside
this range will still saturate.

Change-Id: I2a770b140690d99e151b00957789bd72f7a11e13
2022-03-31 13:07:35 +09:00
Johann e6ede58a5a remove unused vp8_encode_intra parameter
Follow it up and also remove it from other functions.

BUG=webm:1612

Change-Id: I9d3cb785ab0d68c6fcae185043c896d8a135e284
2022-03-31 10:59:50 +09:00
James Zern 19f222f8b6 Merge "Optimize FHT functions for NEON" into main 2022-03-31 01:53:51 +00:00
Johann 3c98caa6a4 subpel variance: add speed test
Was used to verify assembly speed versus an attempt to rewrite
in intrinsics.

Change-Id: I011fe5494334b8fcda04b9d54c6093dbcfc55710
2022-03-31 10:43:29 +09:00
Johann Koenig 6d1844e54d Merge "remove sad x3,x8 specializations" into main 2022-03-31 00:45:31 +00:00
Johann 2200039d33 quantize: replace highbd versions
The optimized quantize functions were already built to handle
highbd values. The only difference is the clamping. All highbd
functions expand to 32bits when running in highbd mode.

Removes vpx_highbd_quantize_32x32_sse2 as it is slower than the
C version in the worst case.

Bug: webm:1586
Change-Id: I49bf8a6a2041f78450bf43a4f655c67656b0f8d9
2022-03-31 00:43:52 +00:00
Cheng Chen 2c32425851 L2E: Make SimpleEncode take vp9 level as an input
Level conformance is standardized in vp9.
If a specific target level is set, the vp9 encoder is required to
produce conformant bitstream with limit on frame size, rate,
min alt-ref distance, etc.

This change makes the SimpleEncode environment take the target level
as an input.

To make existing tests pass, we set the level to 0.

Change-Id: Ia35224f75c2fe50338b5b86a50c84355f5daf6fd
2022-03-30 16:29:29 -07:00
Konstantinos Margaritis 247658efb0 Optimize FHT functions for NEON
[NEON]
Optimize vp9_fht4x4, vp9_fht8x8, vp9_fht16x16 for NEON

Following change #3516278, the improvement for these functions is:

Before:
     4.10%     0.75%  vpxenc   vpxenc              [.] vp9_fht16x16_c
     2.93%     0.65%  vpxenc   vpxenc              [.] vp9_fht8x8_c
     0.93%     0.77%  vpxenc   vpxenc              [.] vp9_fht4x4_c

And after the patch:

     0.69%     0.16%  vpxenc   vpxenc              [.] vp9_fht16x16_neon
     0.28%     0.28%  vpxenc   vpxenc              [.] vp9_fht8x8_neon
     0.54%     0.53%  vpxenc   vpxenc              [.] vp9_fht4x4_neon

Bug: webm:1634
Change-Id: I6748a0c4e0cfaafa3eefdd4848d0ac3aab6900e4
2022-03-30 11:35:35 +03:00
James Zern 1239be9e5f sad4d_avx2: fix VS 2014 build error
after:
d60b671a7 gcc 11 warning: mismatched bound

error C2719: 'sums': formal parameter with requested alignment of 32
won't be aligned

Change-Id: Iaba46d00ef2334a5e2d9ee69b5d03478fdc73a60
2022-03-29 20:01:02 -07:00
Johann 02808ecbcc remove skip_block from quantize
Whether a block is skipped is handled by mi->skip. x->skip_block
is kept exclusively to verify that the quantize functions are not
called for skip blocks.

Finishes the cleanup in 13eed991f

Bug: libvpx:1612
Change-Id: I1598c3b682d3c5e6c57a15fa4cb5df2c65b3a58a
2022-03-30 01:56:23 +00:00
Johann afd60bd07d remove sad x3,x8 specializations
These would compute the sum of absolute differences (sad) for a
group of 3 or 8 references. This was used as part of an exhaustive
search.

vp8 only uses these functions in speed 0 and best quality.

For vp9 this is only used with the --enable-non-greedy-mv
experiment.

This removes the 3- and 8-at-a-time optimized functions and uses
the fall back code which will process 1 or 4 (vpx_sadMxNx4d) at
a time.

For configure --target=x86_64-linux-gcc --enable-realtime-only:
libvpx.a
before: 3002424 after: 2937622 delta: 64802
after 'strip libvpx.a'
before: 2116998 after: 2073090 delta: 43908

Change-Id: I566d06e027c327b3bede68649dd551bba81a848e
2022-03-29 12:31:02 +09:00
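For readers unfamiliar with the term, the sum of absolute differences mentioned in the commit above, and the idea of scoring several references per call, can be sketched in plain C. The function names `sad_block` and `sad_block_x4` are hypothetical; this is an unoptimized illustration and not the vpx_sadMxNx4d implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* SAD of one width x height block against a single reference block. */
static unsigned int sad_block(const uint8_t *src, int src_stride,
                              const uint8_t *ref, int ref_stride, int width,
                              int height) {
  unsigned int sum = 0;
  int r, c;
  assert(width > 0 && height > 0);
  for (r = 0; r < height; ++r) {
    for (c = 0; c < width; ++c) {
      sum += (unsigned int)abs((int)src[c] - (int)ref[c]);
    }
    src += src_stride;
    ref += ref_stride;
  }
  return sum;
}

/* Scores four candidate references in one call, in the spirit of the
 * 4-at-a-time form that remains after the x3/x8 variants were removed. */
static void sad_block_x4(const uint8_t *src, int src_stride,
                         const uint8_t *const ref[4], int ref_stride,
                         int width, int height, unsigned int sad_out[4]) {
  int i;
  for (i = 0; i < 4; ++i) {
    sad_out[i] = sad_block(src, src_stride, ref[i], ref_stride, width, height);
  }
}
```

A motion search would typically pick the candidate with the smallest entry in sad_out.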
Johann Koenig 64f58f5e0a Merge "gcc 11 warning: mismatched bound" into main 2022-03-29 03:18:02 +00:00
Johann d60b671a73 gcc 11 warning: mismatched bound
Clean up a new build warning with gcc11:
argument 3 of type ‘const uint8_t * const[]’ with
mismatched bound [-Warray-parameter=]

Standardize sad functions with array sizes.

Change-Id: Iea4144e61368f6a8279e2f3ae96c78aff06c8b41
2022-03-29 10:56:27 +09:00
James Zern 9c424b7556 ads2armasm_ms.pl: fix thumb::FixThumbInstructions call
broken since:
642529248 ads2gas[_apple].pl: remove unused stanzas

Change-Id: I1eac77e2fe23cc3f162251e9e0102a4909f7b997
2022-03-26 10:25:18 -07:00
James Zern b0087f6cd2 Merge "Make sure only NEON FDCT functions are called." into main 2022-03-24 19:39:10 +00:00
Johann 29cde7ec1a ads2gas: maintain whitespace
Don't use tabs during conversion. Save and restore
existing spacing.

Change-Id: Ib8f443db542c091d36e9ab9836e3e3e292d711f7
2022-03-23 14:20:29 +09:00
Johann da0cfd3d59 ads2gas: fix .size measurement
The distance between PROC and END is used to generate .size
information for debugging. When the leading underscore was
removed the pattern used to match the function name broke.

Change-Id: I90bf67d95ecdc2d214606e663773f88d2a2d6b9c
2022-03-23 13:58:46 +09:00
Johann Koenig b0ee4c21ed Merge "ads2gas*.pl: strip trailing whitespace after transforms" into main 2022-03-23 04:24:28 +00:00
James Zern f6344745d9 ads2gas*.pl: strip trailing whitespace after transforms
Change-Id: I0bea977b256e464231706c72cc14a5c8b6e90775
2022-03-22 13:51:27 -07:00
Jerome Jiang f3711cae5a Fix ClangTidy style warning
Change-Id: I6c4711e488cda6b97af96d5e1b6b249786e709de
2022-03-22 13:07:31 -07:00
Konstantinos Margaritis f79d256cb2 Make sure only NEON FDCT functions are called.
[NEON]
Added vpx_fdct4x4_pass1_neon(),
Added vpx_fdct8x8_pass1_notranspose_neon(),
Added vpx_fdct8x8_pass1_neon() to avoid code duplication
Refactored vpx_fdct4x4_neon() and vpx_fdct8x8_neon() to use the above
Rename dct_body to vpx_fdct16x16_body to reuse later
Add transpose_s16_16x16()

I have run make test and all tests/configurations seem to pass.

Profiled using this command on an Ampere Altra VM:
sudo perf record -g ./vpxenc --codec=vp9 --height=1080 --width=1920 \
   --fps=25/1 --limit=20 -o output.mkv \
   ../original_videos_Sports_1080P_Sports_1080P-0063.mkv --debug --rt

Before this optimization:
1.32%     1.32%  vpxenc   vpxenc              [.] vpx_fdct4x4_neon
0.16%     0.16%  vpxenc   vpxenc              [.] vpx_fdct4x4_c
0.79%     0.79%  vpxenc   vpxenc              [.] vpx_fdct8x8_c
0.52%     0.52%  vpxenc   vpxenc              [.] vpx_fdct8x8_neon
1.23%     1.23%  vpxenc   vpxenc              [.] vpx_fdct16x16_c
0.54%     0.54%  vpxenc   vpxenc              [.] vpx_fdct16x16_neon

So, even though a _neon() version exists, the C version was called
as well. After this patch:

1.42%     1.36%  vpxenc   vpxenc              [.] vpx_fdct4x4_neon
0.87%     0.82%  vpxenc   vpxenc              [.] vpx_fdct8x8_neon
0.74%     0.74%  vpxenc   vpxenc              [.] vpx_fdct16x16_neon

Change-Id: Id4e1dd315c67b4355fe4e5a1b59e181a349f16d0
2022-03-17 13:07:12 +02:00
yuanhecai bf672f23a5 vp8[loongarch]: Optimize idct_add, filter_bv/bh
1. vp8_dc_only_idct_add_lsx
2. vp8_loop_filter_bh_lsx
3. vp8_loop_filter_bv_lsx

Bug: webm:1755

Change-Id: I9b629767e2a4e9db8cbb3ee2369186502dc6eb00
2022-03-17 10:39:34 +08:00
yuanhecai 31441d45f7 vp9[loongarch]: Optimize convolve/convolve8_avg_c
1. vpx_convolve8_avg_lsx
2. vpx_convolve_avg_lsx

Bug: webm:1755

Change-Id: I4af5c362a94f11d0b5d1760e18326660bdbc0559
2022-03-16 12:21:21 +08:00
yuanhecai 220643c862 vp9[loongarch]: Optimize convolve8_horiz/vert/c
1. vpx_convolve8_lsx
2. vpx_convolve8_vert_lsx
3. vpx_convolve8_horiz_lsx

Bug: webm:1755

Change-Id: I9897e1ed6a904ac74d1078bd22b275af44db142d
2022-03-16 12:19:46 +08:00
Johann 4ee32be84b ads2gas_apple.pl: remove gcc-isms
The gcc assembler was incompatible for a long
time. It is now based on clang and accepts
more modern syntax, although not enough to
remove the script entirely.

Change-Id: I667d29dca005ea02a995c1025c45eb844081f64b
2022-03-13 07:50:37 +09:00
Johann 642529248f ads2gas[_apple].pl: remove unused stanzas
Many of the features in ads2gas are no longer used.
Remove all patterns which are no longer used in
libvpx.

Simplify between the two to minimize differences.

Change-Id: Ia1151eb8b694cbe51845a1374a876cc7b798899c
2022-03-13 07:50:33 +09:00
James Zern 8a50f70ffc Merge "vp9[loongarch]: Optimize horizontal/vertical_8_c" into main 2022-03-03 18:35:53 +00:00
yuanhecai 624b136700 vp9[loongarch]: Optimize horizontal/vertical_8_c
1. vpx_lpf_vertical_8_lsx
2. vpx_lpf_horizontal_8_lsx

Bug: webm:1755

Change-Id: I6b05d6b1b2ac4d2a75beb9c9ca9700976fc3af55
2022-03-03 20:37:26 +08:00
Marco Paniconi 1365e7e1a5 vp9-svc: Remove VP9E_SET_TEMPORAL_LAYERING_MODE
The control was never implemented, no need to keep this.
temporal_layering_mode is set in the config.

Bug: webm:1753
Change-Id: I9a6eb50e82344605ab62775911783af82ac2d401
2022-03-02 11:38:38 -08:00
yuanhecai 3b21aeac8b vp9[loongarch]: Optimize lpf_horizontal/vertical_16_dual with LSX
Change-Id: I82c6bc16ea57c3f7ac5f4d212a12a5f70cb55ffc
2022-02-25 11:42:22 +08:00
James Zern 2da19ac033 svc_datarate_test.cc: remove stale TODO
Bug: webm:1554
Change-Id: I547223763b86c6a24fa32851f7b30ebab4b7472a
2022-02-11 12:43:29 -08:00
Gregor Jasny cafe7cc1f1 support visual studio 2022 (vs17)
Change-Id: I8380283d09b0c90183f224399f953dcc527181c5
2022-02-10 09:01:49 +01:00
Marco Paniconi df0d06de6d Merge "rtc-vp9: Fix intra-only for bypass mode" into main 2022-02-09 02:41:23 +00:00
Marco Paniconi 232ad814de rtc-vp9: Fix intra-only for bypass mode
Allow intra-only frame in svc to also work
in bypass (flexible-svc) mode.

Added unittest for the flexible svc case.

And fix the gld_fb_idx for (SL0, TL1) in bypass/flexible
mode pattern in the sample encoder: force it to be 0
(same as lst_fb_idx), since the slot is unused on SL0.

Change-Id: Iada9d1b052e470a0d5d25220809ad0c87cd46268
2022-02-08 13:40:24 -08:00
Lu Wang b3cc4b625d vp8[loongarch]: Optimize vp8_loop/sixtap, vpx_dc with LSX.
1. vp8_loop_filter_mbh, vp8_loop_filter_mbv
2. vp8_sixtap_predict16x16, vp8_sixtap_predict8x8
3. vpx_dc_predictor_16x16, vpx_dc_predictor_8x8

./vpxdec --progress -o YUV_1920X1080.yuv original_1200f/VP8_1920X1080.webm

before: 37.77fps
after : 220.90fps

Bug: webm:1755

Change-Id: I1a3ce16f0c872261d813b6531cfdf25bd59bb774
2022-02-08 14:55:09 +08:00
Lu Wang 85a9bdc6cc vpx_util[loongarch]: Add loongson_intrinsics.h v1.0.5.
Bug: webm:1755

Change-Id: Id2fa999bdb8788bd4285114c748c547fa262a95e
2022-02-08 14:54:42 +08:00
Wan-Teh Chang 41f444bee8 Merge "Handle NV12 in vpx_img_chroma_subsampling()" into main 2022-02-08 02:20:35 +00:00
Wan-Teh Chang b22edeb26b Handle NV12 in vpx_img_chroma_subsampling()
Change-Id: Ibac9f6f8dcdcae0d0c10ae1a118d13baf2407270
2022-02-05 14:32:49 -08:00
Wan-Teh Chang e2cc35cb67 Update error messages in validate_img()
Change-Id: I4aa6d2e16e077d29e4e9eabfc7056fcfed6786d6
2022-02-05 12:14:37 -08:00
Marco Paniconi 0156be2ab3 Merge "rtc-vp9: Fix to tests for intra-only frame." into main 2022-02-03 17:04:46 +00:00
Marco Paniconi 74c0f504c4 rtc-vp9: Fix to tests for intra-only frame.
Fix some issues with the test, and add a new
test that verifies that we can decode the base stream
starting at the middle of a sequence where an intra-only
frame is inserted.

Change-Id: I398d23927113eb58ef64694feca25e60ce60a5f7
2022-02-02 19:07:46 -08:00
James Zern 847a0ef84f vp9_roi_test: apply iwyu
Change-Id: I715c27e329495940d989f95df65ac10e021261d2
2022-02-01 16:30:24 -08:00
James Zern fc2a31cfb9 vp9_thread_test: parameterize VP9DecodeMultiThreadedTest
on a per-file basis; this will make sharding more effective

Change-Id: Ib797681a7cc3bd7ec835bb0c1c7a8d9f23512a0d
2022-02-01 12:26:21 -08:00
Jerome Jiang 5eab88970a Merge "Use background segmentation mask with ROI" into main 2022-01-31 15:17:22 +00:00
Matt Oliver 9d7b81da66 Update README.markdown 2022-01-31 10:44:11 +11:00
Matt Oliver fa0122decc project: Update appveyor yasm version. 2022-01-31 10:31:19 +11:00
James Zern 8fb2686f8b Merge "vpx/vp8[cd]x.h,cosmetics: normalize ctrls to enum order" into main 2022-01-29 03:20:41 +00:00
James Zern 0494625b7b vpx/vp8[cd]x.h,cosmetics: normalize ctrls to enum order
Change-Id: I49bbd956b3a64008d1abe54de87d7831bc3eede6
2022-01-28 12:25:47 -08:00
Jin Bo 479758aeb1 libvpx[loongarch]: Add loongarch support.
LSX and LASX are enabled by default if compiler supports them.

Bug: webm:1754

Change-Id: Ic36b113bc4313c50e9d2bbab91199b3aa46d00dc
2022-01-28 16:05:51 +08:00
James Zern f7941622f2 Merge changes I2db20130,I4e643c83 into main
* changes:
  vp8dx.h,cosmetics: normalize #define/type order
  vp8dx.h: add missing define for VP9_SET_BYTE_ALIGNMENT
2022-01-27 21:16:55 +00:00
James Zern db7ef15281 Merge "fix some include guards" into main 2022-01-27 21:16:08 +00:00
Jerome Jiang 8a0af65f34 Use background segmentation mask with ROI
RTC sample encoder vpx_temporal_svc_encoder can take mask files as input
when ROI_MAP is set to 1.

Uses ROI and segmentation of vp9 to skip background encoding when
source_sad is low and the corresponding block in the previous frame is
also skipped.

Change-Id: I8590e6f9a88cecfa1d7f375d4cc480f0f2af87b6
2022-01-27 13:12:29 -08:00
James Zern 531c60e2a2 vp8dx.h,cosmetics: normalize #define/type order
Change-Id: I2db20130cc366bead5e576b375479917f9aee024
2022-01-26 19:44:33 -08:00
James Zern 9353509586 vp8dx.h: add missing define for VP9_SET_BYTE_ALIGNMENT
Change-Id: I4e643c837bb010bd58f4fc8179045f8df18f8ae1
2022-01-26 19:41:41 -08:00
James Zern ae5d16173d fix some include guards
Change-Id: I0233d352c134bdda3ca160d41b4671d1c45ab01c
2022-01-26 15:05:22 -08:00
James Zern a0233ebcd8 Merge "libwebm: update to libwebm-1.0.0.28-28-gee0bab5" into main 2022-01-26 18:40:27 +00:00
James Zern 395732f679 libwebm: update to libwebm-1.0.0.28-28-gee0bab5
https://chromium.googlesource.com/webm/libwebm/+log/206d268d4d8066e5a37c49025325b80c95c771dd..ee0bab576c338c9807249b99588e352b7268cb62

only one commit affects this snapshot:
ee0bab5 Revert "mkvmuxer,Cluster::Size: make uint64 conversion explicit"

Change-Id: Ib1f21fc5589098af346d110ff88c94bb1ba0a027
2022-01-25 20:06:59 -08:00
Jianhui Dai 82014b6675 Reland "Add vp9 ref frame to flag map function"
Original change's description:
> Add vp9 ref frame to flag map function
>
> Change-Id: I371c2346b9e0153c0f8053cab399ce14cd286c56

Change-Id: I04a407ee0ef66c01a0d224b4468e043213f8791f
2022-01-19 11:11:51 +08:00
Jerome Jiang 51415c4076 Revert "Set unused reference frames to first ref"
This reverts commit e7f33a53cf.

Change-Id: I54e807220885cb78af6f3c6e48b3eb2c9f1e70b4
2022-01-11 08:49:12 -08:00
Jerome Jiang 6982214de5 Revert "Add vp9 ref frame to flag map function"
This reverts commit 44e611482e.

Change-Id: Ic900cc01be4de7983fab42178a488277efab77b3
2022-01-11 08:47:34 -08:00
Jianhui Dai 44e611482e Add vp9 ref frame to flag map function
Change-Id: I371c2346b9e0153c0f8053cab399ce14cd286c56
2022-01-01 12:34:28 +08:00
James Zern 6b68c81892 Merge "vp9_prob_diff_update_savings_search_model: quiet conv warnings" into main 2021-12-22 03:03:11 +00:00
James Zern b685d6f02f vp9_prob_diff_update_savings_search_model: quiet conv warnings
under Visual Studio:
Warning C4244 '=': conversion from 'int64_t' to 'vpx_prob', possible loss of
data

after:
ea042a676 vp9 encoder: fix integer overflows

'newp' has already been range checked earlier in the loop so the cast won't
have any unexpected results

Change-Id: Ic10877db2c0633d53fffdf8852d5095403c23a02
2021-12-21 13:33:58 -08:00
James Zern 94972ca7ea vpx_int_pro_row: normalize declaration w/aom
this is a followup to:
  7fbcee49d quiet -Warray-parameter warnings
and conforms to aom in:
  06e13e817 quiet -Warray-parameter warnings

the sad functions are more varied in libvpx and will require a separate
pass

Change-Id: I765fd6704df615e836ba0b184ff8266ce926c394
2021-12-21 11:45:56 -08:00
Fyodor Kyslov e478500ff2 Merge "vp9 encoder: fix test failure on 32 bit arch" into main 2021-12-16 22:29:13 +00:00
Fyodor Kyslov 6bf761a7ef vp9 encoder: fix test failure on 32 bit arch
test fails with memory error. Reducing testing resolution

bug: webm:1750
Change-Id: I75664088022aa660bdf6e69de2d11121db44716f
2021-12-15 23:14:18 -08:00
Marco Paniconi 72d8c6616a Merge "Set unused reference frames to first ref" into main 2021-12-15 19:27:09 +00:00
Fyodor Kyslov 79d4362b12 Merge "vp9 encoder: fix integer overflows" into main 2021-12-15 02:51:31 +00:00
Fyodor Kyslov ea042a676e vp9 encoder: fix integer overflows
fixing integer overflow with 16K content and enabling the test

Bug: webm:1750
Fixed: webm:1750
Change-Id: I76eebd915bcae55bc755613251a98e1716dea4c0
2021-12-14 10:01:50 -08:00
Jianhui Dai e7f33a53cf Set unused reference frames to first ref
If a reference frame is not referenced, then set the index for that
reference to the first one used/referenced instead of unused slot.
Unused slot means key frame, as key frame resets all slots with itself.

This CL extracts `get_first_ref_frame()` from `reset_fb_idx_unused()`
with a typo fix, and sets all unused reference frames to the first ref in
the vp9 uncompressed header.

Bug: webrtc:13442
Change-Id: I99523bc2ceedf27efe376d1113851ff342982181
2021-12-11 19:11:18 +08:00
James Zern 7f45e94d9a Merge "vp9_diamond_search_sad_avx: quiet -Wmaybe-uninitialized warning" into main 2021-12-10 19:42:06 +00:00
James Zern b2b4e79b75 Merge "vp[89]_initalize_enc(): protect against multiple invocations" into main 2021-12-10 18:32:44 +00:00
James Zern 03a8106846 vp[89]_initalize_enc(): protect against multiple invocations
this removes the burden from callers; the rtcd functions are left with a
mostly redundant (outside of tests) once() as top-level functions should
ensure their constraints are met

Change-Id: I5bdbcfa4671c6a1492cfe9c7d886c361c26caaa9
2021-12-09 18:37:01 -08:00
James Zern 3cff8be3d8 vp9_diamond_search_sad_avx: quiet -Wmaybe-uninitialized warning
w/gcc-11

v_these_mv_w is always initialized in this block with _mm_add_epi16();
converting this to a _mm_storeu_si32(tmp) call also works, but
introduces more stack usage

|| ../vp9/encoder/x86/vp9_diamond_search_sad_avx.c: In function
‘vp9_diamond_search_sad_avx’:
vp9/encoder/x86/vp9_diamond_search_sad_avx.c|285 col 19| warning:
‘v_these_mv_w’ may be used uninitialized [-Wmaybe-uninitialized]
||   285 |           new_bmv = ((const int_mv *)&v_these_mv_w)[local_best_idx];
||       |           ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
vp9/encoder/x86/vp9_diamond_search_sad_avx.c|149 col 21| note:
‘v_these_mv_w’ declared here
||   149 |       const __m128i v_these_mv_w = _mm_add_epi16(v_bmv_w, v_ss_mv_w);
||       |                     ^~~~~~~~~~~~

Change-Id: I1cd2fcb41030db16f51c94f3a70eb8eb2a526401
2021-12-09 18:06:05 -08:00
James Zern f3e2a690cd vp9_bitstream.c: quiet -Wstringop-overflow warning
w/gcc-11

the size of interp_filter_selected[][]'s first dimension varies between
VP9_COMP and VP9BitstreamWorkerData as noted in the latter's definition:
  // The size of interp_filter_selected in VP9_COMP is actually
  // MAX_REFERENCE_FRAMES x SWITCHABLE. But when encoding tiles, all we ever do
  // is increment the very first index (index 0) for the first dimension. Hence
  // this is sufficient.
  int interp_filter_selected[1][SWITCHABLE];

normalize the function signatures of write_modes*(), etc. to take this
into account.

vp9/encoder/vp9_bitstream.c|948 col 3| warning: ‘write_modes’ accessing
64 bytes in a region of size 16 [-Wstringop-overflow=]
||   948 |   write_modes(cpi, xd, &cpi->tile_data[data->tile_idx].tile_info,
||       |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
||   949 |               &data->bit_writer, tile_row, data->tile_idx,
||       |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
||   950 |               &data->max_mv_magnitude, data->interp_filter_selected);
||       |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
vp9/encoder/vp9_bitstream.c|948 col 3| note: referencing argument 8 of
type ‘int (*)[4]’
vp9/encoder/vp9_bitstream.c|488 col 13| note: in a call to function
‘write_modes’

Change-Id: I0898cd7c3431633c382a0c3a1be2f0a0bea8d0f9
2021-12-09 17:36:31 -08:00
James Zern 7fbcee49da quiet -Warray-parameter warnings
w/gcc-11
this matches the definition of the function with the declaration

Change-Id: I757b731b9560cb0b0ceec4ec258ec5af5a183b3d
2021-12-09 11:11:19 -08:00
James Zern fc46500115 Merge "test_intra_pred_speed: match above ext w/reconintra" into main 2021-12-09 18:03:55 +00:00
James Zern 093a8c4824 test_intra_pred_speed: match above ext w/reconintra
only 2 x block_size is needed

+ remove a related TODO; C & assembly rely on this extension

Change-Id: Iea430267624251cccbbdaec8045eb81d01ae1db1
2021-12-08 21:35:26 -08:00
James Zern 69146697b5 vp9_thread_test.cc: remove incorrect TODO
the row-based loop filter is ok (and being used) in this case; since
it's serialized the previous row will always be done

Change-Id: I024a0c78e7488178956cc22a4c4680a00dc6eade
2021-12-08 19:34:47 -08:00
James Zern ab35ee100a clear -Wextra-semi/-Wextra-semi-stmt warnings x2
some additional neon file updates after:
31b954deb clear -Wextra-semi/-Wextra-semi-stmt warnings

Bug: chromium:1257449
Change-Id: I3e2664f2bd8f6f7328ec91bf6595ba5fc09862bd
2021-12-07 13:26:30 -08:00
James Zern 31b954debe clear -Wextra-semi/-Wextra-semi-stmt warnings
Bug: chromium:1257449
Change-Id: Ia9aafccc09b611521d4a7aedfe3723393a840c62
2021-12-02 16:53:20 -08:00
Matt Oliver 60906e44b4 project: Add appveyor VS2022 build. 2021-11-25 23:09:05 +11:00
James Zern 13f984c216 Merge "vp9 encoder: fix row-mt crash w/thread config change" into main 2021-11-19 03:06:17 +00:00
James Zern 1794f6db24 vp9 encoder: fix row-mt crash w/thread config change
previously row-mt would allocate thread data once, so increasing the
number of threads with a config change would cause a heap overflow.

Bug: chromium:1261415
Bug: chromium:1270689
Change-Id: I3c5ec8444ae91964fa34a19dd780bd2cbb0368bf
2021-11-18 14:47:27 -08:00
James Zern 13b7ae5d8f Merge "vp9 encoder: fix some integer overflows" into main 2021-11-18 22:35:20 +00:00
Johann Koenig 6627a7c35b Merge "replaced bsr() with get_msb() from bitops.h" into main 2021-11-18 07:13:09 +00:00
Fyodor Kyslov 2d9e4d3c7a vp9 encoder: fix some integer overflows
cap bitrate to 1000Mbps, change bitsaving budget to int64_t

this makes test coverage for 2048x2048 the same as for vp8

Bug: webm:1749
Fixed: webm:1749
Change-Id: Ic58d73cb7529b0826d1f501ad09af8e80f706a6e
2021-11-17 21:17:09 -08:00
Johann Koenig 52cfa1ca35 Merge "faster vp8_regular_quantize_b_sse4_1" into main 2021-11-18 04:41:58 +00:00
Ilya Kurdyukov 87ce2bc3e3 replaced bsr() with get_msb() from bitops.h
The modified line should now compile into two instructions instead of four.

Change-Id: Ie2eb6b13ff1e29b3107cb9e76f37ff9065504316
2021-11-18 03:24:52 +00:00
Ilya Kurdyukov 4a5a0a9a79 faster vp8_regular_quantize_b_sse4_1
Gives 10% faster VP8 encoding in simple tests.
This patch requires testing on wider datasets and encoder
settings to see if this speedup is achieved on most data.

Change-Id: If8e04819623e78fff126c413db66c964c0b4c11a
2021-11-17 04:07:50 +00:00
James Zern c0ba429863 encode_api_test.cc: unify kCodecs[] definitions
and rename the table to kCodecIfaces[] to be a little more specific and
avoid shadowing kCodecs[] in SetRoi()

Change-Id: I64905f48d8bf76e812bdba8374b82e3f7654686f
2021-11-16 19:59:58 -08:00
Johann Koenig 7a37d69cae Merge "MacOS 12 is darwin21" into main 2021-11-17 03:35:50 +00:00
Yunqing Wang 4569745514 Merge "test/DummyVideoSource::ReallocImage: check img_ alloc" into main 2021-11-16 21:47:09 +00:00
Johann c59de7bc91 MacOS 12 is darwin21
Remove -mmacosx-version-min. The library does not use
any calls which are affected by the platform version.
There is also no version 10.16 as it went from 10.15
to 11 and now to 12.

At some point it may be good to clarify that the bare
-darwin- target is for iOS and the -darwinN- targets
are for macOS.

Change-Id: I2fd5f7cae2637905acf3ab77bfddfbe367abbb68
2021-11-17 06:05:34 +09:00
Mikko Koivisto bf93b61f78 vp9: fix ubsan sub-overflows
Fix errors reported by UBSan diagnostics:
1. /vp9/encoder/vp9_pickmode.c:308:29: unsigned integer overflow:
   99 - 100 cannot be represented in type 'unsigned int'
2. /vp9/encoder/vp9_pickmode.c:330:27: unsigned integer overflow:
   21976 - 21978 cannot be represented in type 'unsigned int'
3. /vp9/encoder/vp9_pickmode.c:468:13: unsigned integer overflow:
   18852144 - 18852149 cannot be represented in type 'unsigned int'

(Notice that line numbers might vary a bit because fixes have
been applied incrementally i.e. fix for error #1 affects line
number reported in #2)

Fix by calculating difference instead of wrapping around to
a value near maximum.
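
The idea of computing a difference without unsigned wraparound can be sketched as follows. This is an illustration only, not the code from vp9_pickmode.c; the helper name abs_diff_u32 is hypothetical, and it assumes an absolute difference is what the caller wants.

```c
#include <assert.h>

/* Hypothetical helper (not from libvpx): returns |a - b| for unsigned
 * operands by always subtracting the smaller value from the larger one,
 * so the subtraction can never wrap around to a value near UINT_MAX. */
static unsigned int abs_diff_u32(unsigned int a, unsigned int b) {
  return a > b ? a - b : b - a;
}
```

With the operands from the first report, abs_diff_u32(99, 100) yields 1 rather than a wrapped value.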

Test: Cuttlefish webrtc with VP9 codec
Change-Id: I4f85712028647e915a4e2da31e4b0a266e9e2705
2021-11-16 13:11:38 +00:00
Mikko Koivisto 9fb780c5e7 vp9: Fix multiplication overflow
Fix UBSan error reported from aosp Cuttlefish device:
/vp9/encoder/vp9_ratectrl.c:238:33: unsigned integer overflow:
2500000 * 1800 cannot be represented in type 'unsigned int'

...by casting the operand and the result of multiplication
to 64bit integer.

Test: vp9 webrtc streaming with Cuttlefish
Change-Id: Id5bb3d4071a96179caffae0829d3cc4e48c7614b
2021-11-16 05:32:39 +00:00
Matt Oliver ff822c78c2 project: Support VS2022. 2021-11-14 22:11:10 +11:00
James Zern 7e4c6fed0c test/DummyVideoSource::ReallocImage: check img_ alloc
prevents a crash on the next line accessing img_ members

Bug: aomedia:3191
Change-Id: I430fb4ee662b0001629096eb8b554f8a2b30cce0
2021-11-11 13:43:29 -08:00
James Zern 2d2637547d Merge "update libwebm to libwebm-1.0.0.28-20-g206d268" into main 2021-11-11 21:09:38 +00:00
James Zern ec80f88c5d Merge changes I1425f12d,I1e9e9ffa,I6d8f676b,I92013086 into main
* changes:
  mem_sse2.h: loadu_uint32 -> loadu_int32
  mem_sse2.h: storeu_uint32 -> storeu_int32
  vp8: fix some implicit signed -> unsigned conv warnings
  video_source.h,ReallocImage: quiet implicit conv warning
2021-11-10 21:30:07 +00:00
James Zern 888bafc78d vp8 encoder: fix some integer overflows
cap the bitrate to 1000Mbps to avoid many instances of bitrate * 3 / 2
overflowing.

this adds coverage for 2048x2048 in the default test for VP8 with TODOs
for issues at that resolution for VP9 and at max resolution for both.

Bug: b/189602769
Bug: chromium:1264506
Bug: webm:1748
Bug: webm:1749
Bug: webm:1750
Bug: webm:1751
Change-Id: Iedee4dd8d3609c2504271f94d22433dfcd828429
2021-11-08 16:30:16 -08:00
James Zern 16333de289 mem_sse2.h: loadu_uint32 -> loadu_int32
this changes the return to int32_t which matches the type with usage
of this call as input to _mm_cvtsi32_si128(), _mm_set_epi32(), etc.
fixes implicit conversion warning with clang-11 -fsanitize=undefined
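
A minimal sketch of an unaligned 32-bit load that returns int32_t, so the value can be passed to an intrinsic taking int without an implicit sign conversion. The name load_unaligned_i32 is hypothetical and the body is an assumption about the general technique, not a copy of mem_sse2.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads 4 bytes from a possibly unaligned address.
 * memcpy sidesteps alignment and aliasing problems; the signed return
 * type matches consumers that take an int. */
static int32_t load_unaligned_i32(const void *src) {
  int32_t v;
  assert(src != NULL);
  memcpy(&v, src, sizeof(v));
  return v;
}
```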

Change-Id: I1425f12d4f79155dd5d7af0eb00fbdb9f1940544
2021-11-08 13:43:09 -08:00
James Zern 2e73da326a mem_sse2.h: storeu_uint32 -> storeu_int32
this changes the parameter to int32_t which matches the type with usage
of this call using _mm_cvtsi128_si32() as a parameter. quiets an
implicit conversion warning with clang-11 -fsanitize=undefined

Change-Id: I1e9e9ffac5d2996962d29611458311221eca8ea0
2021-11-08 13:43:09 -08:00
James Zern 1676cddaaa vp8: fix some implicit signed -> unsigned conv warnings
and vice-versa mostly when dealing with bitmasks

w/clang-11 -fsanitize=undefined

Change-Id: I6d8f676bf87679ba1dad9cb7f55eea172103d9d3
2021-11-08 13:43:09 -08:00
James Zern 40c21ff6fe video_source.h,ReallocImage: quiet implicit conv warning
with -fsanitize=undefined

test/video_source.h:194:33: runtime error: implicit conversion from type
'int' of value -32 (32-bit, signed) to type 'unsigned int' changed the
value to 4294967264 (32-bit, unsigned)

Change-Id: I92013086d517fecf01c9e4cdfe6737b8ce733a1f
2021-11-08 13:43:09 -08:00
James Zern 23796337ce vp8,calc_pframe_target_size: fix integer overflow
this is similar to the fix for calc_iframe_target_size:
5f345a924 Avoid overflow in calc_iframe_target_size

Bug: chromium:1264506
Change-Id: I2f0e161cf9da59ca0724692d581f1594c8098ebb
2021-11-08 13:09:21 -08:00
James Zern 69d08cb9d3 vp8_update_rate_correction_factors: fix integer overflow
the intermediate value in the correction_factor calculation may exceed
integer bounds

Bug: b/189602769
Change-Id: I75726b12f3095663911d78333f3ea26eb6dee21e
2021-11-08 13:04:09 -08:00
James Zern ca93fc740e update libwebm to libwebm-1.0.0.28-20-g206d268
picks up Android.mk license updates from AOSP and fixes as part of the
1.0.0.28 release

changelog:
https://chromium.googlesource.com/webm/libwebm/+log/37d9b86..206d268

Change-Id: I18d5238f7d1aff2678d903018929da952410fa0e
2021-11-04 16:41:33 -07:00
James Zern fd2fcaecb8 Merge "update tools/cpplint.py" into main 2021-11-04 20:01:01 +00:00
James Zern f3b95b1f56 update tools/cpplint.py
https://github.com/google/styleguide.git
100755 blob 4a82bde4f95cef8103520bc2c019483397ec51f4    cpplint/cpplint.py

Bug: aomedia:3178
Change-Id: I9e11d647096fc2082b18d74731026dabb52639bb
2021-11-03 16:23:06 -07:00
James Zern dd10ac8f69 tools_common.h: add VPX_TOOLS_FORMAT_PRINTF
and use it to set the format attribute for printf like functions. this
allows the examples to be built with -Wformat-nonliteral without
producing warnings.
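
A rough sketch of how such a macro is commonly built, assuming a GCC-compatible compiler. EXAMPLE_FORMAT_PRINTF and format_msg are hypothetical names; the actual VPX_TOOLS_FORMAT_PRINTF definition is not reproduced here.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical macro: marks a function as printf-like so the compiler
 * checks callers' format strings; expands to nothing elsewhere. */
#if defined(__GNUC__)
#define EXAMPLE_FORMAT_PRINTF(fmt_idx, first_arg) \
  __attribute__((format(printf, fmt_idx, first_arg)))
#else
#define EXAMPLE_FORMAT_PRINTF(fmt_idx, first_arg)
#endif

static int format_msg(char *buf, size_t size, const char *fmt, ...)
    EXAMPLE_FORMAT_PRINTF(3, 4);

/* Forwards a non-literal format string to vsnprintf(); the attribute on
 * the declaration is what keeps -Wformat-nonliteral quiet here. */
static int format_msg(char *buf, size_t size, const char *fmt, ...) {
  va_list ap;
  int n;
  va_start(ap, fmt);
  n = vsnprintf(buf, size, fmt, ap);
  va_end(ap);
  return n;
}
```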

Bug: webm:1744
Change-Id: I26b4c41c9a42790053b1ae0e4a678af8f2cd1d82
Fixed: webm:1744
2021-11-02 17:21:56 -07:00
James Zern 340f60524f vpx_codec_internal.h: add LIBVPX_FORMAT_PRINTF
and use it to set the format attribute for the printf like function
vpx_internal_error(). this allows the main library to be built with
-Wformat-nonliteral without producing warnings; the examples will be
handled in a followup.

Bug: webm:1744
Change-Id: Iebc322e24db35d902c5a2b1ed767d2e10e9c91b9
2021-11-02 17:21:30 -07:00
Matt Oliver b78dc572e8 project: Update for 1.11.0 merge. 2021-10-20 06:49:33 +11:00
Matt Oliver a67686b374 Merge commit '626ff35955c2c35b806b3e0ecf551a1a8611cdbf' 2021-10-20 06:47:02 +11:00
Matt Oliver 97257813a2 project: Correctly export DATA types. 2021-10-20 04:32:25 +11:00
James Zern c56ab7d0c6 Merge "vp8_yv12_realloc_frame_buffer: move allocation check" into main 2021-10-15 20:08:58 +00:00
James Zern e259e6951d test/Android.mk: import LICENSE indicators from AOSP
https://android-review.googlesource.com/c/platform/external/libvpx/+/1853628
https://android.googlesource.com/platform/external/libvpx/+/e40f8afb1e51d3bd13d662c1881e3cfb616fa2b8

Change-Id: I15f185ab7c7661f4456c4ad7296fdda01dfb8d53
2021-10-12 11:57:39 -07:00
James Zern 73400ca025 Merge "Android.mk: import LICENSE indicators from AOSP" into main 2021-10-11 20:32:51 +00:00
James Zern 9039995e94 Android.mk: import LICENSE indicators from AOSP
https://android-review.googlesource.com/c/platform/external/libvpx/+/1588942
https://android.googlesource.com/platform/external/libvpx/+/099828b5c770ef8630741721be4b6c25a8394204

Change-Id: Ieca1c882f82bcbc7546944b43af7fab358f925d2
2021-10-09 10:33:37 -07:00
James Zern 27b8a778bd vp8_yv12_realloc_frame_buffer: move allocation check
to before the memset used under msan to avoid any spurious reports in
OOM conditions

Change-Id: I0c4ee92829bbcb356e94f503a4615caf891bb49d
2021-10-08 16:26:26 -07:00
Jerome Jiang f644f5b75d Merge branch 'smew' into main
Bug: webm:1732

Change-Id: Id782a897d8005d316dc5b72859657c219edabf30
2021-10-07 10:48:36 -07:00
Jerome Jiang 626ff35955 Update AUTHORS and version info in libs.mk
Bug: webm:1732
Change-Id: I29ce77c7d02bd2f5cb0ef8412333df032744b668
2021-10-06 10:41:19 -07:00
James Zern 2ea1b908d8 {vp8,vp9}_set_roi_map: fix validation with INT_MIN
previously ranges were checked with abs() whose behavior is undefined
with INT_MIN. this fixes a crash when the original value is returned and
is later used as an offset into a table.
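
The pitfall can be avoided by comparing against both bounds directly instead of calling abs(). The sketch below is illustrative only; delta_in_range is a hypothetical helper and the bound used in the note is an example value, not a quote from the library.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: true when -range <= delta <= range. No abs() call
 * is involved, so delta == INT_MIN is handled without undefined
 * behavior. Assumes range is non-negative. */
static int delta_in_range(int delta, int range) {
  assert(range >= 0);
  return delta >= -range && delta <= range;
}
```

For example, delta_in_range(INT_MIN, 63) is false, whereas abs(INT_MIN) is undefined and may return INT_MIN unchanged.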

Bug: webm:1742
Change-Id: I345970b75c46699587a4fbc4a059e59277f4c2c8
2021-10-04 11:46:48 -07:00
Jerome Jiang 7aabd69682 Merge changes If2ef4400,I345970b7 into main
* changes:
  vpx_roi_map: add delta range info
  {vp8,vp9}_set_roi_map: fix validation with INT_MIN
2021-10-04 17:43:28 +00:00
James Zern fe3b58cffa vpx_roi_map: add delta range info
Change-Id: If2ef4400562075b4e7abadc01638a46c0c7f1859
2021-10-01 15:52:43 -07:00
James Zern fccaa5fa7a {vp8,vp9}_set_roi_map: fix validation with INT_MIN
previously ranges were checked with abs() whose behavior is undefined
with INT_MIN. this fixes a crash when the original value is returned and
is later used as an offset into a table.

Bug: webm:1742
Change-Id: I345970b75c46699587a4fbc4a059e59277f4c2c8
2021-10-01 15:52:38 -07:00
Marco Paniconi fd70206fa4 Merge "vp8: Condition decimation drop logic on drop_frames_allowed" into main 2021-10-01 22:25:12 +00:00
Marco Paniconi 167de33ca8 vp8: Condition decimation drop logic on drop_frames_allowed
This allows user to make sure frame will be encoded
when drop_frames is set off (on the fly), no matter
the state of the buffer.

Change-Id: Ia7b39b93fe3721dd586bdbede72c525db87b6890
2021-10-01 13:19:29 -07:00
Marco Paniconi f8733b3fb7 vp8: For screen mode: clip buffer from below
Condition already existed for screen content mode,
but only when frame-dropper was off. Remove the
frame drop condition.

Change-Id: Ie7357041f5ca05b01e78b4bd3b40da060382591b
2021-10-01 11:58:37 -07:00
Jerome Jiang 16837ae168 CHANGELOG for Smew v1.11.0
Bug: webm:1732
Change-Id: I6038f401cf1dfdcaca85b81d0b8b2c04967b44dd
2021-09-28 15:00:17 -07:00
Jerome Jiang d00e68ad87 Cap duration to avoid overflow
Bug: webm:1728
Change-Id: Id13475660fa921e8ddcc89847e978da4c8d85886
(cherry picked from commit 09775194ff)
2021-09-27 15:18:25 -07:00
Wan-Teh Chang 5df4195b43 Define the VPX_NO_RETURN macro for MSVC
Define VPX_NO_RETURN as __declspec(noreturn) for MSVC. See
https://docs.microsoft.com/en-us/cpp/cpp/noreturn?view=msvc-160

This requires moving VPX_NO_RETURN before function declarations because
__declspec(noreturn) must be placed there. Fortunately GCC's
__attribute__((noreturn)) can be placed either before or after function
declarations.
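
A minimal sketch of the placement rule, using a hypothetical EXAMPLE_NO_RETURN macro rather than the real VPX_NO_RETURN definition. The annotation goes before the declaration, which both MSVC and GCC accept.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical macro mirroring the idea in this commit. */
#if defined(_MSC_VER)
#define EXAMPLE_NO_RETURN __declspec(noreturn)
#elif defined(__GNUC__)
#define EXAMPLE_NO_RETURN __attribute__((noreturn))
#else
#define EXAMPLE_NO_RETURN
#endif

/* The macro precedes the declaration so it works with __declspec. */
EXAMPLE_NO_RETURN void example_fatal(const char *msg);

void example_fatal(const char *msg) {
  fprintf(stderr, "fatal: %s\n", msg);
  exit(EXIT_FAILURE);
}

/* Because example_fatal() never returns, the compiler knows the
 * division below is only reached with a non-zero denominator. */
static int checked_div(int num, int den) {
  if (den == 0) example_fatal("division by zero");
  return num / den;
}
```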

Change-Id: Id9bb0077e2a4f16ec2ca9c913dd93673a0e385cf
(cherry picked from commit 8a6fbc0b4e)
2021-09-27 15:17:19 -07:00
Jerome Jiang b68877a7eb vp8 rc: Clear system state at the end of calls
Clear system state at the end of rc calls to make sure the state is
consistent before and after

Change-Id: I59fe9c99485b1a8603c20db37961339b7575455f
2021-09-24 14:56:00 -07:00
Jerome Jiang f84650be2d Merge "vp8 rc: support temporal layers" into main 2021-09-23 22:10:57 +00:00
Jerome Jiang 0de415cf6a vp8 rc: support temporal layers
Change-Id: I2c7d5de0e17b072cb763f1659b1badce4fe0b82b
2021-09-23 13:30:49 -07:00
Jerome Jiang b247f6e8e4 Merge "Cap duration to avoid overflow" into main 2021-09-22 17:27:26 +00:00
Jerome Jiang 09775194ff Cap duration to avoid overflow
Bug: webm:1728
Change-Id: Id13475660fa921e8ddcc89847e978da4c8d85886
2021-09-21 10:11:51 -07:00
Jerome Jiang 7366195e5a vp8 rc: explicit cast to avoid VS build failure
Change-Id: I6a4daca12b79cf996964661e1af85aa6e258b446
2021-09-16 10:19:09 -07:00
Wan-Teh Chang 8a6fbc0b4e Define the VPX_NO_RETURN macro for MSVC
Define VPX_NO_RETURN as __declspec(noreturn) for MSVC. See
https://docs.microsoft.com/en-us/cpp/cpp/noreturn?view=msvc-160

This requires moving VPX_NO_RETURN before function declarations because
__declspec(noreturn) must be placed there. Fortunately GCC's
__attribute__((noreturn)) can be placed either before or after function
declarations.

Change-Id: Id9bb0077e2a4f16ec2ca9c913dd93673a0e385cf
2021-09-10 15:54:51 -07:00
Jerome Jiang 65a1751e5b Add vp8 support to rc lib
For 1 layer CBR only.
Support for temporal layers comes later.

Rename the library to libvpxrc

Bug: b/188853141

Change-Id: Ib7f977b64c05b1a0596870cb7f8e6768cb483850
2021-09-10 10:46:26 -07:00
Jerome Jiang ca40ca9bed vp8 rc: always update correction factor
Change-Id: Id40b9cb5a85a15fb313a2a93f14f6768259f7c15
2021-09-08 16:52:51 -07:00
Jerome Jiang ee73384f03 Add codec control for vp8 external rc
disable cyclic refresh

Change-Id: I7905602919d5780831fad840577e97730ce0afc2
2021-09-02 16:20:24 -07:00
Jerome Jiang 59c9e1d87e vp9 rc lib: Allow aq 3 to work for SVC with unit test
Also use round to cast float to int with more accurate calculation to
avoid error accumulation which causes qp to be different after ~290
frames.

Change-Id: Iff65a8fdc67401814fd253dbf148afe9887df97f
2021-08-24 16:09:40 -07:00
James Zern 15a75b4530 Merge "vpx_ports/x86.h: sync with aom_ports/x86.h" into main 2021-07-30 00:48:08 +00:00
Hirokazu Honda f685d508da vp9 rc: Fills VP9_COMP zero at initialization
Change-Id: Ib1a544ce87e8fdbe23c0e54b6426ee228011b126
2021-07-30 02:42:35 +09:00
James Zern 0d1aec7373 vpx_ports/x86.h: sync with aom_ports/x86.h
adds a few comments and makes the file ascii:
854b2766a Replace non-ASCII characters

Change-Id: I6c2d76b293158bcad9f1ded7a91a81bda1e700fb
2021-07-26 16:52:56 -07:00
Peter Kasting fc04a9491e Fix some instances of -Wunused-but-set-variable.
Bug: chromium:1203071
Change-Id: Ieb628f95d676ba3814b5caf8a02a884330928c77
2021-07-26 22:01:01 +00:00
Yunqing Wang 3db0921ec3 Merge "Remove unused old FP_MB_STATS code" into main 2021-07-26 20:13:38 +00:00
Yunqing Wang e050db0b8c Merge "Clean up allow_partition_search_skip code" into main 2021-07-26 19:19:02 +00:00
Yunqing Wang 977e77006e Merge "Disable allow_partition_search_skip feature" into main 2021-07-25 22:42:59 +00:00
Yunqing Wang 0973ac05ba Remove unused old FP_MB_STATS code
Change-Id: I78ac1f8ce1598de295efd2ac1fe8244072d9b501
2021-07-23 22:49:09 -07:00
Yunqing Wang 7c00f0ce18 Clean up allow_partition_search_skip code
Change-Id: Ia05157fc3e613d93f10df5abddd77a740a0005ca
2021-07-23 22:37:01 -07:00
Yunqing Wang cf64eb2805 Disable allow_partition_search_skip feature
This feature was added to help speed up still images and slideshows.
It didn't work anymore, and thus was disabled. Code cleanup will
follow.

This had negligible impact to regular test sets. Borg test result
on ugc360p set at speed 3.
  avg_psnr:  ovr_psnr:  ssim:    speed:
   -0.244    -0.278    -0.153    -0.973

Change-Id: If74edabce0c93be1361e645ffd2eec063c2db76b
2021-07-23 22:36:41 -07:00
Jerome Jiang 1d13d705a5 Merge "Add control to get QP for all spatial layers" into main 2021-07-23 18:20:39 +00:00
Jerome Jiang 4a4ea28a38 Add control to get QP for all spatial layers
Change-Id: I77a9884351e71649c8f8632293d9515c60f6adbc
2021-07-22 17:02:12 -07:00
Jerome Jiang c382caa4d1 Merge "Use round to be more accurate casting float to int" into main 2021-07-22 17:07:58 +00:00
Jerome Jiang cd260eba10 Add cyclic refresh to vp9 rtc external ratecontrol
Change-Id: Ia2a881399aa31ca0f34481b975362ddd4ad87f1c
2021-07-21 10:38:06 -07:00
Jerome Jiang 6b4b82fd7a Use round to be more accurate casting float to int
Change-Id: Ifd5961917831752b176dd75d39d6b2cba6ce72fa
2021-07-20 17:02:38 -07:00
Jerome Jiang 0702d5ab27 Merge "Refactor rtc rate control test" into main 2021-07-19 21:00:35 +00:00
Jerome Jiang f9b565f7ec Refactor rtc rate control test
Remove golden files. Run actual encoding as the ground truth.

Change-Id: I1cea001278c1e9409bb02d33823cf69192c790a4
2021-07-19 12:44:29 -07:00
Bohan Li b1f2532b4d Avoid chroma resampling for 420mpeg2 input
BUG=aomedia:3080

Change-Id: I4ed81abf4b799224085485560f675c10c318cde6
2021-07-15 23:06:56 +00:00
Jerome Jiang 76ad30b6fb Add codec control for rtc external ratectrl lib
This will do 3 things:

Turn off low motion computation
Turn off gf update constraint on key frame frequency
Turn off content mode for cyclic refresh

Those are used to verify the external ratectrl lib works as expected.

Change-Id: Ic6e61498de82d6b3973e58df246cf5e05f838680
2021-07-13 21:51:37 -07:00
Wan-Teh Chang 69fc604636 Check for addition overflows in vpx_img_set_rect()
Check for x + w and y + h overflows in vpx_img_set_rect().

Move the declaration of the local variable 'data' to the block it is
used in.
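
One way to check x + w and y + h without performing an addition that can wrap is shown below. This is a sketch under assumptions: rect_fits is a hypothetical helper, and the real vpx_img_set_rect() may structure its checks differently.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: true when the rectangle (x, y, w, h) lies inside
 * an img_w x img_h image. "x + w <= img_w" is rewritten as
 * "w <= img_w - x", guarded by "x <= img_w", so nothing can overflow. */
static int rect_fits(unsigned int x, unsigned int y, unsigned int w,
                     unsigned int h, unsigned int img_w,
                     unsigned int img_h) {
  return x <= img_w && w <= img_w - x && y <= img_h && h <= img_h - y;
}
```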

Change-Id: I6bda875e1853c03135ec6ce29015bcc78bb8b7ba
2021-07-09 12:48:00 -07:00
Wan-Teh Chang df7dc31cdf Document vpx_img_set_rect() more precisely
Document the side effects and return value of vpx_img_set_rect() more
precisely.

Change-Id: Id1120bc478ff090a70b4ddd23c4798026bbefe10
2021-07-09 12:45:55 -07:00
Yaowu Xu 826d7ca4d9 Merge "Avoid overflow in calc_iframe_target_size" into main 2021-07-08 19:59:34 +00:00
Jerome Jiang 002f14078f Merge "Add codec control to get loopfilter level" into main 2021-07-02 22:59:08 +00:00
Jerome Jiang c64022fa3c Add codec control to get loopfilter level
Change-Id: I70d417da900082160e7ba53315af98eceede257c
2021-07-02 11:33:04 -07:00
James Zern 350b0b47f2 ratectrl_rtc.h: quiet MSVC int64_t->int conv warning
target_bandwidth is int64_t, but layer_target_bitrate[0] is an int. this
is safe in the only place it's set because target_bandwidth defaults to
1000. target_bandwidth is later used to populate the cpi's target, which
is an unsigned int so there may be further fixes/cleanups that can be
done.

Change-Id: I35dbaa2e55a0fca22e0e2680dcac9ea4c6b2815a
2021-07-01 22:20:30 -07:00
Jorge E. Moreira 5f345a9246 Avoid overflow in calc_iframe_target_size
The changed product was observed to attempt to multiply 1800 by 2500000,
which overflows unsigned 32 bits. Converting to unsigned 64 bits first
and testing whether the final result fits in 32 bits solves the problem.
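
The widen-then-check pattern described above might look like this in isolation. mul_fits_u32 is a hypothetical helper written for illustration; it is not the code in calc_iframe_target_size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: multiplies in 64 bits, then reports whether the
 * product fits in 32 bits. *out is written only on success. */
static int mul_fits_u32(uint32_t a, uint32_t b, uint32_t *out) {
  const uint64_t wide = (uint64_t)a * b;
  if (wide > UINT32_MAX) return 0;
  *out = (uint32_t)wide;
  return 1;
}
```

With the operands from the report, 1800 * 2500000 is 4500000000, which exceeds UINT32_MAX, so the helper returns 0 and the caller can clamp instead of wrapping.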

BUG=b:179686142

Change-Id: I5d27317bf14b0311b739144c451d8e172db01945
2021-07-01 20:32:22 -07:00
Marco Paniconi 40e7b4a5be Merge "vp9-rtc: Extract content dependency in cyclic refresh" into main 2021-06-29 18:34:46 +00:00
Cheng Chen 7c5574a635 Merge "Disallow skipping transform and quantization" into main 2021-06-29 16:48:29 +00:00
Matt Oliver 95fbe8c53f project: Update github token. 2021-06-27 14:27:12 +10:00
Cheng Chen fe1c7d2d8c Disallow skipping transform and quantization
The encoder has a feature to skip transform and quantization based
on model rd analysis. It could happen that the model
based analysis lets the encoder skip transform and quantization, while
a bad prediction occurs, leading to bad reconstructed blocks, which
are intrusive and appear to be coding errors.

We add a speed feature to guard the skipping feature.
Due to the risk of bad perceptual quality, we disallow such skipping
by default.

On hdres test set, speed 2, the coding performance difference is 0.025%,
speed difference is 1.2%, which can be considered non significant.

BUG=webm:1729

Change-Id: I48af01ae8dcc7a76c05c695f3f3e68b866c89574
2021-06-25 16:31:58 -07:00
Marco Paniconi 67bfbcfbf7 vp9-rtc: Extract content dependency in cyclic refresh
For usage in the external RC. When content_mode = 0,
the cyclic refresh has no dependency on the content
(motion, spatial variance, motion vectors, etc,).

The content_mode = 0, when compared to content_mode = 1,
on rtc set for speed 7: has some regression on some
clips (~3-5%), but overall/average bdrate loss is
about ~1-2%.

Comparing aq_mode=3 with content_mode = 0, vs aq_mode=3:
about ~14% avg/overall bdrate gain, but has ~3-7% regression
on some hard motion clips (e.g., street).

Change-Id: I93117fabb8f7f89032c15baf1292b201e8c07362
2021-06-25 13:24:12 -07:00
Jerome Jiang bd53f0cc9f Add constructor to VP9RateControlRtcConfig
Also add max_inter_bitrate_pct

Change-Id: Ie2c0e7f1397ca0bb55214251906412cdf24e42e2
2021-06-25 10:56:49 -07:00
Jerome Jiang eebc5cd487 Merge "rc: turn off gf constrain for external RC" into main 2021-06-22 22:13:38 +00:00
Jerome Jiang a00c56373e rc: turn off gf constrain for external RC
Added a new flag in rate control which turns off the gf interval constraint
on key frame frequency for external RC.

It remains on for libvpx.

Change-Id: I18bb0d8247a421193f023619f906d0362b873b31
2021-06-22 13:05:25 -07:00
James Zern 7c1b546457 Merge "test-data.sha1: add missing sha sums" into main 2021-06-22 03:02:58 +00:00
Angie Chiang 49370b9c2a Merge changes I9f0852a0,Ieecb98a7 into main
* changes:
  Add use_simple_encode_api to oxcf
  Fix flaky assertions in SimpleEncode
2021-06-22 01:44:02 +00:00
Angie Chiang 0bb7bb6df8 Add use_simple_encode_api to oxcf
Use this flag to change the encoder behavior when
SimpleEncode APIs are used

BUG=webm:1733

Change-Id: I9f0852a03ff99faa01cdd8eee8ab71718cc58632
2021-06-21 14:39:40 -07:00
Angie Chiang a1fdfbb174 Fix flaky assertions in SimpleEncode
Bug: webm:1731

Change-Id: Ieecb98a7ac19e6291acd5d51432dc6a3789e9552
2021-06-21 14:38:13 -07:00
James Zern 9873d61b25 test-data.sha1: add missing sha sums
for rc_interface_test_one_layer_vbr and
rc_interface_test_one_layer_vbr_periodic_key added in:
1f45e7b07 vp9 rc: add vbr to rtc rate control library

Change-Id: I8bfa3698284c8ff289e830f7b8fa1ca42b752563
2021-06-21 13:33:44 -07:00
Jerome Jiang 4546b5db47 Merge "vp9 rc: add vbr to rtc rate control library" into main 2021-06-18 23:25:53 +00:00
Jerome Jiang 1f45e7b07e vp9 rc: add vbr to rtc rate control library
Change-Id: I3d2565572c2b905966d60bcaa6e5e6f057b1bd51
2021-06-18 14:47:43 -07:00
James Zern 2380e13da8 normalize vp9_calc_[ip]frame declarations and definitions
fixes warnings under visual studio:

vp9\encoder\vp9_ratectrl.c(2012): warning C4028: formal parameter 1
different from declaration
vp9\encoder\vp9_ratectrl.c(2027): warning C4028: formal parameter 1
different from declaration

Change-Id: Ia0740db597fb7a259f90d362b483f58662f9f584
2021-06-18 13:38:03 -07:00
Marco Paniconi 338013712e vp9: Adjust logic for gf update in 1 pass vbr
This reduces some regression when external RC
is used, for which avg_frame_low_motion is not
set/updated (=0).

Change-Id: I2408e62bd97592e892cefa0f183357c641aa5eea
2021-06-17 12:18:49 -07:00
Chunbo Hua 364f0e31fe Initialize VP9EncoderConfig profile and bit depth
Change-Id: I5c42013a08677cdef8d47f348458118338ff0138
2021-06-16 19:30:25 -07:00
Jerome Jiang a945f344e0 Change the data path in svc rate control test
Change-Id: Iba58e2aa2578964b5c8b48ab0acbee9b44bcdada
2021-06-15 14:55:29 -07:00
Marco Paniconi 9a25e3169b vp9-rtc: Refactor 1 pass vbr rate control
This refactoring is needed to allow the
RC_rtc library to support VBR.

Change-Id: I863a4a65096fed06b02307098febf7976360e0f3
2021-06-14 16:11:23 -07:00
James Zern d85c54d4e8 Update some comments for rc_target_bitrate
this mirrors the change from libaom:
5b150b150 Update some comments for rc_target_bitrate

Change-Id: Iaabee5924e0320609a29dc8ab71327923fb4c5d2
2021-06-11 16:34:41 -07:00
James Zern 5d678fe78a simple_encode: fix some -Wsign-compare warnings
Bug: webm:1731
Change-Id: I1db777c0c3a8784fb3dcf7cd39f78ebf833ab915
2021-06-09 15:08:15 -07:00
James Zern 71d09c34ff simple_encode_test: fix input file path
this allows the file to be located in LIBVPX_TEST_DATA_PATH similar to
other test sources.

Bug: webm:1731
Change-Id: I51606635d91871e7c179aa8d20d4841b0d60b6ad
2021-06-09 14:54:35 -07:00
Cheng Chen 463d33145d L2E: properly init two pass rc parameters
Two pass rc parameters are only initialized in the second pass
in vp9 normal two pass encoding.
However, the simple_encode API queries the keyframe group, arf group,
and number of coding frames without going through the two pass
route.
Since recent libvpx rc changes, parameters in the TWO_PASS
struct have a great influence on the determination of the above
information.
We therefore need to properly init two pass rc parameters in
the simple_encode related environment.

Change-Id: Ie14b86d6e7ebf171b638d2da24a7fdcf5a15c3d9
2021-06-07 17:41:23 -07:00
Cheng Chen b8273e8ae5 Fix simple encode
Properly init and delete cpi struct in simple encode functions.

Change-Id: I6e66bcac852cbb3dec9b754ba3fb01a348ac98b8
2021-05-26 15:17:18 -07:00
Chunbo Hua d42b93a15f Fixed redundant wording for decoder algorithm interface
Change-Id: Id56e03dc9cf6d4e70c4681896f29893a9b4c76f2
2021-05-26 02:05:25 -07:00
James Zern 76a4e44563 Merge changes I2e86b005,I971c6261,I87fe4dad
* changes:
  Use 'ptrdiff_t' instead of 'int' for pointer offset parameters
  Implement vpx_convolve8_avg_vert_neon using SDOT instruction
  Merge transpose and permute in Neon SDOT vertical convolution
2021-05-25 03:00:47 +00:00
James Zern 2c6a6171a7 Merge "img_alloc_helper: make align var unsigned" 2021-05-25 02:37:05 +00:00
Jonathan Wright dbda032fcf Use 'ptrdiff_t' instead of 'int' for pointer offset parameters
A number of the load/store functions in mem_neon.h use type 'int' for
the 'stride' pointer offset parameter. This causes Clang to generate
the following warning every time these functions are called with a
wider type passed in for 'stride':

warning: implicit conversion loses integer precision: 'ptrdiff_t'
(aka 'long') to 'int' [-Wshorten-64-to-32]

This patch changes all such instances of 'int' to 'ptrdiff_t'.
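
As an illustration of the signature change, a row accessor taking a ptrdiff_t stride could look like the following. pixel_at is a hypothetical scalar helper, not one of the mem_neon.h functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 'stride' is ptrdiff_t, so callers holding a
 * ptrdiff_t can pass it without a narrowing conversion, and
 * row * stride is evaluated at pointer-difference width. Negative
 * strides (bottom-up layouts) also work. */
static uint8_t pixel_at(const uint8_t *base, ptrdiff_t stride, int row,
                        int col) {
  return base[row * stride + col];
}
```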

Bug: b/181236880
Change-Id: I2e86b005219e1fbb54f7cf2465e918b7c077f7ee
2021-05-24 17:12:23 -07:00
Jonathan Wright 35bce9389e Implement vpx_convolve8_avg_vert_neon using SDOT instruction
Add an alternative AArch64 implementation of
vpx_convolve8_avg_vert_neon for targets that implement the Armv8.4-A
SDOT (signed dot product) instruction.

The existing MLA-based implementation of vpx_convolve8_avg_vert_neon
is retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.

Bug: b/181236880
Change-Id: I971c626116155e1384bff4c76fd3420312c7a15b
2021-05-24 17:12:19 -07:00
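The compile-time selection on __ARM_FEATURE_DOTPROD described in these SDOT commits might look roughly like the sketch below. It is not the libvpx convolution code: `dot16_s8` is a hypothetical function that only shows the dispatch pattern, with a plain C loop standing in for the retained MLA-based path.

```c
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>
#endif

/* Hypothetical example (not libvpx code): dot product of two 16-element
 * int8 arrays. */
static int32_t dot16_s8(const int8_t *a, const int8_t *b) {
#if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD)
  /* SDOT path: each 32-bit lane accumulates four int8 products. */
  const int32x4_t acc = vdotq_s32(vdupq_n_s32(0), vld1q_s8(a), vld1q_s8(b));
  return vaddvq_s32(acc); /* AArch64-only add-across-vector reduction */
#else
  /* Fallback for targets without SDOT (or built for AArch32). */
  int32_t sum = 0;
  int i;
  for (i = 0; i < 16; ++i) sum += (int32_t)a[i] * b[i];
  return sum;
#endif
}
```

Both branches return the same value, so callers do not need to know which path was compiled in.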
Jonathan Wright 10823f5468 Merge transpose and permute in Neon SDOT vertical convolution
The original dot-product implementation of vpx_convolve8_vert_neon
used a separate transpose before and after the convolution operation.
This patch merges the first transpose with the TBL permute (necessary
before using SDOT to compute the convolution) to significantly reduce
the amount of data re-arrangement. This new approach also allows for
more effective data re-use between loop iterations.

Co-authored by: James Greenhalgh <james.greenhalgh@arm.com>

Bug: b/181236880
Change-Id: I87fe4dadd312c3ad6216943b71a5410ddf4a1b5b
2021-05-24 17:08:32 -07:00
Jonathan Wright 66c1ff6850 Implement vpx_convolve8_avg_horiz_neon using SDOT instruction
Add an alternative AArch64 implementation of
vpx_convolve8_avg_horiz_neon for targets that implement the Armv8.4-A
SDOT (signed dot product) instruction.

The existing MLA-based implementation of vpx_convolve8_avg_horiz_neon
is retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.

Bug: b/181236880
Change-Id: Ib435107c47c485f325248da87ba5618d68b0c8ed
2021-05-18 13:33:46 -07:00
Jonathan Wright 4808d831db Optimize remaining mse and sse functions in variance_neon.c
Implement sum of squared difference calculations in vpx_mse16x16_neon
and vpx_get4x4sse_cs_neon using the ABD and UDOT instructions -
instead of widening subtracts followed by a sequence of MLAs.

The existing implementation is retained for use on CPUs that do not
implement the Armv8.4-A UDOT instruction. This commit also updates
the variable names used in the existing implementations to be more
descriptive.

Bug: b/181236880
Change-Id: Id4ad8ea7c808af1ac9bb5f1b63327ab487e4b1c7
2021-05-13 15:41:15 -07:00
Jonathan Wright 231aa6ae32 Implement vertical convolution using Neon SDOT instruction
Add an alternative AArch64 implementation of vpx_convolve8_vert_neon
for targets that implement the Armv8.4-A SDOT (signed dot product)
instruction.

The existing MLA-based implementation of vpx_convolve8_vert_neon is
retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.

Bug: b/181236880
Change-Id: Iebb8c77aba1d45b553b5112f3d87071fef3076f0
2021-05-12 14:03:52 -07:00
Jonathan Wright c8b0432505 Implement Neon variance functions using UDOT instruction
Accelerate Neon variance functions by implementing the sum of squares
calculation using the Armv8.4-A UDOT instruction instead of 4 MLAs.

The previous implementation is retained for use on CPUs that do not
implement the Armv8.4-A dot product instructions.

Bug: b/181236880
Change-Id: I9ab3d52634278b9b6f0011f39390a1195210bc75
2021-05-12 14:03:05 -07:00
Jonathan Wright 2db85c269b Use ABD and UDOT to implement Neon sad_4d functions
Implementing sad16_neon using ABD, UDOT instead of ABAL, ABAL2 saves
a cycle and removes resource contention for a single SIMD pipe on
modern out-of-order Arm CPUs. The UDOT accumulation into 32-bit
elements also allows for a faster reduction at the end of each SAD
function.

The existing implementation is retained for CPUs that do not
implement the Armv8.4-A UDOT instruction, and CPUs executing in
AArch32 mode.

Bug: b/181236880
Change-Id: Ibd0da46e86751d2f808c7b1e424f82b046a1aa6f
2021-05-10 15:20:29 +01:00
Jonathan Wright 0f563e5fad Optimize Neon reductions in sum_neon.h using ADDV instruction
Use the AArch64-only ADDV and ADDLV instructions to accelerate
reductions that add across a Neon vector in sum_neon.h. This commit
also refactors the inline functions to return a scalar instead of a
vector - allowing for optimization of the surrounding code at each
call site.

Bug: b/181236880
Change-Id: Ieed2a2dd3c74f8a52957bf404141ffc044bd5d79
2021-05-09 20:12:48 +01:00
James Zern 43df64a9ac img_alloc_helper: make align var unsigned
quiets an integer sanitizer warning:
vpx/src/vpx_image.c:101:25: runtime error: implicit conversion from
type 'int' of value -2 (32-bit, signed) to type 'unsigned int' changed
the value to 4294967294 (32-bit, unsigned)

Change-Id: Ifeac31cc80811081c1ba10aadaa94dc36cd46efa
2021-05-07 19:35:25 -07:00
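A sketch of why an unsigned alignment mask avoids the conversion reported by the sanitizer. `align_up` is a hypothetical helper, not the code in vpx_image.c, and it assumes `shift` is smaller than the bit width of `unsigned int`.

```c
/* Hypothetical helper (not libvpx code): rounds `value` up to a multiple
 * of 2^shift. With a signed mask, ~mask is a negative int (for example -2)
 * that is implicitly converted to unsigned when combined with an unsigned
 * operand; keeping the mask unsigned removes that conversion. */
static unsigned int align_up(unsigned int value, unsigned int shift) {
  const unsigned int mask = (1u << shift) - 1u;
  return (value + mask) & ~mask;
}
```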
Jonathan Wright f7364c0574 Manually unroll the inner loop of Neon sad16x_4d()
Manually unrolling the inner loop is sufficient to stop the compiler
getting confused and emitting inefficient code.

Co-authored by: James Greenhalgh <james.greenhalgh@arm.com>

Bug: b/181236880
Change-Id: I860768ce0e6c0e0b6286d3fc1b94f0eae95d0a1a
2021-05-07 11:49:37 -07:00
Jonathan Wright a28d43658e Optimize Neon SAD reductions using wider ADDP instruction
Implement AArch64-only paths for each of the Neon SAD reduction
functions, making use of a wider pairwise addition instruction only
available on AArch64.

This change removes the need for shuffling between high and low
halves of Neon vectors - resulting in a faster reduction that requires
fewer instructions.

Bug: b/181236880
Change-Id: I1c48580b4aec27222538eeab44e38ecc1f2009dc
2021-05-07 11:49:26 -07:00
James Zern 7ef83148cf Merge "Implement horizontal convolution using Neon SDOT instruction" 2021-05-05 19:57:10 +00:00
Jonathan Wright c1f77a3689 Implement horizontal convolution using Neon SDOT instruction
Add an alternative AArch64 implementation of vpx_convolve8_horiz_neon
for targets that implement the Armv8.4-A SDOT (signed dot product)
instruction.

The existing MLA-based implementation of vpx_convolve8_horiz_neon is
retained and used on target CPUs that do not implement the SDOT
instruction (or CPUs executing in AArch32 mode). The availability of
the SDOT instruction is indicated by the feature macro
__ARM_FEATURE_DOTPROD.

Co-authored by: James Greenhalgh <james.greenhalgh@arm.com>

Change-Id: I5337286b0f5f2775ad7cdbc0174785ae694363cc
2021-05-05 16:07:18 +01:00
James Zern 12a1491394 vp9_denoiser_neon,horizontal_add_s8x16: use vaddlv w/aarch64
this reduces the number of instructions to compute the sum

Change-Id: Icae4d4fb3e343d5b6e5a095c60ac6d171b3e7d54
2021-05-04 12:28:13 -07:00
James Zern abc7105acd test.mk: enable vp9_denoiser_test w/NEON
this file uses GTEST_ALLOW_UNINSTANTIATED_PARAMETERIZED_TEST so it's
safe to enable unconditionally. the filter check fell out of sync with
the code, there's a sse2 and neon implementation for the filter.

Change-Id: I2a3336ccef3fb524ca5d9b8f88279240c9a276aa
2021-05-04 12:28:01 -07:00
Paul Wilkins 2eb934d9c1 Merge "Add assert for zero_motion_factor range" 2021-04-29 16:31:52 +00:00
Paul Wilkins e14026ac21 Add assert for zero_motion_factor range
Change clamp to an assert so we are warned if changes to input
ranges or defaults in the future lead to an invalid value.

Change-Id: Idb4e0729f477a519bfff3083cdce3891e2fc6faa
2021-04-29 11:12:40 +01:00
Cheng Chen 4ec84326cc Bump ABI version
Due to recent changes to command line options for rate control
parameters.

Change-Id: I1de7cb4ff2850a3ed19ec216dd9d07f64a118e92
2021-04-28 13:54:07 -07:00
James Zern c4fceb9c70 Merge changes Iebe9842f,I174b67a5,I80ed1a16
* changes:
  vpx_convolve_neon: prefer != 0 to > 0 in tests
  vpx_convolve_avg_neon: prefer != 0 to > 0 in tests
  vpx_convolve_copy_neon: prefer != 0 to > 0 in tests
2021-04-28 17:30:48 +00:00
James Zern 55643371e1 Merge "vp8: enc: Fix valid range for under/over_shoot pct" 2021-04-28 02:27:54 +00:00
James Zern ff67c84811 vpx_convolve_neon: prefer != 0 to > 0 in tests
this produces better assembly code; the horizontal convolve is called
with an adjusted intermediate_height where it may over process some rows
so the checks in those functions remain.

Change-Id: Iebe9842f2a13a4960d9a5addde9489452f5ce33a
2021-04-27 18:04:08 -07:00
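The loop shape preferred in these three commits can be sketched as below. `copy_rows` is a hypothetical scalar stand-in for the Neon functions and assumes the caller guarantees `h > 0`, which is what makes the `!= 0` test safe.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in (not libvpx code): copies h rows of w bytes.
 * Assumes h > 0 on entry, so the counter reaches exactly zero and the
 * loop can test for `!= 0` instead of `> 0`. */
static void copy_rows(const uint8_t *src, ptrdiff_t src_stride, uint8_t *dst,
                      ptrdiff_t dst_stride, int w, int h) {
  do {
    memcpy(dst, src, (size_t)w);
    src += src_stride;
    dst += dst_stride;
  } while (--h != 0);
}
```

Where a function may be called with a row count that overshoots, as the message notes for the horizontal convolve, a `> 0` check is the safer form.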
James Zern 07cf024d4d vpx_convolve_avg_neon: prefer != 0 to > 0 in tests
this produces better assembly code

Change-Id: I174b67a595d7efeb60c921f066302043b1c7d84e
2021-04-27 18:03:30 -07:00
James Zern 5ca8c5f31d vpx_convolve_copy_neon: prefer != 0 to > 0 in tests
this produces better assembly code

Change-Id: I80ed1a165512e941b35a4965faa0c44403357e91
2021-04-27 18:02:35 -07:00
Paul Wilkins 9f57bc4d6c Add limits to Vizier input parameters.
Imposed provisional upper and lower limits to each parameter
that can be adjusted in the Vizier ML experiment.

Also in some cases applied secondary limits on the
range of the final "used" values.

Defaults and limits may well require further tuning after
subsequent rounds of experimentation.

Re-factor get_sr_decay_rate().

Change-Id: I28e804ce3d3710f30cd51a203348e4ab23ef06c0
2021-04-26 16:35:53 +01:00
James Zern c15555c62f sync CONTRIBUTING.md w/libwebm
Change-Id: I63ffea52d079b0d50002526e209ae3fb64811bac
2021-04-23 16:50:48 -07:00
Sreerenj Balachandran adc185feb7 vp8: enc: Fix valid range for under/over_shoot pct
The overshoot_pct & undershoot_pct attributes for rate control
are expressed as a percentage of the target bitrate, so the range
should be 0-100.

Change-Id: I67af3c8be7ab814c711c2eaf30786f1e2fa4f5a3
2021-04-21 11:58:27 -07:00
Paul Wilkins c911c2d9c5 Further normalization of Vizier parameters.
Further changes to normalize the Vizier command line parameters.
The intent is that the default behavior for any given parameter
is signaled by the value 1.0 (expressed on the command line as a
rational).

The final values used in the two pass code are obtained by multiplying
the passed in factors by the default values if use_vizier_rc_params is 1.
Where use_vizier_rc_params is 0 the values are explicitly set to
the defaults.

This patch also changes the default value of each parameter to 1.0
even if not set explicitly. This should ensure safe/default behavior
if the user sets use_vizier_rc_params to 1 but does not set all the
individual parameters.

Change-Id: Ied08b3c22df18f42f446a4cc9363473cad097f69
2021-04-20 17:51:05 +01:00
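The normalization rule in this message (a factor of 1.0 means default behavior, and the factor is only applied when use_vizier_rc_params is 1) can be sketched as follows. The type `rational_t` and the function `resolve_param` are hypothetical names, and the zero-denominator guard is an assumption added for safety rather than something stated in this commit.

```c
/* Hypothetical rational type for a command line factor (not libvpx code). */
typedef struct {
  int num;
  int den;
} rational_t;

/* Returns the value the two pass code would use for one parameter:
 * the default when the Vizier parameters are disabled, otherwise the
 * default scaled by the supplied factor. */
static double resolve_param(int use_vizier_rc_params, rational_t factor,
                            double default_value) {
  if (!use_vizier_rc_params) return default_value;
  /* Assumption: treat a zero denominator as "not set" to avoid dividing
   * by zero. */
  if (factor.den == 0) return default_value;
  return default_value * ((double)factor.num / (double)factor.den);
}
```

With a factor of 1/1 the result equals the default, which matches the stated intent that an unset parameter behaves safely.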
Cheng Chen 665cccfd6c Pass vizier rd parameter values
Add command line options for three rd parameters.
They are controlled by --use_vizier_rc_params, together with
other rc parameters.
If not set from command line, current default values will be used.

Change-Id: Ie1b9a98a50326551cc1d5940c4b637cb01a61aa0
2021-04-14 22:49:51 -07:00
Cheng Chen ba2bfcf2eb Merge "Set vizier rc parameters" 2021-04-14 16:31:40 +00:00
Cheng Chen 1b07aae9e4 Set vizier rc parameters
If --use-vizier-rc-params=1 is passed, the rc parameters are overwritten
by the passed in values. If --use-vizier-rc-params=0, the rc parameters
remain at the default values.

Change-Id: I7a3e806e0918f49e8970997379a6e99af6bb7cac
2021-04-13 15:26:51 -07:00
Paul Wilkins 461c1f1b89 Merge "Removed unused constant" 2021-04-13 19:04:49 +00:00
Paul Wilkins c77a7f6004 Removed unused constant
Deleted #define that is no longer referenced.

Change-Id: If0b132c5a40dd8910f535fffdee7d2d1c7df4748
2021-04-12 13:51:44 +01:00
James Zern 1c792f2991 vpx_image: clear user provided vpx_image_t early
this avoids uninitialized values and potential misuse of them which
could lead to a crash should the function fail

this is the same fix that was applied in libaom:
d0cac70b5 Fix a free on invalid ptr when img allocation fails

Bug: webm:1722
Change-Id: If7a8d08c4b010f12e2e1d848613c0fa7328f1f9c
2021-04-08 17:34:16 -07:00
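The "clear early" pattern in this fix can be sketched as below. `image_t`, `image_init` and `image_free` are hypothetical stand-ins and do not reproduce vpx_image_t or its allocation helper.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a caller-provided image struct. */
typedef struct {
  unsigned char *data;
  int owns_data;
} image_t;

static void image_free(image_t *img) {
  if (img != NULL && img->owns_data) {
    free(img->data);
    img->data = NULL;
    img->owns_data = 0;
  }
}

/* Clears the caller's struct before any check that can fail, so every
 * failure path leaves it zeroed and a later image_free() is harmless. */
static image_t *image_init(image_t *img, size_t size) {
  if (img == NULL) return NULL;
  memset(img, 0, sizeof(*img));
  if (size == 0) return NULL; /* failure: fields are already zero */
  img->data = (unsigned char *)malloc(size);
  if (img->data == NULL) return NULL;
  img->owns_data = 1;
  return img;
}
```

Without the early memset, a failed call would leave whatever stack garbage the caller passed in, and freeing it afterwards could crash.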
Cheng Chen 72dc6478ac Merge "Fix compilation for CONFIG_RATE_CTRL" 2021-04-07 19:03:53 +00:00
Paul Wilkins 7b87b35153 Merge "Delete unused constants." 2021-04-07 12:53:07 +00:00
Paul Wilkins a40cb691b6 Merge "Change zm_factor for Vizier." 2021-04-07 12:52:47 +00:00
Cheng Chen 7a5596fa78 Fix compilation for CONFIG_RATE_CTRL
Recently, some function signatures have been changed.
This change fixes compilation error if --enable-rate-ctrl is used.

Change-Id: Ib8e9cb5e181ba1d4a6969883e377f3dd93e9289a
2021-04-06 20:28:32 -07:00
Cheng Chen 06fee5a89b Adjust end to end psnr value
A recent change leads to a slight difference in encoding results:
d3aaac367 Change calculation of rd multiplier,
which is caught by the Jenkins nightly test.

Adjust the threshold to silence the test failure.

BUG=webm:1725

Change-Id: I7e8b3a26b72c831ae4d88d0fca681b354314739d
2021-04-07 03:15:41 +00:00
Paul Wilkins ab4383063c Change zm_factor for Vizier.
Changes the exposed zm_factor parameter.

This patch alters the meaning of the zm_factor
parameter that will be exposed for the Vizier project.

The previous power factor was hard to interpret in terms
of its meaning and effect and has been replaced by a linear factor.
Given that the initial Vizier results suggested a lower zero motion
effect for all formats, the default impact has been reduced.

The patch as it stands gives a modest improvement for PSNR
but is slightly down on some sets for SSIM.

(overall psnr, ssim % bdrate change: -ve is better)

lowres    -0.111, 0.001
ugc360p   -0.282, -0.068
midres2   -0.183, 0.059
hdres2    -0.042, 0.172

Change-Id: Id6566433ceed8470d5fad1f30282daed56de385d
2021-04-06 21:04:48 +01:00
Paul Wilkins 0d05ca39f2 Delete unused constants.
Delete some #defines that are no longer needed.

Change-Id: I9e4e4df10716598b0d62b0c70f538d4b78a32296
2021-04-06 20:49:30 +01:00
Cheng Chen afe1ba7f3f Pass vizier rc parameter values with range check
This is similar to the change:
https://chromium-review.googlesource.com/c/webm/libvpx/+/2771081
which fails the libvpx nightly test.

Here we add a range check to get rid of the
"divided by zero" warning.

BUG=webm:1723

Change-Id: I7712efe7abd4b11cdb725643d51fd1c0a300d924
2021-04-05 21:27:42 +00:00
Tom Finegan 8b3e575a45 Revert "Pass vizier rc parameter values from command line to twopass"
This reverts commit f32829a2e5.

BUG=webm:1723

Change-Id: I866cdf288f9873c350b32091515a6d5f4df362a3
2021-04-02 09:41:44 -07:00
Paul Wilkins 806faf17f6 Merge "Change calculation of rd multiplier." 2021-04-01 13:55:21 +00:00
James Zern f510019140 Merge "vp9_ext_ratectrl_test: use uintptr_t for void* value" 2021-03-31 22:49:26 +00:00
Paul Wilkins 939e8d291d Merge "Convert Vizier RD parameters to normalized factors" 2021-03-31 16:28:30 +00:00
Paul Wilkins d3aaac367b Change calculation of rd multiplier.
Change the way the rd multiplier is adjusted for Q and frame type.

Previously in VP9 the rd multiplier was adjusted based on crude Q bins
and whether the frame was a key frame or inter frame.

The Q bins create some problems as they potentially introduce
discontinuities in the RD curve. For example, rate rising with a
stepwise increase in Q instead of falling. As such, in AV1 they
have been removed.

A further issue was identified when examining the first round of
results from the Vizier project. Here the multiplier for each Q bin
and each frame type was optimized for a training set, for various video
formats, using single point encodes at the appropriate YT rates.

These initial results appeared to show a trend for increased rd
multiplier at higher Q for key frames. This fits with intuition as in
this encoding context a higher Q indicates that a clip is harder to
encode and frames less well predicted. However, the situation
appeared to reverse for inter frames with higher rd multipliers
chosen at low Q.

My initial suspicion was that this was a result of over fitting, but on
closer analysis I realized that this may be more related to frame type
within the broader inter frame classification. Specifically frames coded
at low Q are predominantly ARF frames, for the mid Q bin there will
likely be a mix of ARF and normal inter frames, and for the high Q bin
the frames will almost exclusively be normal inter frames from difficult
content.

ARF frames are inherently less well predicted than other inter frames
being further apart and not having access to as many prediction modes.
We also know from previous work that ARF frames have a higher
incidence of INTRA coding and may well behave more like key frames
in this context.

This patch replaces the bin based approach with a linear function
that applies a small but smooth Q based adjustment. It also splits
ARF frames and normal inter frames into separate categories.

With this done, the number of parameters that will be exposed for the
next round of Vizier training is reduced from 7 to 3 (one adjustment
factor each for inter, ARF and key frames).

This patch gives net BDRATE gains for our test sets even with the
baseline / default factors as follows: (% BDRATE change in overall
PSNR and SSIM, -ve is better)

LowRes 		-0.231, -0.050
ugc360p		 0.160,  -0.315
midres2		-0.348, -1.170
hdres2		-0.407, -0.691

Change-Id: I46dd2fea77b1c2849c122f10fd0df74bbd3fcc7f
2021-03-31 17:08:04 +01:00
Cheng Chen f32829a2e5 Pass vizier rc parameter values from command line to twopass
Change-Id: I02eabeccf2fe4604875820d38e23c2586a63e290
2021-03-29 21:05:31 -07:00
Cheng Chen c19de35ed2 Add command line options for a few rc parameters
These rate control parameters are for the Vizier experiment.
They are defined as rational numbers.

Change-Id: I23f382dd49158db463b75b5ad8a82d8e0d536308
2021-03-29 20:55:37 -07:00
Matt Oliver 60c38f8a9a project: Update for 1.10.0 merge. 2021-03-29 22:01:45 +11:00
Matt Oliver 57b3cf86b4 Merge commit 'b41ffb53f1000ab2227c1736d8c1355aa5081c40' 2021-03-29 19:37:31 +11:00
James Zern deef895506 vp9_ext_ratectrl_test: use uintptr_t for void* value
this avoids a warning about differences in size between void* and
unsigned int under msvc:
vp9_ext_ratectrl_test.cc(40,3): warning C4312: 'reinterpret_cast':
conversion from 'const unsigned int' to 'void *' of greater size

Change-Id: I5a412ec785ddcaeff2ec71bb83a6048505400293
2021-03-26 10:55:29 -07:00
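The size mismatch behind warning C4312 can be avoided by carrying the opaque value in a pointer-sized integer. The sketch below uses plain C casts rather than the C++ `reinterpret_cast` in the test, and `tag_to_ptr` / `ptr_to_tag` are hypothetical helpers.

```c
#include <stdint.h>

/* Hypothetical helpers (not libvpx code): uintptr_t is wide enough to
 * hold a void *, so converting between the two never widens or truncates,
 * unlike `unsigned int` on 64-bit MSVC targets. */
static void *tag_to_ptr(uintptr_t tag) { return (void *)tag; }

static uintptr_t ptr_to_tag(void *ptr) { return (uintptr_t)ptr; }
```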
Jerome Jiang d55cab425d Merge tag 'v1.10.0'
Release v1.10.0 Ruddy Duck

2021-03-09 v1.10.0 "Ruddy Duck"

  This maintenance release adds support for darwin20 and new codec controls, as
  well as numerous bug fixes.

  - Upgrading:

    New codec control is added to disable loopfilter for VP9.

    New encoder control is added to disable feature to increase Q on overshoot
    detection for CBR.

    Configure support for darwin20 is added.

    New codec control is added for VP9 rate control. The control ID of this
    interface is VP9E_SET_EXTERNAL_RATE_CONTROL. To make VP9 use a customized
    external rate control model, users will have to implement each callback
    function in vpx_rc_funcs_t and register them using libvpx API
    vpx_codec_control_() with the control ID.

  - Enhancement:

    Use -std=gnu++11 instead of -std=c++11 for c++ files.

  - Bug fixes:

    Override assembler with --as option of configure for MSVS.

    Fix several compilation issues with gcc 4.8.5.

    Fix to resetting rate control for temporal layers.

    Fix to the rate control stats of SVC example encoder when number of spatial
    layers is 1.

    Fix to reusing motion vectors from the base spatial layer in SVC.

    2 pass related flags removed from SVC example encoder.

Bug: webm:1712

Change-Id: I4d807da7aee5a4d9d7a7af66b927983622e9cefa
2021-03-24 16:50:50 -07:00
Paul Wilkins e37ee40f7e Convert Vizier RD parameters to normalized factors
This patch converts the Vizier custom RD multipliers to factors
that adjust each RD multiplier either side of its default value, where
a factor of 1.0 will give the previous default behavior.

Ultimately I would like to replace the multiple RD multipliers
triggered at different Q thresholds (eg, low, medium, high q)
with a function that adjusts the rd behavior smoothly as Q
changes.

Vizier could then be presented with a single adjustment control
for each of key frame and inter frame rd.

The current behavior is problematic.

Firstly, having hard threshold Q values at which rd behavior changes
may cause anomalies in the rate distortion curve, where in some
situations, raising Q, for example, may not cause the expected drop
in rate and rise in distortion, because we have crossed a threshold
where the rate distortion multiplier changes sharply and this alters
the balance of bits spent in the prediction and residual parts of the
signal.

Having a single value that is used for a range of Q index values
(eg 0-64), (65-128) may also cause problems and over-fitting in
the context of the Vizier ML project. This project tries to optimize
the values for each Q range, for various YT formats, but does so
by analyzing the results of single point encodes on a set of clips.
For a given format all the clips are encoded with the same parameters
(target rate etc) so there is likely to be clustering in regards to the
Q values used. For example the training set may give a new value
for the Q range 0-64 but most of the data points used may have Q
close to 64.

It will likely require several iterations working with the Vizier team
to get this right. This patch just gives an initial framework for
testing.

Change-Id: Iaa4cd5561b95a202bcae7a1d876c4f40ef444fa2
2021-03-23 15:05:32 +00:00
Paul Wilkins 90c1cc6515 Merge "Change SR_diff calculation and representation" 2021-03-19 19:44:38 +00:00
Adam B. Goode b41ffb53f1 Msvc builds convert to windows path w/msys env
Bug: webm:1720
Change-Id: I56689ad408f8086c511e1711dfa9c8d404727b2e
(cherry picked from commit 04086a3066)
2021-03-18 12:59:46 -07:00
Paul Wilkins b5e754a840 Change SR_diff calculation and representation
This patch changes the way prediction decay is calculated.

We expect that frames that are further from an ALT-REF frame (or Golden
Frame) will be less well predicted by that ALT-REF frame. As such it is
desirable that they should contribute less to the boost calculation used
to assign bits to the ALT_REF.

This code looks at the reduction in prediction quality between the last
frame and the second reference frame (usually two frames old). We make
the assumption that we can accumulate this to get a proxy for the likely
loss of prediction quality over multiple frames.

Previously the calculation looked at the absolute difference in the
coded errors. The issue here is that the meaning of a unit difference
is not the same for very complex frames as it is for easy frames.

In this patch we scale the decay value based on how the error difference
compares to the overall frame complexity as represented by the intra
coding error.

This was tuned experimentally to give test results that
were approximately neutral for our various test sets. There was
a slight drop in Overall PSNR but a consistent improvement in
SSIM. This balance may be improved with further tuning as it is
noteworthy that it was much better on the hd_res set.

Results (Overall PSNR, SSIM; -ve is better) are as follows:

low_res    0.173, -0.688
ugc360     0.118, -0.153
midres2    0.132, -0.239
ugc480P    0.261, -0.405
hd_res    -0.305, -1.109

As part of this adjustment the contribution of motion amplitude was
removed.

This patch also changes the control mechanism that will be exposed
on the command line for use by the Vizier project. The control is now
a linear factor which defaults to 1.0, where values < 1.0 mean a lower
decay rate and values > 1.0 mean an increased decay rate.

This presents a more easily understandable interface for use in
optimizing the decay behavior for various formats, where it is clear
what a passed in value means relative to the default.

With the new decay mechanism the current values for various formats
are almost certainly wrong and we still need to define sensible upper
and lower bounds for use during future training.

Change-Id: Ib1074bbea97c725cdbf25772ee8ed66831461ce3
2021-03-18 14:33:44 +00:00
Adam B. Goode 04086a3066 Msvc builds convert to windows path w/msys env
Bug: webm:1720
Change-Id: I56689ad408f8086c511e1711dfa9c8d404727b2e
2021-03-17 16:49:53 -07:00
James Zern cb0d8ce313 Merge "vp8: restrict 1st pass cpu_used range" 2021-03-12 19:44:42 +00:00
Marco Paniconi 973726c38b vp9-rtc: Add postencode_drop control to sample encoder
Change-Id: I1c989f26b0a7b9239adf37df8d96776f33b89a8b
2021-03-12 09:54:59 -08:00
Jerome Jiang 24b43c4ea5 Prepare for v1.10.0 release.
Update CHANGELOG, AUTHORS, README, libs.mk

Bug: webm:1712
Change-Id: Ic99de12b91a92c32f8a9485dcb759c48bc3eccd6
2021-03-11 20:37:39 -08:00
Paul Wilkins cbc4ead586 Vizier: Added in experimental max KF boost values.
Added the experimental max per frame KF boost values derived from
the Vizier experiments.

These are still all off by default.

When enabled I expect these to cause significant regression as they
fluctuate wildly and in a way that makes no sense from format to format.

I suspect these values reflect over fitting, perhaps from a subset of
training clips with more frequent mid chunk key frames and/or short key
frame groups.

Also fixed incorrect value for gf boost for one format.

Experiment to moderate these values and use different values for first
and subsequent KF groups to follow.

Change-Id: Ibeb4268957f2edacdb4549d74930255a22a2fcc5
2021-03-10 14:41:40 +00:00
Paul Wilkins 8851ed5787 Vizier: Add in field for min kf frame boost.
Added kf_frame_min_boost field to hold the minimum per frame
boost in key frame boost calculations. Replaces hard wired value.
To be used in conjunction with and tied to the maximum value.

Change-Id: I67a39ecb3f21b5918512a5ccd9a1b214d7971e45
2021-03-10 14:40:50 +00:00
Paul Wilkins 2823cc4d0d Merge "Vizier: Add defaults for > 1080P" 2021-03-10 14:35:53 +00:00
Cheng Chen b7d9113ea0 Merge "L2E: let vp9 encoder respect external max frame size constraint" 2021-03-10 05:50:50 +00:00
Paul Wilkins fb98bdb723 Merge "Further integration for Vizier." 2021-03-09 17:30:04 +00:00
Paul Wilkins cc3444f01c Vizier: Add defaults for > 1080P
Previous code did not have sensible defaults for larger image formats.

Added defaults for Vizier RD parameters for sizes > 1080P and changed
the first pass parameters for large formats to use the 1080P values.
No value has been supplied for the rd_mult_q_sq_key_high_qp case yet,
so it is set to the old hard wired default value.

If the Vizier parameters were enabled the lack of sensible defaults
caused a large regression for 2K clips in one of our test sets.

Change-Id: I306c0cd76eab00d50880c91fadb5842faf6661ff
2021-03-09 16:43:14 +00:00
Cheng Chen 36013909a5 L2E: let vp9 encoder respect external max frame size constraint
Change-Id: Ib926e694d4bc4675af1435a32f6316a587756380
2021-03-08 14:21:47 -08:00
Paul Wilkins f27c62c5df Further integration for Vizier.
Further integration of Vizier adjustable parameters.

This patch connects up additional configurable two pass rate control
parameters for the Vizier project. This still needs to be connected up
to a command line interface and at the moment should still be using
default values that match previous behavior.

Do not submit until verified that defaults are all working correctly.

Change-Id: If1241c2dba6759395e6efa349c4659a0c345361d
2021-03-08 15:51:07 +00:00
James Touton 56b1a197b2 Check for _WIN32 instead of WIN32.
_WIN32 is predefined for the Windows platform in MSVC, whereas WIN32 is not, and WIN32 is also not defined in the makefiles.

Change-Id: I8b58e42d891608dbe1e1313dc9629c2be588d9ec
2021-03-04 18:43:29 -08:00
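A minimal sketch of the check this commit switches to. `platform_name` is a hypothetical function used only to show the guard; `_WIN32` is the compiler-predefined macro named in the message.

```c
/* Hypothetical example (not libvpx code): _WIN32 is predefined by the
 * compiler for Windows targets, whereas WIN32 is only defined if a
 * header or build flag happens to set it. */
static const char *platform_name(void) {
#if defined(_WIN32)
  return "windows";
#else
  return "other";
#endif
}
```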
Paul Wilkins 3cc45a0522 Merge "Add fields into RC for Vizier ML experiments." 2021-03-04 21:17:28 +00:00
Jerome Jiang a58265f795 Merge "override assembler with --as option on msvs" 2021-03-04 20:30:02 +00:00
Jerome Jiang f7c386bab0 Use -std=gnu++11 instead of -std=c++11
Cygwin and msys2 have stricter compliance requirements for standard C
headers.

Bug: webm:1708
Change-Id: I676b1227b9dd304149e50016468df0f057c6a78f
2021-03-04 11:16:06 -08:00
Jerome Jiang 2570e33ece override assembler with --as option on msvs
Bug: webm:1709
Change-Id: I962a64c00042fe95cc1cd845b187f71ad6cfd1b7
2021-03-03 20:20:49 -08:00
Paul Wilkins d0567bd779 Add fields into RC for Vizier ML experiments.
This patch adds fields into the RC data structure for the Vizier ML
experiments.

The added fields allow control of some extra rate control parameters
and rate distortion.

This patch also adds functions to initialize the various parameters
though many are not yet used / wired in and for now all are set to
default values. Ultimately many will be set through new command
line options.

Change-Id: I41591bb627d3837d2104fb363845adedbddf2e02
2021-03-03 17:07:08 +00:00
Wan-Teh Chang ebefb90b75 Remove comments for removed 'active_map' parameter
Change-Id: I8635f6121e13089c25e201df033d5bc68e2862b4
2021-02-26 18:02:24 -08:00
Jerome Jiang 02392eeccc Remove two pass related code from svc sample encoder.
SVC sample encoder is only supposed to be used for realtime SVC.

Bug: webm:1705
Change-Id: I5c0c3491732db3e148073aaf7f90ee8d662b57b5
2021-02-18 09:28:45 -08:00
James Zern 24bd0733ef vp8_denoiser_sse2_test: disable BitexactCheck w/gcc-8+
this test fails under gcc 8-10, but not with other compilers

Bug: webm:1718
Change-Id: I8c6c7a25c4aaf019a7f91f835a1a2c9a731cfadc
2021-02-05 10:21:39 -08:00
Marco Paniconi 0d8354669a svc: Fix an existing unittest for flexible mode
The flag update_pattern_ was being set to 0
(because it was set before reset) instead of 1.
And the example flexible mode pattern was not setting
non-reference frame on top temporal top spatial.

Change-Id: I8aee56ce13cc4e0d614126592f9d0f691fe527b0
2021-02-03 22:12:25 -08:00
Cheng Chen 5a4cfa9563 Merge "L2E: let external rate control pass in a max frame size" 2021-02-03 23:57:00 +00:00
Marco Paniconi 7cbe65b6f4 Merge "svc: Unittest for ksvc flexible mode with no updates on TL > 0" 2021-02-03 22:30:41 +00:00
Marco Paniconi 158aa20c95 svc: Unittest for ksvc flexible mode with no updates on TL > 0
Catches tsan issue fixed in: 7b93b56

Change-Id: I34b17c289afd0f8691987a1e4afa533f6c7f2806
2021-02-03 13:26:05 -08:00
James Zern 6c7989e676 Merge "vp8_denoiser_sse2_test: use ASSERT instead of EXPECT" 2021-02-03 21:16:01 +00:00
James Zern b3506b3307 vp8_denoiser_sse2_test: use ASSERT instead of EXPECT
when testing block contents to avoid producing unnecessary output on
failure.

Bug: webm:1718
Change-Id: Ie2cf8245ec8c03556549ad1eea65c8bef15a9735
2021-02-03 13:14:50 -08:00
Cheng Chen 557368a8fa L2E: let external rate control pass in a max frame size
And allow the frame to recode when the frame size is larger
than the input max frame size.

If the max frame size is not specified, let vp9 decide whether
to recode. The recode follows vp9's current recoding mechanism.

The rate control api will return the new qindex back to the
external model.

Change-Id: I796fbf713ad50a5b413b0e2501583b565ed2343f
2021-02-03 11:29:06 -08:00
Marco Paniconi 6c5377fd35 Fix to vpx_temporal_svc_encoder
Avoid division by zero.

Change-Id: Icf3f40aa32fe30f42c46417a1437ebe235e3ac96
2021-02-03 10:07:27 -08:00
Elliott Karpilovsky 61edec1efb Relax constraints on Y4M header parsing
Previous parser assumed that the header would not exceed
80 characters. However, with latest FFMPEG changes, the header
of Y4M files can exceed this limit.

New parser can parse an arbitrarily long header, as long as each
tag is 255 characters or fewer.

BUG=aomedia:2876

Change-Id: I9e6e42c50f4e49251dd697eef8036485ad5a1228
2021-01-29 09:52:02 -08:00
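One way to read a header of any length while bounding each tag is to tokenize it one tag at a time. The sketch below works on an in-memory string and is not the libvpx parser: `next_tag` and `MAX_TAG_LEN` are hypothetical names, and only the 255 character limit comes from the message above.

```c
#include <stddef.h>

#define MAX_TAG_LEN 255 /* per-tag limit stated in the commit message */

/* Hypothetical tokenizer (not libvpx code). Copies the next
 * space-delimited tag from *cursor into `tag`, which must hold at least
 * MAX_TAG_LEN + 1 bytes. Returns 1 when a tag was read, 0 at the end of
 * the header line (newline or NUL), and -1 when a tag is too long. */
static int next_tag(const char **cursor, char *tag) {
  const char *p = *cursor;
  size_t len = 0;
  while (*p == ' ') ++p; /* skip separators */
  if (*p == '\n' || *p == '\0') {
    *cursor = p;
    return 0;
  }
  while (*p != ' ' && *p != '\n' && *p != '\0') {
    if (len == MAX_TAG_LEN) return -1; /* tag exceeds the limit */
    tag[len++] = *p++;
  }
  tag[len] = '\0';
  *cursor = p;
  return 1;
}
```

Because only one tag is buffered at a time, the total header length never has to fit in a fixed-size array.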
Elliott Karpilovsky ebb5ffc1d4 Relax constraints on Y4M header parsing
Previous parser assumed that the header would not exceed
80 characters. However, with latest FFMPEG changes, the header
of Y4M files can exceed this limit.

New parser can parse up to ~200 characters. Arbitrary parsing in
future commit.

BUG=aomedia:2876

Change-Id: I2ab8a7930cb5b76004e6731321d0ea20ddf333c1
2021-01-28 11:52:41 -08:00
James Zern deff7ddd27 Merge changes I43d9d477,I8d4661ec
* changes:
  vp9_end_to_end_test: fix compile with gcc 4.8.5
  sad_test: fix compilation w/gcc 4.8.5
2021-01-27 23:39:14 +00:00
Jerome Jiang 768b6b5e0d Merge "svc: turn off use_base_mv on non base layer." 2021-01-27 20:08:47 +00:00
Jerome Jiang f46b66ac83 svc: turn off use_base_mv on non base layer.
Change-Id: I4a9402f468e54c58081c882ed37f59ee0269c0fc
2021-01-27 10:11:43 -08:00
James Zern 987dd3a9be vp9_end_to_end_test: fix compile with gcc 4.8.5
use Values() rather than ValuesIn() with an initializer list as this
version of gcc under CentOS fails to deduce the type:

../third_party/googletest/src/include/gtest/gtest-param-test.h:304:29:
note:   template argument deduction/substitution failed:
../test/vp9_end_to_end_test.cc:346:59: note:   couldn't deduce template
parameter ‘T’
                            ::testing::ValuesIn({ 6, 7, 8 }));

Bug: webm:1690
Change-Id: I43d9d4777fcd74a4f8fa8bdcd9834cdca5e546ff
2021-01-26 18:09:27 -08:00
James Zern bd8dfea54d sad_test: fix compilation w/gcc 4.8.5
use a #define for kDataAlignment as it's used with DECLARE_ALIGNED
(__attribute__((aligned(n)))) and this version under CentOS is more
strict over integer constants:

../vpx_ports/mem.h:18:72: error: requested alignment is not an integer constant
 #define DECLARE_ALIGNED(n, typ, val) typ val __attribute__((aligned(n)))
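
A minimal sketch of the #define approach, assuming GCC or Clang. The macro
has the same shape as the one quoted in the error above; the value 32, the
buffer name and the helper functions are illustrative assumptions, not
taken from sad_test.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the macro quoted in the error message (GCC/Clang syntax). */
#define DECLARE_ALIGNED(n, typ, val) typ val __attribute__((aligned(n)))

/* A macro expands to a plain integer literal before the attribute is
 * parsed, so the compiler always sees an integer constant here.
 * The value 32 is an assumption for illustration. */
#define kDataAlignment 32

DECLARE_ALIGNED(kDataAlignment, static uint8_t, buffer[64]);

/* Returns 1 if the buffer address honours the requested alignment. */
static int buffer_is_aligned(void) {
  return ((uintptr_t)buffer % kDataAlignment) == 0;
}

/* Aborts (in debug builds) if the alignment request was not applied. */
static void verify_buffer_alignment(void) { assert(buffer_is_aligned()); }
```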

Bug: webm:1690
Change-Id: I8d4661ec1c2c1b1522bdc210689715d2302c7e72
2021-01-26 18:09:16 -08:00
Jerome Jiang 14f132648a Merge "Do not reuse mv in base spatial layer if curr buf same as prev." 2021-01-23 02:19:55 +00:00
Jerome Jiang 7b93b56ab9 Do not reuse mv in base spatial layer if curr buf same as prev.
Bug: b/154890543
Change-Id: Iad5791912f781d225e610a61bc13f3dbaef81bb9
2021-01-21 17:09:02 -08:00
Angie Chiang b0050f27e2 Use VPX_CODEC_INVALID_PARAM when ext_ratectrl=NULL
Bug: webm:1716

Change-Id: Ic60c367aabfc03d94816e85476895b988aced5f1
2021-01-20 17:52:35 -08:00
Angie Chiang f57fa3f1df Handle vp9_extrc functions' return status properly
Bug: webm:1716
Change-Id: I204cd3ab35b493759808500b799da3b9e55686d4
2021-01-20 17:52:03 -08:00
Angie Chiang 3aecf4a0ba Merge changes Ib016ab5a,Ie6d63a68,I96b18436,I0b98741d
* changes:
  Add return to vp9_extrc_update_encodeframe_result
  Add status in vp9_extrc_get_encodeframe_decision
  Return status in vp9_extrc_send_firstpass_stats
  Return status in vp9_extrc_create/init/delete
2021-01-21 01:33:09 +00:00
Angie Chiang d49700e25b Add return to vp9_extrc_update_encodeframe_result
Bug: webm:1716
Change-Id: Ib016ab5a49c765971366cc8d2b75bcca3ed5bd0f
2021-01-19 18:54:07 -08:00
Angie Chiang d890579a2e Add status in vp9_extrc_get_encodeframe_decision
Bug: webm:1716
Change-Id: Ie6d63a68539369c51fefefa528e299b00a967e29
2021-01-19 18:54:07 -08:00
Angie Chiang 27f1838519 Return status in vp9_extrc_send_firstpass_stats
Bug: webm:1716

Change-Id: I96b18436c58ed888fcf677097819cc0093b6f41d
2021-01-19 18:54:07 -08:00
Angie Chiang f4fc562489 Return status in vp9_extrc_create/init/delete
Bug: webm:1716

Change-Id: I0b98741db8c639bdddd899fd6ad359da7b916086
2021-01-19 18:54:00 -08:00
James Zern fe1c96d111 {highbd_,}loopfilter_neon.c: quiet -Wmaybe-uninitialized
Seen with arm-linux-gnueabihf-gcc-8 (8.3.0 & 8.4.0)

Without reworking the code or adding an additional branch this warning
cannot be silenced otherwise. The loopfilter is only called when needed
for a block so these output pixels will be set.

BUG=b/176822719

Change-Id: I9cf6e59bd5de901e168867ccbe021d28d0c04933
2021-01-19 18:38:23 -08:00
Elliott Karpilovsky ecbb0e0e2a Relax constraints on Y4M header parsing
Some refactoring and cleanup -- do not count the first 9 bytes against
the header limit. Add a unit test.

BUG=aomedia:2876

Change-Id: Id897d565e2917b48460cc77cd082cec4c98b42cb
2021-01-14 16:18:40 -08:00
Hui Su 576e0801f9 vpxenc: initialize the image object
Otherwise it would cause a problem when calling vpx_img_free() at the end
if no frame is read.

Change-Id: Ide0ed28eeb142d65d04703442cc4f098ac8edb34
2021-01-13 10:51:39 -08:00
Angie Chiang 3a38edea2c Fix show_index in vp9_extrc_encodeframe_decision()
Change-Id: I93bb1fb3c14126d881d3f691d30875a0062e436c
2020-12-17 18:09:55 -08:00
Angie Chiang 67b1d7f174 Correct pixel_count in encode_frame_result
Change-Id: I3270af4f793f8e453e10d1caf8ffa1a8d5d584a7
2020-12-17 17:32:26 -08:00
Matt Oliver c027576d54 project: Split winrt projects. 2020-12-18 00:20:11 +11:00
Hui Su 8ed23d5f7f First pass: skip motion search for intra-only
BUG=webm:1713

Change-Id: Ibad79cf5d12aa913e8c87a31d7d2124c00958691
2020-12-15 22:40:09 -08:00
James Zern 2392fe53ab Merge "configure: add darwin20 cross-compile support" 2020-12-11 19:58:17 +00:00
Gregor Jasny 723fca7dd1 configure: add darwin20 cross-compile support
Change-Id: I91c0e832a6e76172397e97413329fd43edc81c78
2020-12-11 10:17:18 -08:00
Jeremy Leconte ffc179d8bf Fix nullptr with offset.
The error occurs with low resolution when LibvpxVp8Encoder::NumberOfThreads returns 1.

Bug: b:175283098
Change-Id: Icc9387c75f4ac6e4f09f102b3143e83c998c5e38
2020-12-10 18:54:53 +00:00
Angie Chiang ebac57ce92 Fix typos in simple_encode.h
Change-Id: Id83eff6cc12c441ce991fb1a73820d106311cf5e
2020-11-25 12:58:24 -08:00
Angie Chiang 0de1df6bf7 Merge "Revert "Close out file in EndEncode()"" 2020-11-24 03:10:44 +00:00
Angie Chiang 5459c4ab98 Revert "Close out file in EndEncode()"
This reverts commit 7370cecd89.

Reason for revert: I accidentally checked in this CL

Change-Id: I71ff0b98649070df3edd13b98170a7091541057b
2020-11-24 02:55:24 +00:00
Angie Chiang 71c8e0c009 Merge "Close out file in EndEncode()" 2020-11-24 02:49:12 +00:00
Angie Chiang 1b0fc6a5d6 Merge "Refine documentation of vpx_ext_ratectrl.h" 2020-11-24 02:49:01 +00:00
Angie Chiang 53f8c24374 Merge "Allow user to set rc_mode and cq_level in SimpleEncode" 2020-11-24 02:47:28 +00:00
Angie Chiang c341440874 Refine documentation of vpx_ext_ratectrl.h
Bug: webm:1707
Change-Id: Iba04b5292c157e22dd8618a79e8c977ec9fc2199
2020-11-20 17:41:09 -08:00
Angie Chiang 2ccee3928d Allow user to set rc_mode and cq_level in SimpleEncode
Change-Id: If3f56837e2c78a8b0fe7e0040f297c3f3ddb9c8b
2020-11-20 17:40:04 -08:00
Angie Chiang e56e8dcd6f Add gop_index to vpx_ext_ratectrl.h
Bug: webm:1707

Change-Id: I48826d5f3a7cc292825a7f1e30ac6d0f57adc569
2020-11-19 20:15:18 -08:00
Angie Chiang 5b63f0f821 Capitalize VPX_RC_OK / VPX_RC_ERROR
Change-Id: I526bd6a6c2d2095db564f96d63c7ab7ee4dd90ad
2020-11-17 16:50:44 -08:00
Angie Chiang 275c276993 Add doxygen for vpx_rc_funcs_t
Change-Id: If75215d574fe0b075add50154a9eece5d387741a
2020-11-17 16:50:35 -08:00
Angie Chiang a7731ba488 Add doxygen for vpx_rc_config
Bug: webm:1707

Change-Id: I65bab6b2b792653e70cb136a5f9a21796e34b829
2020-11-17 15:23:18 -08:00
Angie Chiang c22a783bea Copy first pass stats documentation from AV1 to VP9
Bug: webm:1707
Change-Id: Iae7eaa9ba681272b70b6dad17cd2247edab6ef79
2020-11-17 15:23:18 -08:00
Angie Chiang ca7a16babc Add doxygen to structs in vpx_ext_ratectrl.h
Bug: webm:1707

Change-Id: Ib5f6b6f143f55e5279e39eb386fcd3340211de59
2020-11-17 15:23:12 -08:00
Angie Chiang a44cf4592a Merge changes I12a72d3a,I1a6c5752
* changes:
  Fix uninitialized warning in resize_test.cc
  Fix the warning of C90 mixed declarations and code
2020-11-17 21:40:55 +00:00
Jerome Jiang b5d77a48d7 Remove condition on copying svc loopfilter flag
Change-Id: Ib37ef0aa3dc0ec73b25332be6d89969093bd7aeb
2020-11-16 14:12:50 -08:00
Angie Chiang 4e7fd0273a Fix uninitialized warning in resize_test.cc
Change-Id: I12a72d3aa57b13dbcbeb037e1deea41529ea4194
2020-11-13 18:17:48 -08:00
Angie Chiang d4453c73ff Fix the warning of C90 mixed declarations and code
Change-Id: I1a6c57525bbe8bf1a97057ecd64985bc23d1df2e
2020-11-13 18:13:14 -08:00
Marco Paniconi 3f7fee29ed Merge "vp9: Allow for disabling loopfilter per spatial layer" 2020-11-13 04:26:34 +00:00
Marco Paniconi 7beafefd16 vp9: Allow for disabling loopfilter per spatial layer
For SVC: add parameter to the control SET_SVC_PARAMS to
allow for disabling the loopfilter per spatial layer.
Note this svc setting will override the setting via
VP9E_SET_DISABLE_LOOPFILTER (which should only be used
for non-SVC).

Add unittest to handle both SVC (spatial or temporal layers)
and non-SVC (single layer) case.

Change-Id: I4092f01668bae42aac724a6df5b6f6a604337448
2020-11-12 11:31:42 -08:00
Cheng Chen b1d704f12a Accumulate frame tpl stats and pass through rate control api
Tpl stats is computed at the beginning of encoding the altref
frame. We aggregate tpl stats of all blocks for every frame of
the current group of picture.

After the altref frame is encoded, the tpl stats is passed through
the encode frame result to external environment.
Change-Id: I2284f8cf9c45d35ba02f3ea45f0187edbbf48294
2020-11-09 13:14:19 -08:00
James Zern 220e4331bd Merge "libs.mk: set LC_ALL=C w/egrep invocations" 2020-10-30 04:57:32 +00:00
Wan-Teh Chang 98919178f4 Merge "Add a comment about bitdeptharg and inbitdeptharg" 2020-10-29 23:33:54 +00:00
James Zern 9ab65c55d9 libs.mk: set LC_ALL=C w/egrep invocations
this guarantees consistent interpretation of the character ranges

BUG=webm:1711

Change-Id: Ia9123f079cc7ac248b9eff4d817e2e103d627b2b
2020-10-29 15:46:21 -07:00
Wan-Teh Chang 8b27a92490 Add a comment about bitdeptharg and inbitdeptharg
Add a comment to vp9_args to point out that bitdeptharg and
inbitdeptharg do not have a corresponding entry in vp9_arg_ctrl_map and
must be listed at the end of vp9_args.

Change-Id: Ic9834ab72599c067156ca5a315824c7f0760824a
2020-10-27 18:01:22 -07:00
James Zern 89ddf6f32a vp9_ext_ratectrl_test: add missing override
for ~ExtRateCtrlTest()

Change-Id: I311a400093c8c1ee2c002ba000d0b33c4fde209f
2020-10-27 17:09:08 -07:00
Jerome Jiang 4c3d05f13e Merge "Add cmd line option to control loopfilter for vpxenc" 2020-10-27 22:33:20 +00:00
Jerome Jiang 8b8b15e086 Add cmd line option to control loopfilter for vpxenc
Change-Id: I4f5e6ce2f1b535a586bdb6c9e55a3d49ebf61af4
2020-10-27 13:25:24 -07:00
Angie Chiang 16154dae71 Download bus_352x288_420_f20_b8.yuv properly
Bug: webm:1707

Change-Id: I6aabad7cdcddf2bc41a0cc7b5cdfd7d9759f9fae
2020-10-26 11:13:56 -07:00
Angie Chiang ee482c87c7 Merge changes I27932c41,I2ff9e54a,I4ebed472
* changes:
  Small changes of vp9_ext_ratectrl_test.cc
  Add ref frame info to vpx_rc_encodeframe_info_t
  Add vpx_rc_status_t
2020-10-21 21:26:34 +00:00
Angie Chiang a207a0f6b9 Small changes of vp9_ext_ratectrl_test.cc
Change-Id: I27932c41a826cd3c10cc7801956cd32e4877133a
2020-10-20 17:32:27 -07:00
Angie Chiang 13aad8bb64 Merge "Add unit test for vp9_ext_ratectrl" 2020-10-21 00:21:51 +00:00
Angie Chiang 9bfdf4a9d0 Add ref frame info to vpx_rc_encodeframe_info_t
Bug: webm:1707

Change-Id: I2ff9e54a9c8ae535628c1c471a2d078652f49a31
2020-10-20 17:14:10 -07:00
angiebird 90271b2201 Add vpx_rc_status_t
Let callback functions in vpx_ext_ratectrl.h
return vpx_rc_status_t

Bug: webm:1707

Change-Id: I4ebed47278b228740f6c73b07aa472787b2617d2
2020-10-20 17:13:59 -07:00
Marco Paniconi 94384b5c68 vp9-rtc: Fix to control for disabling loopfilter
Adding unit test.

Change-Id: Ic3c03fee7e9c2c224d927bb09914551422bdf816
2020-10-20 10:43:58 -07:00
Angie Chiang e94000aa35 Add unit test for vp9_ext_ratectrl
Fix three bugs along the way.
1) Call vp9_extrc_send_firstpass_stats() after vp9_extrc_create()
2) Pass in model pointer in vp9_extrc_create()
3) Free frame_stats buffer in vp9_extrc_delete()

Bug: webm:1707

Change-Id: Ic8bd62c7b4ebd85a7479ae5e4c82d7f6059d782f
2020-10-19 21:49:38 -07:00
angiebird 8bfc920631 Add vp9_extrc_update_encodeframe_result()
Bug: webm:1707

Change-Id: I962ffa23f03b953f7c0dfd81f49dc79d1975bbba
2020-10-15 18:52:05 -07:00
angiebird f71dd6e23e vp9_extrc_get_encodeframe_decision()
Bug: webm:1707

Change-Id: I90a327b97d7158b65767fe3fbfd5f260030e17f5
2020-10-15 18:51:39 -07:00
James Zern 122a74eda7 install vpx_ext_ratectrl.h
fixes encoder detection / compile with installed headers after:
6dba0d0a0 Add callback functions for external_rate_control

Bug: webm:1707
Change-Id: I370d8c94d6f1b8201002a722077ecf6b3d8cede5
2020-10-15 17:11:53 -07:00
angiebird 9857515cd6 Call vp9_extrc_send_firstpass_stats() properly
Change-Id: I28db5010ba647cc91b8c0aa59309d7e953cd1216
2020-10-09 19:09:36 -07:00
angiebird 705bf9de8c Add vpx_rc_frame_stats_t
Change-Id: I496ce13592f71779bb00cc8bbb601835bca8ff09
2020-10-09 19:08:36 -07:00
angiebird e6208a9507 Add vp9_extrc_send_firstpass_stats()
Change-Id: Ia2457b416200a2b2d1558600bff90ac2746cf396
2020-10-09 19:08:29 -07:00
angiebird 20bca1350a Add vp9_extrc_init/create/delete
Change-Id: I9fcb9f4cc5c565794229593fadde87286fcf0ffd
2020-10-09 17:30:54 -07:00
angiebird 6dba0d0a05 Add callback functions for external_rate_control
Change-Id: I20a1179a2131d2cd069dae9076aa2c18b80784f3
2020-10-09 17:30:49 -07:00
angiebird a04f68148f Add codec control for external rate control lib
VP9E_SET_EXTERNAL_RATE_CONTROL
One can assign an external library using the control flag,
VP9E_SET_EXTERNAL_RATE_CONTROL.
The args alongside the control flag should be of type char**.
args[0]: char* points to the path of rate control library
args[1]: char* points to the config of the rate control library.

Change-Id: Iae47362cdfafa00614bac427884bffcf6944c583
2020-10-02 19:31:36 -07:00
angiebird da7c503fe5 Add SetEncodeConfig and DumpEncodeConfigs
Change-Id: Ie6864b1133c26021d9c4883df033ecd2969585ed
2020-10-02 19:29:36 -07:00
Jerome Jiang 7e8ea22e40 Add codec control to disable loopfilter for vp9
Change-Id: I6d693e84570c353d20ec314acea43363956c0590
2020-10-02 12:09:01 -07:00
James Zern d017a63feb Merge "ratectrl_rtc_test.cc: fix signed/unsigned comparison" 2020-10-01 02:40:49 +00:00
Jerome Jiang 037190e55f Merge "Add file for rate control interface test." 2020-09-30 22:17:19 +00:00
Jerome Jiang 73681bf6a4 Add file for rate control interface test.
Change-Id: Id09dc5b653c1e5bb2b02f63579ac776f887ce0eb
2020-09-30 11:39:27 -07:00
James Zern 3c31fb8053 ratectrl_rtc_test.cc: fix signed/unsigned comparison
Change-Id: Id522c12faf4c959f60b5df1b0f7312f14a71720d
2020-09-29 10:58:56 -07:00
James Zern 956d3cac34 configure.sh: fix arm64-darwin-gcc match
after:
979e27c97 configure: add darwin20 support

make the condition more specific by including the trailing -gcc (-*)

Change-Id: I78f481b6c5ad9137e6b6973198e8671e806ee82c
2020-09-29 10:38:41 -07:00
James Zern 979e27c970 configure: add darwin20 support
this release will have arm64 and x86_64 support. in the future it might
be useful to move to mac/iphone targets to help disambiguate
arm64-darwin-gcc and arm64-darwin20-gcc.

Change-Id: I1f8b145303204af316955822f5e8bab51c47f353
2020-09-25 13:21:11 -07:00
James Zern aea631263d Merge "vp9_ratectrl,vp9_resize_one_pass_cbr: rm redundant casts" 2020-09-16 23:48:21 +00:00
James Zern c211b82d08 vp9_ratectrl,vp9_resize_one_pass_cbr: rm redundant casts
avg_frame_bandwidth is an int, quiets a clang-tidy warning

Change-Id: I2a2822652ca6a06e9d1d6d4318f544d419d437e8
2020-09-16 13:43:05 -07:00
James Zern 58d02a8501 test/encode_test_driver: rm redundant get() w/unique_ptr
Change-Id: I3c1ece92ba9f43df4cbaf47109e35aaf0a807d97
2020-09-16 13:40:01 -07:00
Joel Fernandes 97356acb50 vp8: Remove sched_yield on POSIX systems
libvpx does sched_yield() on Linux. This is highly frowned upon these
days mainly because it is not needed and causes high scheduler overhead.

It is not needed because the kernel will preempt the task while it is
spinning which will imply a yield. On ChromeOS, not yielding has the
following improvements:

1. power_VideoCall test as seen on perf profile:

With yield:
     9.40%  [kernel]            [k] __pi___clean_dcache_area_poc
     7.32%  [kernel]            [k] _raw_spin_unlock_irq  <-- kernel scheduler

Without yield:
     8.76%  [kernel]            [k] __pi___clean_dcache_area_poc
     2.27%  [kernel]            [k] _raw_spin_unlock_irq  <-- kernel scheduler

As you can see, the scheduler's share of CPU time drops by about 5
percentage points (from 7.32% to 2.27%).

2. power_VideoCall test results:

There is a 3% improvement on max video FPS, from 30 to 31. This
improvement is consistent.

Also note that the sched_yield() manpage itself says it is intended
only for RT tasks. From the manpage: "sched_yield() is intended for use
with real-time scheduling policies (i.e., SCHED_FIFO or SCHED_RR)
and very likely means your application design is broken."

BUG=b/168205004

Change-Id: Idb84ab19e94f6d0c7f9e544e7a407c946d5ced5c
Signed-off-by: Joel Fernandes <joelaf@google.com>
2020-09-14 20:49:42 +00:00
Sarah Parker 478c70f6d2 googletest: enable failure on uninstantiated tests
Similar to the change in
https://aomedia-review.googlesource.com/c/aom/+/115162.
This currently is a warning, but the tree should be clean now in the
default x86-64 configuration so we can use it to prevent regressions and
find any remaining issues in other configurations.

BUG=b/159031844

Change-Id: I097537ff018668492d37164fdba5edd241dc5dbe
2020-09-10 20:33:02 -07:00
Sarah Parker ea0bc1b321 Upstream GTEST_ALLOW_UNINSTANTIATED_PARAMETERIZED_TEST
BUG=b/159031848

Change-Id: I013770f4e54d0ea92304fa3e9cf4d46f5723f129
2020-09-11 02:25:24 +00:00
Marco Paniconi d1a78971eb vp9-rtc: Add control to disable maxq on overshoot
Add encoder control to disable feature to increase Q
on overshoot detection, for CBR. Default (no usage
of the control) means the feature is internally enabled.

Add the control to the sample encoders, but keep it
disabled as default (set to 0, so feature is on).

Change-Id: Ia2237bc4aaea9770e5080dab20bfff9e3fd09199
2020-08-25 13:06:41 -07:00
Daniel Sommermann c413c8f18e Escape number sign in Makefiles
Number signs are handled differently in Makefile variable parsing as
compared to bash variable parsing. See this demo:

```
$ cat Makefile
A=foo#bar
B='foo#bar'
C="foo#bar"
D=foo\#bar
E='foo\#bar'
F="foo\#bar"

$(info $(A))
$(info $(B))
$(info $(C))
$(info $(D))
$(info $(E))
$(info $(F))

$ make
foo
'foo
"foo
foo#bar
'foo#bar'
"foo#bar"
make: *** No targets.  Stop.

$ make -v
GNU Make 4.2.1
```

In other words, the `#` character is evaluated first when parsing
Makefiles, causing the rest of the line to become a comment. The effect of
this is that paths that contain embedded `#` symbols are not handled
properly in the vpx build system.

To test this change, clone vpx to a directory containing a `#` symbol and
attempt a build. With this change, it worked for me on Fedora 31, however
without the change the build failed.

Change-Id: Iaee6383e2435049b680484cc5cefdea9f2d9df46
2020-08-19 11:56:25 -07:00
James Zern ebbe5b82a0 Merge "Refine MMI & MSA detection for mips" 2020-08-19 02:49:29 +00:00
Marco Paniconi 53747dfe65 vp9-svc: Fix to resetting RC for temporal layers
Fix to reset RC for temporal layers: the
first_spatial_layer_to_encode is usually/default 0,
so the logic to reset for temporal layers was not
being executed. Use VPXMAX(1, ) to make sure all
temporal layers will be reset (when max-q is used
for overshoot).

Change-Id: Iec669870c865420d01d52eab9425cd6c7714eddc
2020-08-18 18:01:53 -07:00
jinbo ea6562f2fc Refine MMI & MSA detection for mips
1. Add a compile check to probe the native ability of the
toolchain to decide whether a feature can be enabled.
2. Add a runtime check to probe the features supported by the CPU.
MSA will be preferred if MSA and MMI are both supported.
3. You can configure and build with the following commands:
./configure --cpu=loongson3a && make -j4

Change-Id: I057553216dbc79cfaba9c691d5f4cdab144e1123
2020-08-19 07:57:21 +08:00
Marco Paniconi 529c29bb0f rtc-vp9: Fix to rcstats in vp9_spatial_svc_encoder
Fixes the rcstats for case when #spatial_layers = 1.

Change-Id: Ie28d99852033307bc4c69c7e738e1d4cab4e8cf5
2020-08-17 21:52:12 -07:00
Jerome Jiang 2d20a42ab6 Merge "Merge remote-tracking branch 'origin/quacking' into master" 2020-08-13 16:31:42 +00:00
James Zern b0aa5f1852 Merge "test/*: use canonical downloads.webmproject url" 2020-08-12 22:35:47 +00:00
angiebird 6ac8a0c9b0 Avoid re-allocating fp_motion_vector_info
Replace fp_motion_vector_info_init() by
fp_motion_vector_info_reset() in first_pass_encode()

Change-Id: Iadacb1ecc4f07435340399564fdd3bfd4ac702f4
2020-08-10 19:53:10 -07:00
angiebird 343c4dca64 Cosmetic changes in simple_encode.h
Change-Id: If7d2711e7f37f00629874914f7c4d2396358e39d
2020-08-10 19:39:55 -07:00
angiebird 7370cecd89 Close out file in EndEncode()
Change-Id: Ib6549f954ce6d5d966eef09a119b46f0cc2f54f7
2020-08-10 16:13:16 -07:00
angiebird 04db83211c Correct the first pass motion vector scale
Change-Id: I005a648f7f9ead9d36a39330dfbb096919affb34
2020-08-10 16:06:49 -07:00
angiebird 7122eea6a4 Cosmetic change for simple_encode_test.cc
Change-Id: I50b4d38f7deceb5b416e72dd944d2ed31e42dafa
2020-08-10 16:06:49 -07:00
angiebird 246a65c696 Make target_frame_bits error margin configurable.
Change-Id: I05dd4d60741743c13951727ce6608acf4224ebec
2020-08-10 15:00:51 -07:00
angiebird d6f2ae2c12 Avoid division by zero for rate q_step model
Change-Id: Ic5709b79131a3969fcb2a0eb3f53994f788b5cc9
2020-08-10 15:00:45 -07:00
angiebird 3ec043a795 Add rq_history to encode_frame_result
Change-Id: Ic2a52dcf5e5a6d57b80d390a2c48ee498e89e7b2
2020-08-07 16:48:08 -07:00
angiebird bb7a2ccc38 Fix ObserveFirstPassMotionVectors()
1) Use kRefFrameTypeNone in the unit test
2) Reset mv_info in fp_motion_vector_info_init
3) Call fp_motion_vector_info_init() in first_pass_encode()
4) Set mv_info for intra frame.
5) Set mv_info with zero mv as default for inter frame
6) Remove duplicated fp_motion_vector_info in encode_frame_info

Change-Id: I2f7db5cd4cf1f19db039c9ce638d17b832f45b6e
2020-08-07 15:48:32 -07:00
James Zern 68e1198375 test/*: use canonical downloads.webmproject url
prefer
https://storage.googleapis.com/downloads.webmproject.org/
to
http://downloads.webmproject.org/

similar to libs.mk

BUG=b/163149610

Change-Id: I6abe0848120849b9512fc5a6122ddc54b5cc2240
2020-08-07 13:32:39 -07:00
angiebird e3ae48b861 Make initial q_index guess at 128
This reduces the average number of recodes per frame from 2.81 to 2.73
when targeting a 15% error margin for the target bitrate per frame.

Change-Id: I58f0be86443643ba23623cb1d522ae41897734a3
2020-08-06 15:45:12 -07:00
angiebird 927fad4847 Correct rq_model_update when recode_count == 1
This will reduce the average number of recodes per frame from
3.19 to 2.81 when targeting a 15% error margin for the
target bitrate per frame.

Change-Id: I28c9ec09a1b1318c09fe5229ccb7e51b32b9dfb9
2020-08-06 15:35:21 -07:00
Angie Chiang ca9a262b1d Merge "Cosmetic changes for rate_ctrl experiment" 2020-08-06 22:21:46 +00:00
angiebird 3e0967af8f Cosmetic changes for rate_ctrl experiment
Change-Id: I133c93c2ad4c824fc97a18de3ac2cb2aedac9013
2020-08-05 13:52:30 -07:00
Cheng Chen f9ab864199 L2E: Add ObserveFirstPassMotionVector
Store motion vectors for each 16x16 block found in the first pass
motion search.
Provide an api "ObserveFirstPassMotionVector()" in SimpleEncode
class, similar to "ObserveFirstPassStats()".

Change-Id: Ia86386b7e4aa549f7000e7965c287380bf52e62c
2020-08-03 22:46:38 -07:00
Angie Chiang 8a8e780b58 Merge "Add recode loop logics for rate_ctrl experiment" 2020-08-04 02:50:55 +00:00
angiebird 566905e91e Add recode loop logics for rate_ctrl experiment
Change-Id: I4de5a38e25d6b0836d90e8fcd0e56d268e5fd838
2020-08-03 17:06:54 -07:00
Matt Oliver 9e126911d4 project: Update for 1.9.0 merge. 2020-08-01 20:57:18 +10:00
Matt Oliver b6b5b70fad Merge commit '6516e974f8c40d0e49b19a4b55b1c98e7432edbb' 2020-08-01 20:32:40 +10:00
Jerome Jiang a3fc027cc9 Merge remote-tracking branch 'origin/quacking' into master
BUG=webm:1686

Change-Id: I3ba5215b3791fc2bb63521d11429087cb2abd5b1
2020-07-31 16:45:08 -07:00
Hui Su bdbf872524 Assign correct values for zcoeff_blk in sub8x8 RDO
This fixes a lossless encoding bug as reported in the issue tracker.
Coding performance change is neutral.

BUG=webm:1700

Change-Id: I0f034b16b57e917e722709a7e9addef864b83d27
2020-07-31 17:40:55 +00:00
Jerome Jiang 6516e974f8 Update CHANGELOG
BUG=webm:1686

Change-Id: I51ecd0fb3da5f0aa36764706f3538d0056fac268
2020-07-30 12:59:15 -07:00
Sreerenj Balachandran 129e0756a5 vp9-svc: Fix the bitrate control for spatial svc
Make sure to initialize the layer context for spatial-svc
which has a single temporal layer.

Change-Id: I026ecec483555658e09d6d8893e56ab62ee6914b
(cherry picked from commit 1e9929390c)
2020-07-30 10:46:27 -07:00
Marco Paniconi cd0aca91b1 vp9-svc: Fix to setting frame size for dynamic resize
For svc with dynamic resize (only for single_layer_svc mode),
add flag to indicate resized width/height has already been set,
otherwise on the resized/trigger frame (resize_pending=1), the
wrong resolution may be set if oxcf->width/height is different
than layer width/height in single_layer_svc mode.

Change-Id: I24403ee93fc96b830a9bf7c66d763a48762cdcb4
(cherry picked from commit de4aedaec3)
2020-07-30 10:46:12 -07:00
Marco Paniconi f0ccdb19f4 vp9-rtc: Fix to resetting drop_spatial_layer
The reset happens on the base spatial layer, before
encoding. But it should be reset on the
first_spatial_layer_to_encode, which may not be 0.

Change-Id: I38ef686b4459ca7433062adbfe32ef2134e1ad60
(cherry picked from commit 769129fb29)
2020-07-30 10:45:53 -07:00
Marco Paniconi 17b2ce11ac vp9-svc: Add svc test for denoiser and dynamic resize
This catches the assert/crash fixed in 5174eb5.

Also fix to only check for dynamic resize in SVC mode
for base temporal layer.

Change-Id: Ie6eb7d233cc43eafb1b78cec4aeb94fb4d7fe11a
(cherry picked from commit 3101666d2a)
2020-07-30 10:45:30 -07:00
Marco Paniconi 2a1d89e278 vp9-svc: Fix to dynamic resize for svc denoising
Fix the logic to allow denoiser reset on resize for SVC mode,
as dynamic resize is allowed for SVC under single_layer mode.

Change-Id: I7776c68dadff2ccbce9b0b4a7f0d12624c2ccf90
(cherry picked from commit 5174eb5b92)
2020-07-30 10:44:52 -07:00
Jerome Jiang b358f9076f NULL -> nullptr in CPP files
This should clean up clangtidy warnings

Change-Id: Ifb5a986121b2d0bd71b9ad39a79dd46c63bdb998
2020-07-27 11:51:04 -07:00
James Zern dbe00bb68b Merge "libs.mk: quiet curl output" 2020-07-23 20:29:05 +00:00
James Zern 859d66fabf libs.mk: quiet curl output
+ fix error return

Change-Id: I48a9ed70fe05df603a49b3c11f813119906fc4fb
2020-07-23 11:50:07 -07:00
Jerome Jiang 7a92a785f2 Silence warnings about uninstantiated test cases
BUG=b/159031848

Change-Id: I6bb88c24bd08e0590ec6b8ebfb696fd9b07ed011
2020-07-23 09:22:43 -07:00
James Zern fbfd3fdfb7 update googletest to release-1.10.0-224-g23b2a3b1
this matches libaom and provides
GTEST_ALLOW_UNINSTANTIATED_PARAMETERIZED_TEST

BUG=webm:1695
BUG=b/159031848

Change-Id: Icdaf61481ab2012dd0e517dd1e600045c937c0dd
2020-07-22 15:54:38 -07:00
Jerome Jiang 42329e5ef6 Update README, AUTHORS and libs.mk
BUG=webm:1686

Change-Id: I307cf79a74ca31ea53554a14f468b0582089aa74
2020-07-21 00:28:38 +00:00
Jerome Jiang 12b5c37672 Merge "Build libsimple_encode.a separately" into quacking 2020-07-20 23:53:54 +00:00
James Zern b79f25b546 Merge "vp8,vpx_dsp: [loongson] fix msa optimization bugs" 2020-07-20 23:44:08 +00:00
angiebird 05e0cd7c4f Build libsimple_encode.a separately
BUG=webm:1689

Change-Id: Id920816315c6586cd652ba6cd1b3a76dfc1f12b7
(cherry picked from commit 56345d256a)
2020-07-20 14:45:50 -07:00
Angie Chiang a4f5a74288 Merge "Build libsimple_encode.a separately" 2020-07-20 21:30:28 +00:00
angiebird 642f6a195d Add init version of EncodeFrameWithTargetFrameBits()
Will add a unit test in a followup CL.

Change-Id: I6a6354f307c427e1a352be7c6421927323eb5e1b
2020-07-20 11:07:45 -07:00
jinbo c2f82351e4 vp8,vpx_dsp: [loongson] fix msa optimization bugs
Fix two bugs reported by clang when enabling msa optimizations:
1. clang does not support the uld instruction.
2. The ulw instruction causes unit tests to core dump.

Change-Id: I171bed11d18b58252cbc8853428c039e2549cb95
2020-07-18 14:06:33 +08:00
angiebird 56345d256a Build libsimple_encode.a separately
BUG=webm:1689

Change-Id: Id920816315c6586cd652ba6cd1b3a76dfc1f12b7
2020-07-17 18:04:09 -07:00
angiebird 16935397ee Add SetEncodeSpeed() to SimpleEncode
Change-Id: I2fcf37045a96bb101de3359e2e69dcc266c1dc10
2020-07-15 14:53:01 -07:00
James Zern f0a7200e30 Merge "test/*: rename *TestCase to TestSuite" into quacking 2020-07-15 20:10:45 +00:00
Jerome Jiang 3e833ddaef Cap target bitrate to raw rate internally
BUG=webm:1685

Change-Id: Ida72fe854fadb19c3745724e74b67d88087eb83c
(cherry picked from commit baefbe85d0)
2020-07-13 12:25:53 -07:00
Jerome Jiang 8c7142d773 Merge "Cap target bitrate to raw rate internally" 2020-07-13 19:24:11 +00:00
Jerome Jiang baefbe85d0 Cap target bitrate to raw rate internally
BUG=webm:1685

Change-Id: Ida72fe854fadb19c3745724e74b67d88087eb83c
2020-07-09 20:25:45 -07:00
James Zern 3e9cb9342d test/*: rename *TestCase to TestSuite
similar to the TEST_CASE -> TEST_SUITE changes in:
83769e3d2 update googletest to v1.10.0

BUG=webm:1695

Change-Id: Ib2bdb6bc0e4ed02d61523f8a8315b017b8ad6dad
(cherry picked from commit 6ee3f3649f)
2020-07-10 02:35:51 +00:00
James Zern bf3fe26f7e Merge "test/*: rename *TestCase to TestSuite" 2020-07-10 02:35:32 +00:00
James Zern 6ee3f3649f test/*: rename *TestCase to TestSuite
similar to the TEST_CASE -> TEST_SUITE changes in:
83769e3d2 update googletest to v1.10.0

BUG=webm:1695

Change-Id: Ib2bdb6bc0e4ed02d61523f8a8315b017b8ad6dad
2020-07-09 16:38:00 -07:00
jinbo 5b7882139c vp8,vpx_dsp:[loongson] fix bugs reported by clang
1. Adjust variable type to match clang compiler.
Clang is more strict on the type of asm operands, float or double
type variable should use constraint 'f', integer variable should
use constraint 'r'.

2. Fix the problem of using an r-value in output operands.
clang reports the error: 'invalid use of a cast in a inline asm context
requiring an l-value: remove the cast or build with -fheinous-gnu-extensions'.

Change-Id: Iae9e08f55f249059066c391534013e320812463e
2020-07-07 09:25:58 +08:00
Marco Paniconi a1cee8dc91 vp9: Update last_q for dropped frames
last_q is used in resize logic, should
always be last Q selected for previous
frame, encoded or dropped.

Change-Id: Ie9019ccf5a9e3acc8456a2e70cc2aa8d1c90236e
2020-07-06 14:06:49 -07:00
Marco Paniconi 9e15c30585 vp9: Fix to use last_q for resize check
For temporal layers, resize is only checked
on the base/TL0 frames, so rc->last_q should be used:
because rc is in the layer context, rc->last_q
corresponds to the qindex of the last TL0 frame.
The previous code used cm->base_qindex, which
corresponds to the qindex of the last encoded frame, and that
frame is not TL0 when temporal_layers > 1.

Change-Id: Iaf86f7156d2d48ae99a1b34ad576d453d490e746
2020-07-06 11:35:43 -07:00
Sreerenj Balachandran 1e9929390c vp9-svc: Fix the bitrate control for spatial svc
Make sure to initialize the layer context for spatial-svc
which has a single temporal layer.

Change-Id: I026ecec483555658e09d6d8893e56ab62ee6914b
2020-07-01 10:40:20 -07:00
James Zern a97e332df4 add CONTRIBUTING.md
serves as a brief introduction and adds a link to the gerrit
instructions on webmproject.org.

Bug: webm:1669
Change-Id: If1d483eb48e2edcda8c51e66bdd1a86b7c35b986
(cherry picked from commit 220b00dd0d)
2020-06-30 17:18:54 +00:00
James Zern 220b00dd0d add CONTRIBUTING.md
serves as a brief introduction and adds a link to the gerrit
instructions on webmproject.org.

Bug: webm:1669
Change-Id: If1d483eb48e2edcda8c51e66bdd1a86b7c35b986
2020-06-29 19:50:36 -07:00
jinbo c039b5442b vp8,vpx_dsp:[loongson] fix specification of instruction name
1.'xor,or,and' to 'pxor,por,pand'. In the case of operating FPR,
  gcc supports both of them, clang only supports the second type.
2.'dsrl,srl' to 'ssrld,ssrlw'. In the case of operating FPR, gcc
  supports both of them, clang only supports the second type.

Change-Id: I93b47348e7c6580d99f57dc11165b4645236533c
2020-06-29 18:57:06 +00:00
Marco Paniconi de4aedaec3 vp9-svc: Fix to setting frame size for dynamic resize
For svc with dynamic resize (only for single_layer_svc mode),
add flag to indicate resized width/height has already been set,
otherwise on the resized/trigger frame (resize_pending=1), the
wrong resolution may be set if oxcf->width/height is different
than layer width/height in single_layer_svc mode.

Change-Id: I24403ee93fc96b830a9bf7c66d763a48762cdcb4
2020-06-26 17:08:54 -07:00
Marco Paniconi 3f18b08397 vp9-svc: Allow scale_references for single layer svc
This is needed to allow for newmv search in nonrd_pickmode
for resize/scaled frame, and for int_pro_motion_estimation
on resized/scaled frame.

Change-Id: I5e2fdbc4706a10813c1b00f6194e2442f648905a
2020-06-25 13:53:04 -07:00
James Zern 7129ee8d8b update googletest to v1.10.0
this moves the framework to c++11 and changes *_TEST_CASE* to
_TEST_SUITE

BUG=webm:1695,webm:1686

Change-Id: I07f2c20850312a9c7e381b38353d2f9f45889cb1
(cherry picked from commit 83769e3d25)
2020-06-21 17:04:43 -07:00
James Zern ff078e58c7 vp9_skip_loopfilter_test: make Init() return a bool
ASSERT's in the function only force a return, not termination. this
fixes a static analyzer issue with using a null decoder object in
following calls.

BUG=webm:1695,webm:1686

Change-Id: I79762df8076d029c5c8fef4d5e06ed655719de62
(cherry picked from commit 0370a43816)
2020-06-21 17:04:21 -07:00
James Zern d9a69a1e29 Merge "tools/lint-hunks.py: skip third_party files" 2020-06-19 18:15:00 +00:00
James Zern 7ec916e818 Merge changes I07f2c208,I79762df8
* changes:
  update googletest to v1.10.0
  vp9_skip_loopfilter_test: make Init() return a bool
2020-06-19 02:52:58 +00:00
James Zern 1c9fd977aa tools/lint-hunks.py: skip third_party files
Change-Id: I2fda3119c08b5755f1a9b2fad1125090b0d86850
2020-06-18 18:34:55 -07:00
Marco Paniconi 769129fb29 vp9-rtc: Fix to resetting drop_spatial_layer
The reset happens on the base spatial layer, before
encoding. But it should be reset on the
first_spatial_layer_to_encode, which may not be 0.

Change-Id: I38ef686b4459ca7433062adbfe32ef2134e1ad60
2020-06-18 11:26:46 -07:00
James Zern 83769e3d25 update googletest to v1.10.0
this moves the framework to c++11 and changes *_TEST_CASE* to
_TEST_SUITE

BUG=webm:1695

Change-Id: I07f2c20850312a9c7e381b38353d2f9f45889cb1
2020-06-18 10:56:39 -07:00
James Zern 0370a43816 vp9_skip_loopfilter_test: make Init() return a bool
ASSERT's in the function only force a return, not termination. this
fixes a static analyzer issue with using a null decoder object in
following calls.

BUG=webm:1695

Change-Id: I79762df8076d029c5c8fef4d5e06ed655719de62
2020-06-18 10:54:23 -07:00
Marco Paniconi e9c6cb6474 vp9-rtc: Fixes to resizer for real-time
Reduce the time before sampling begins (after key)
and reduce averaging window, to make resize act
faster.

Reset RC parameters for temporal layers on resize.

Add per-frame-bandwidth thresholds to force
downsize for extreme case, for HD input.

Change-Id: I8e08580b2216a2e6981502552025370703cd206c
2020-06-18 09:36:09 -07:00
Marco Paniconi 3101666d2a vp9-svc: Add svc test for denoiser and dynamic resize
This catches the assert/crash fixed in 5174eb5.

Also fix to only check for dynamic resize in SVC mode
for base temporal layer.

Change-Id: Ie6eb7d233cc43eafb1b78cec4aeb94fb4d7fe11a
2020-06-16 12:31:04 -07:00
Marco Paniconi 5174eb5b92 vp9-svc: Fix to dynamic resize for svc denoising
Fix the logic to allow denoiser reset on resize for SVC mode,
as dynamic resize is allowed for SVC under single_layer mode.

Change-Id: I7776c68dadff2ccbce9b0b4a7f0d12624c2ccf90
2020-06-15 19:33:11 -07:00
angiebird e753d4930f Let SetExternalGroupOfPicturesMap use c-style arr
Change-Id: Ic92ce5a3cc5bb74120eb32fc6219e43b1b861f14
2020-06-11 15:10:38 -07:00
angiebird 812eb89b26 Fix assertion error in simple_encode.cc
Change-Id: I271d11cc35d34d5450a8b56fabcedaf2bb7c6565
2020-06-08 16:46:15 -07:00
Angie Chiang e53dc9f2ea Merge "Refactor simple_encode_test.cc" 2020-06-03 23:31:45 +00:00
Jerome Jiang c176557314 Merge "Add NV12 support" 2020-06-02 22:28:57 +00:00
angiebird d1ed2f0d7a Refactor simple_encode_test.cc
1) Avoid using global variables.

2) Add comments to EncodeConsistencyTest.

3) Check frame_type and show_idx in EncodeConsistencyTest.

Change-Id: I2261a0bd65189beb70432d62c077ef618a2712ab
2020-06-02 15:27:59 -07:00
Jerome Jiang 64485398d8 Add NV12 support
Change-Id: Ia2a8221a156e0882079c5a252f59bc84d8f516b1
2020-06-02 14:09:48 -07:00
angiebird 34034789d7 Add extra check / unit test to SetExternalGroupOfPicturesMap()
Let SetExternalGroupOfPicturesMap() modify the gop_map_ to satisfy
the following constraints.
1) Each key frame position should be at the start of a gop.
2) The last gop should not use an alt ref.

Add unit test for SetExternalGroupOfPicturesMap()

Change-Id: Iee9bd238ad0fc5c2ccbf2fbd065a280c854cd718
2020-05-28 18:07:12 -07:00
Angie Chiang 9c7e04a159 Merge "Refactor decode_api_test and realtime_test" 2020-05-28 05:40:26 +00:00
angiebird 23b070f46e Add functions to compute/observe key frame map
Change-Id: I2fc0efb2ac35e64af3350bddaa802a206d1aa13c
2020-05-26 23:33:03 -07:00
angiebird fe8cce2e36 Init static_scene_max_gf_interval in vp9_rc_init()
Change-Id: I2cad885fac2fd5f3e84d02b905a2ce59eb66760e
2020-05-26 23:31:14 -07:00
angiebird 5c8431b9e4 Make SetExternalGroupOfPicture support no arf mode
Rename external_arf_indexes by gop_map

Use kGopMapFlagStart to indicate the start of a gop in the gop_map.
Use kGopMapFlagUseAltRef to indicate whether to use altref in the
gop_map.

Change-Id: I743e3199a24b9ae1abd5acd290da1a1f8660e6ac
2020-05-26 23:30:57 -07:00
angiebird fdf04093ec Add GOP_COMMAND
Send GOP_COMMAND to vp9 for setting gop decisions on the fly.
GOP_COMMAND has three members.
use: use this command to set gop or use vp9's gop decision.
show_frame_count: number of show frames in this gop.
use_alt_ref: use alt ref frame or not.

Move the logic of processing external_arf_indexes_ from
get_gop_coding_frame_num() to GetGopCommand() and
GetCodingFrameNumFromGopMap().

Change-Id: Ic1942c7a4cf6eecdf3507864577688350c7ef0cf
2020-05-26 19:25:52 -07:00
angiebird a53da56629 Refactor decode_api_test and realtime_test
Replace NULL by nullptr.
Use override specifier over virtual specifier.

Change-Id: Iac2c97f997abd6ed9a5cd3991e052e79996f40f4
2020-05-20 14:49:00 -07:00
James Zern 3bc58f13cc vp9_decoder: free postproc_state.prev_mip
this fixes a leak when using MFQE

BUG=webm:1692

Change-Id: I19fb2f07155769f59924e0843989b3d3f8899bf6
2020-05-19 16:41:54 -07:00
Marco Paniconi f80e888723 vp9-svc: Fix key frame update refresh simulcast flexible svc
For flexible svc in simulcast mode: don't allow refresh
of all reference slots on key frame. Which slots to update
should be based on the user flags.

Change-Id: I3597c61ebcdfed2055bbdffec7ce701fad892744
2020-05-15 14:11:37 -07:00
Yunqing Wang 1c1d5c5baf Merge "vp9_firstpass.c: limit mv_limits with MV_MAX in motion_search" 2020-05-15 16:17:52 +00:00
Marco Paniconi 1243d2fc27 vp9-rtc: Increase thresh for scene detection
For CBR screen content mode. Makes it more
robust to false detections.

Change-Id: Icad89adb6f79b530b589bba2c71ba88ee5088d37
2020-05-13 15:16:49 -07:00
Jerome Jiang b9fa9ef693 Merge "Don't collect stats if they won't be used" 2020-05-12 16:54:29 +00:00
James Zern e334d9fd1b Merge "Temporarily convert to 64 bits to avoid overflows" 2020-05-12 00:32:03 +00:00
Jorge E. Moreira b6fa0f2056 Temporarily convert to 64 bits to avoid overflows
In the vp8_cost_branch function a couple of unsigned int are being
multiplied by integer coefficients and added to later be divided by
256. While the end result most likely fits an unsigned int, the
intermediary result of multiplying and adding sometimes doesn't (I was
able to reproduce it by leaving the encoder running at 60 fps for a
while). To avoid the multiplication overflow (which is undefined
behavior and causes a wrong result anyways) the calculation is
performed using unsigned long long instead and cast to unsigned int
for return.

Bug: b/154172422
Test: run cuttlefish with webrtc enabled for an hour
Change-Id: If7ebbda38b2450a59ed3c99ffbb59dc62431a324
2020-05-11 21:29:18 +00:00
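The widening described in b6fa0f2056 can be sketched roughly as follows. This is a minimal illustration, not the actual vp8_cost_branch() code: the function name, the separate count arguments, and the cost parameters are all assumptions made for the example.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the cost calculation: ct0/ct1 are branch
 * counts, cost0/cost1 are the per-branch coefficients. */
unsigned int cost_branch_sketch(unsigned int ct0, unsigned int ct1,
                                unsigned int cost0, unsigned int cost1) {
  /* Widen before multiplying so the intermediate sum cannot wrap at the
   * width of unsigned int. */
  const unsigned long long total =
      (unsigned long long)ct0 * cost0 + (unsigned long long)ct1 * cost1;
  const unsigned long long scaled = total / 256;
  /* The final value is expected to fit; only the intermediate was too big. */
  assert(scaled <= UINT_MAX);
  return (unsigned int)scaled;
}
```

With counts of 4000000 and coefficients of 2000, the intermediate sum is 16000000000, which does not fit in 32 bits, while the result after the division does.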
James Zern 5fd3c083a9 Merge changes Ib55d46e9,I4a4feeab
* changes:
  decode_api_test: add negative test for vpx_codec_error_detail
  examples: use die() on dec/enc_init() failure
2020-05-11 17:25:36 +00:00
Jerome Jiang 77960f37b3 Merge "Fix mac build with vp9 ratectrl interface" 2020-05-09 01:46:42 +00:00
Jorge E. Moreira e8b818b581 Don't collect stats if they won't be used
When the encoder is run continuously for a few minutes at 60 fps, the
total_target_vs_actual field overflows. Since this field is a signed
integer, that's considered undefined behavior in C++, which causes an
abort when used in an Android binary (those run with ubsan enabled).

Bug: b/154172422
Test: run cuttlefish with webrtc enabled for an hour
Change-Id: I8f7d9d0884311a6338bdcdec76348b8cc3ce8c69
2020-05-08 15:30:26 -07:00
James Zern 2d0f3b23a9 Merge "vpx_dec_fuzzer: add coverage for VP9D_SET_LOOP_FILTER_OPT" 2020-05-08 21:56:25 +00:00
Jerome Jiang d058d41e87 Fix mac build with vp9 ratectrl interface
Add -std=c++11 for darwin build.

Change-Id: I760d4f7096bc33520c02b2cd7000fed9ac6cdd90
2020-05-08 14:22:25 -07:00
James Zern da379e11af decode_api_test: add negative test for vpx_codec_error_detail
Change-Id: Ib55d46e9290d2bd36345ff4a9737e227664c2a5b
2020-05-07 17:18:20 -07:00
James Zern cbc276a3d4 examples: use die() on dec/enc_init() failure
rather than die_codec(). calling any api functions with an uninitialized
codec context is undefined. this avoids a crash in a call to
vpx_codec_error_detail().

BUG=webm:1688

Change-Id: I4a4feeabc1cafa44c8d2f24587fad79e313dba6d
2020-05-07 17:15:42 -07:00
James Zern c36a4d8d8e Merge "libs.mk,msvc: add missing vp9rc project" 2020-05-07 19:54:39 +00:00
James Zern 36187e607d libs.mk,msvc: add missing vp9rc project
+ fix some test_rc_interface issues:
add a space before $^ in the vcproj rule to add sources to the target,
one between the -I's, and make the guid unique; fixes build / link
errors.

Change-Id: Ia9c99f6a4482a001d993affbc3b3903c2a4e366a
2020-05-07 10:30:16 -07:00
James Zern 8c1c7471f8 vpx_dec_fuzzer: add coverage for VP9D_SET_LOOP_FILTER_OPT
BUG=chromium:1076203

Change-Id: Ib3339a9fd7d940b69a5ef89b3fbf7f4fdeaac006
2020-05-06 13:45:30 -07:00
James Zern b130f71bf0 Merge "vp8_dx_iface.c: make vp8_ctf_maps[] static" 2020-05-04 23:37:39 +00:00
James Zern 4478c121f5 vp8_dx_iface.c: make vp8_ctf_maps[] static
Change-Id: I6c19745a392681733c6deaaacc7e3540bc72fd4d
2020-05-04 15:27:53 -07:00
Wan-Teh Chang 1647d0c62c Remove unneeded null check for entry in for loop
In vpx_codec_control_(), before we enter the for loop, we have already
checked if ctx->iface->ctrl_maps is null and handle that as an error. So
the for loop can assume ctx->iface->ctrl_maps is not null, which implies
'entry' is not null (both initially and after entry++).

Change-Id: Ieafe464d4111fdb77f0586ecfa1835d1cfd44d94
2020-05-04 11:21:25 -07:00
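The reasoning in 1647d0c62c can be illustrated with a small sentinel-terminated table. The types, names, and table contents below are hypothetical and only mirror the shape of the argument; they are not the libvpx definitions.

```c
#include <assert.h> /* used by the checks in the usage note */
#include <stddef.h>

typedef int (*ctrl_fn_sketch)(int arg);

typedef struct {
  int ctrl_id;
  ctrl_fn_sketch fn; /* a NULL fn marks the end of the table */
} ctrl_map_sketch;

int ctrl_double_sketch(int arg) { return 2 * arg; }
int ctrl_negate_sketch(int arg) { return -arg; }

const ctrl_map_sketch kCtrlMapsSketch[] = {
  { 1, ctrl_double_sketch },
  { 2, ctrl_negate_sketch },
  { 0, NULL },
};

/* Returns the handler for ctrl_id, or NULL if there is none. */
ctrl_fn_sketch find_ctrl_sketch(const ctrl_map_sketch *maps, int ctrl_id) {
  const ctrl_map_sketch *entry;
  if (maps == NULL) return NULL; /* checked once, before the loop */
  /* maps is non-null here, so entry (maps, maps + 1, ...) is never null;
   * only the sentinel needs to be tested in the loop condition. */
  for (entry = maps; entry->fn != NULL; ++entry) {
    if (entry->ctrl_id == ctrl_id) return entry->fn;
  }
  return NULL;
}
```

For example, `assert(find_ctrl_sketch(kCtrlMapsSketch, 2) == ctrl_negate_sketch);` holds, and an unknown id or a null table yields NULL.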
James Zern b120ba5781 test/*.sh: add explicit error checks/returns
there was an assumption that function calls would terminate early with
an error given 'set -e' was being used. this is true, but only when the
function is part of a simple command; otherwise it won't inherit the
behavior. many of the call sites use 'func || return 1' syntax, meaning
the function would continue to completion and return with the status of the
last command executed. this hid errors with e.g., eval statements. inner
calls within the functions are now explicitly tested for failure.

BUG=aomedia:2669

Change-Id: Ie33a5ac4023dcc800bd302cb8cc54c6c3f2282f5
2020-04-28 18:51:01 -07:00
Wan-Teh Chang 3d28ff9803 Update a comment on nonexistent vpx_codec_init
Update a comment on the nonexistent vpx_codec_init() function. Replace it
with vpx_codec_dec_init() and vpx_codec_enc_init().

I missed this comment in the last commit.

Change-Id: I1d3614b3bb3aa4330ac6bd49e4d2e1f4e627b6b0
2020-04-27 11:53:33 -07:00
James Zern 94e4341bc0 Merge "Update comments on nonexistent vpx_codec_init" 2020-04-27 18:38:37 +00:00
Neil Birkbeck 7b25b1397c vp9_firstpass.c: limit mv_limits with MV_MAX in motion_search
Currently, in rare cases on big videos (> 5K), best_mv may differ from ref_mv by more than the allowable MV_MAX. Intersect mv_limits with those bound by MV_MAX before diamond search.

We could use vp9_set_mv_search_range, but that seems a bit more constrained than the bug I encountered (e.g., MAX_FULL_PEL_VAL < MV_MAX / 8).

Change-Id: I2c6563c05039d6ee05edf642665faaccf51787d4
2020-04-25 10:24:45 -07:00
Marco Paniconi f1283ca4c6 Merge "vp9-rtc: Some speedups to speed 5 real-time mode" 2020-04-25 00:20:42 +00:00
James Zern e404f867cc Update comments on nonexistent vpx_codec_init
Update comments on the nonexistent vpx_codec_init() function. Replace it
with vpx_codec_dec_init() and vpx_codec_enc_init().

based on the change in libaom:
b1b8c68e8 Update comments on nonexistent aom_codec_init

Change-Id: I63d3f6c87706a98f631457b5f6ce51e8b0c5cfb1
2020-04-24 16:25:10 -07:00
Jerome Jiang ae145ca3a4 Merge "Revert "Revert "Remove RD code for CONFIG_REALTIME_ONLY in vp9.""" 2020-04-24 22:09:21 +00:00
Marco Paniconi 4b7baf805f vp9-rtc: Some speedups to speed 5 real-time mode
Disable checking rectangular partitions in
nonrd_pick_partition, and enable use_source_sad.

~3-4% speedup for HD clip on x86.
bdrate loss of ~0.2% on rtc set.

Change-Id: Ibef8f100f1f623482d47510cb4ec9278ba777d7c
2020-04-24 14:51:51 -07:00
Jerome Jiang ce5f42b245 Revert "Revert "Remove RD code for CONFIG_REALTIME_ONLY in vp9.""
Under CONFIG_REALTIME_ONLY flag, map speed < 5 to speed 5.

Bug: webm:1684

This reverts commit 85cb983682.

Change-Id: I67b7ed37e8b74417db310ea0c817d3c5a5de9e44
2020-04-24 10:47:24 -07:00
Marco Paniconi ea0734cc90 vp9-rtc: Allow simulcast mode for flexible/bypass mode
Change-Id: I0252d06c4f21d7c700c81d387bef89646229a63c
2020-04-24 10:11:39 -07:00
Jerome Jiang 742dbe2a6a Merge "Move index check before array access" 2020-04-23 21:28:03 +00:00
Jerome Jiang e0069c8338 Merge "Revert "vp9-rtc: Some speedups to speed 5 real-time mode"" 2020-04-23 20:02:32 +00:00
Marco Paniconi 547b2bb701 Revert "vp9-rtc: Some speedups to speed 5 real-time mode"
This reverts commit 62af22b5e5.

Reason for revert: causes crash in chromium test

Change-Id: I27792e05ece84c79739638b8cce634ffeaef3ba1
2020-04-23 18:57:31 +00:00
James Zern ca2601262a Merge "realtime_test: add IntegerOverflow test" 2020-04-21 22:15:31 +00:00
James Zern afde303d00 realtime_test: add IntegerOverflow test
use an extreme bitrate to cover rate control calculations.
this is disabled by default as there are a mix of
-fsanitize=undefined/integer warnings for vp9 and -fsanitize=integer
warnings for vp8.

this is a follow-up to:
5e065cf9d vp8/{ratectrl,onyx_if}: fix some signed integer overflows
5eab093a7 vp9_ratectrl: fix some signed integer overflows

BUG=webm:1685

Change-Id: I24d223e33471217528a79b0088965ba51d0399ba
2020-04-21 15:14:34 -07:00
Jerome Jiang 85cb983682 Revert "Remove RD code for CONFIG_REALTIME_ONLY in vp9."
This reverts commit da24d35132.

BUG=webm:1684

Change-Id: I552c37c7bdc844610879a65cc02038d76a5d32b1
2020-04-20 16:45:22 -07:00
Marco Paniconi 62af22b5e5 vp9-rtc: Some speedups to speed 5 real-time mode
Enable use_source_sad at speed 5 and use it to
condition min_partition_size in nonrd_select_partition.
Also disable checking rectangular partitions in
nonrd_pick_partition for speed >= 5.

~5-8% speedup for HD clip on x86.
bdrate loss of ~1% on rtc set.

Change-Id: Ia643b34a51191e3929a443de77e271561e7c877d
2020-04-19 20:37:25 -07:00
Vitaly Buka 31fa91b4cd Move index check before array access
This lets us run code with -fsanitize=bounds.

Bug: b/15471229

Change-Id: I5961ef43d21f04a0dc9e8bf7280dc27eb0a62094
2020-04-16 23:13:22 -07:00
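The commit message for 31fa91b4cd does not show the affected code, so the following is only a generic sketch of the pattern it names: testing the index before reading the array element. The function name and the bounded-string setting are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length of the string in buf, looking at no more
 * than n bytes (similar in spirit to strnlen). */
size_t bounded_len_sketch(const char *buf, size_t n) {
  size_t i = 0;
  assert(buf != NULL || n == 0);
  /* The bound is tested first. && short-circuits, so buf[i] is only read
   * when i < n. Writing `buf[i] != '\0' && i < n` instead would read
   * buf[n] before noticing that the index is out of range. */
  while (i < n && buf[i] != '\0') ++i;
  return i;
}
```

Both orderings return the same value when the out-of-range read happens to succeed, which is why such bugs tend to surface only under a checker such as -fsanitize=bounds.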
Johann 686e12674f simplify x86_abi_support.asm symbol declaration
Define LIBVPX_{ELF,MACHO} to simplify blocks.

Create new globalsym macro and include logic for PRIVATE.

BUG=webm:1679

Change-Id: I303ba1492a2813f685de51155ccef7e4831e1881
2020-04-13 08:45:58 +09:00
James Zern 0ac5afb375 Merge "transpose_sse2.h,cosmetics: fix some comments" 2020-04-10 17:10:35 +00:00
Marco Paniconi fc5ab625b5 vp9-rtc: Fix to disable_16x16part speed feature
Condition on current_video_frame count, as the
avg_frame_qindex needs some time to settle.

Fixes psnr test failures.

Change-Id: I462c45250becb55b72b6ffe2b7087094d6d58a01
2020-04-09 13:54:08 -07:00
Marco Paniconi 264a7ecfd8 Merge "vp9-rtc: Disable nonrd_keyframe for SVC, speed >=8" 2020-04-09 03:30:39 +00:00
Marco Paniconi bf073d996b vp9-rtc: Disable nonrd_keyframe for SVC, speed >=8
For speed >= 8: disable nonrd_keyframe SVC with
spatial_layers > 1. In this case having base
spatial layer key frame with higher quality
(hybrid mode search) is beneficial, without too
much cpu cost (since it's on the lowest spatial layer).

Change-Id: Iff7c43aed4e808603d8abdedb6eb5d2c9c8ecb8d
2020-04-08 19:09:16 -07:00
Marco Paniconi 80de626f9f vp9-rtc: Set disable_16x16part for low-resoln high Q
Only affects variance partition at low-resoln,
speed 6,7 real-time mode. At very high Q better to
save bits from the split to 8x8.

bdrate gain ~3% on rtc_derf at very low bitrates

Change-Id: I94ee58e67d5ba6277cbab8f8dd9ea45b035c82b5
2020-04-08 18:34:59 -07:00
Jerome Jiang 8dc6f353c6 Merge "vp9: add rate control interface for RTC" 2020-04-07 00:27:08 +00:00
Jerome Jiang 745979bc29 vp9: add rate control interface for RTC
Add vp9 RTC rate control without creating encoder,
to allow external codecs to use vp9 rate control.

A new library (libvp9rc.a) will be built. Applications using this
interface must be linked with the library.

BUG=1060775

Change-Id: Ib3e597256725a37d2d104e1e1a1733c469991b03
2020-04-06 15:31:40 -07:00
Johann 1717ac939c x86_abi_support: do not decorate coff functions
:private_extern only applies to macho. Match x86inc.asm logic:
%if FORMAT_ELF
  global %2:function hidden
%elif FORMAT_MACHO
  global %2:private_extern
%else
  global %2
%endif

May fix a build issue on windows:
vp8/encoder/x86/block_error_sse2.asm:18: error:
  COFF format does not support any special symbol types

BUG=webm:1679

Change-Id: I7e1f4043b064a04752d1cedd030cbe7f5461fe40
2020-04-06 16:06:57 +09:00
Johann Koenig 104adb2aa3 Merge changes I24997420,Ie4ca7435,I36011727,Ibb01b09c,Ifb17acbe, ...
* changes:
  x86inc.asm: update to 3e5aed95c
  x86inc.asm: namespace ARCH_* defines
  x86inc.asm: only set visibility for chromium builds
  x86inc.asm: do not align .text for aout
  x86inc.asm: use .text on march32
  x86inc.asm: copy PIC macros from x86_abi_support.asm
  x86inc.asm: set PREFIX from libvpx defines
  x86inc.asm: pull settings from libvpx
  x86inc.asm: update to 3e5aed95
2020-04-06 00:09:56 +00:00
James Zern ce84336ec3 transpose_sse2.h,cosmetics: fix some comments
Change-Id: Idae90838012c78605f20f1d7a3125b71683f6f44
2020-04-03 13:50:40 -07:00
Wan-Teh Chang 676c936ed3 Return VPX_CODEC_INCAPABLE on capability failure
All decoder functions should return the VPX_CODEC_INCAPABLE error code
if the algorithm does not have the requested capability.

Move the definitions of VPX_CODEC_CAP_FRAME_THREADING and
VPX_CODEC_CAP_EXTERNAL_FRAME_BUFFER to the VPX_CODEC_CAP_* section.

Change "PUT_SLICE and PUT_FRAME events are posted" to "put_slice and
put_frame callbacks are invoked".

Also fix some other minor comment errors.

This carries back to libvpx the following libaom CL:
https://aomedia-review.googlesource.com/c/aom/+/108405

Change-Id: If67a271c9abbb3eebc2359719cc7d9f235b690d2
2020-04-02 20:56:52 -07:00
Johann a9c693aa8c x86inc.asm: update to 3e5aed95c
BUG=webm:1679

Change-Id: I24997420b7b43cac3c674300c667eb493794893e
2020-04-02 11:09:58 +09:00
Johann 27fa7914ea x86inc.asm: namespace ARCH_* defines
Reapply fad865c54 to prevent redefinition warnings.

BUG=webm:1679

Change-Id: Ie4ca7435b1f84711d0231e7957129580b05b3918
2020-04-02 11:08:57 +09:00
Johann db41156a28 x86inc.asm: only set visibility for chromium builds
Reapply and update a4b47b89f. This restores the previous version's
behavior avoiding issues with builds that may split sources on
directory boundaries; protected visibility may work in this case.

BUG=webm:1679

Change-Id: I36011727485847dd11f06782bc6beddedc39019c
2020-04-02 11:08:54 +09:00
Johann a22ba99d42 x86inc.asm: do not align .text for aout
Reapply a97c83f7a. Only use .text sections for aout and do not specify
an alignment.

BUG=webm:1679

Change-Id: Ibb01b09c205f9e0ecd4bfa0241e3d5e01ae5a55e
2020-04-02 11:08:32 +09:00
Johann a90c2e5425 x86inc.asm: use .text on march32
Reapply 9679be4bc. The read only sections are getting stripped on some
OS X builds. As a result, random data is used in place of the intended
tables.

BUG=webm:1679

Change-Id: Ifb17acbed73df4b9949a8badae2d9305a3073b83
2020-04-02 11:02:23 +09:00
Johann cbd820dceb x86inc.asm: copy PIC macros from x86_abi_support.asm
Reapply 7e065cd57. x86inc.asm always defines PIC for x86_64. We undefine
it for x32.

Incorporate e56f96394 as well to ensure GET_GOT_DEFINED is defined.

BUG=webm:1679

Change-Id: I1535d57bcb4223327ca63b4fd11bffcda1009332
2020-04-02 11:01:52 +09:00
Johann 66095ce595 x86inc.asm: set PREFIX from libvpx defines
Reapply 4de9641f1

BUG=webm:1679

Change-Id: I70b2224121f8f997fcd04c38a07a8126c2855ec6
2020-04-02 10:32:29 +09:00
Johann 9d0807cd5b x86inc.asm: pull settings from libvpx
Reapply 1be46ef6b. Include vpx_config.asm and prefix functions with vpx.

BUG=webm:1679

Change-Id: I5fba3154203822a829bc88ad0e302adf2ce3bbee
2020-04-02 10:32:29 +09:00
Johann a9ca3e871c x86inc.asm: update to 3e5aed95
Pull a clean copy in and name it _new. Will apply the libvpx
patches and then move it over.

BUG=webm:1679

Change-Id: I48d3d4ab7911340c0997dd79a0dbadccf5697682
2020-04-02 10:32:22 +09:00
Johann e8be64983a x86_abi_support: use correct hidden syntax
Chromium needs :function hidden and the space between the symbol and the
colon removed, at least for nasm. This matches x86inc.asm.

BUG=webm:1679

Change-Id: Ie47bb75d44d3130791639cbf4e2ebe019e2d686e
2020-04-01 08:46:30 +09:00
Johann Koenig 667138e1f0 Merge "nasm: require 2.14 with -DCHROMIUM" 2020-03-31 23:41:42 +00:00
Johann Koenig 39f6b4c960 Merge "auto-detect darwin19" 2020-03-31 23:41:26 +00:00
Marco Paniconi aae27c8f1e vp9-rtc: Refactor postencode for 1 pass
Move some code for 1 pass, that is not
directly related to rate control, out of
the postencode.

This avoids the need of extra flag for the
RC interface in:
https://chromium-review.googlesource.com/c/webm/libvpx/+/2118915

Change-Id: I3992ea8255196a762c1174c35dd7dcc9b01d317e
2020-03-31 11:37:04 -07:00
Johann 3ef630a357 auto-detect darwin19
Change-Id: I3912c79d0f0f7a65fc753ae29bb10cdcac76878a
2020-03-31 15:52:30 +09:00
Johann b3d12b3c7f nasm: require 2.14 with -DCHROMIUM
BUG=webm:1679

Change-Id: I75b1f860d111febf0aabe38b89d845ef296728a4
2020-03-31 15:36:39 +09:00
Marco Paniconi 4b0422ad09 rtc: Increase resize limit resoln for rtc
Increase resize limit to avoid resized frame
from going below 320x180.

Change-Id: If736ac3fac4731b47844e4d8c771ecf5c66550de
2020-03-30 11:55:37 -07:00
Marco Paniconi 31326b5bbd vp9-rtc: Increase resize down limit to 320x180
For RTC dynamic resize: don't allow resize for
resoln <= 320x180.

Change-Id: I9109e9e1338e5420e72436a57d266ae46e9f2d60
2020-03-27 14:26:49 -07:00
angiebird d28d82263f Init frames_to_key in vp9_rc_init()
Change-Id: Ic667c77ff58672212fc2e9dd5066c650b0152226
2020-03-25 18:19:10 -07:00
James Zern eda6b92549 Merge "Optimize vp9_get_sub_block_energy." 2020-03-25 00:56:41 +00:00
James Zern 5e065cf9d3 vp8/{ratectrl,onyx_if}: fix some signed integer overflows
in calculations involving bitrate in encode_frame_to_data_rate() and
vp8_compute_frame_size_bounds()

note this isn't exhaustive, it's just the result of a vpxenc run with:
-w 800 -h 480 --cpu-used=8 --rt --target-bitrate=1400000000

Bug: b/151945689
Change-Id: I3a4f878046fcf80e87482761588c977c283ae917
2020-03-21 16:19:04 -07:00
James Zern 5eab093a7b vp9_ratectrl: fix some signed integer overflows
in calculations involving bitrate in vp9_rc_postencode_update() and
calc_pframe_target_size_one_pass_vbr()

note this isn't exhaustive, it's just the result of a vpxenc run with:
-w 800 -h 480 --cpu-used=8 --rt --target-bitrate=1400000000

Bug: b/151945689
Change-Id: I941a77340fd44b09fc965dd182d7aeab9f1f3da0
2020-03-21 16:18:04 -07:00
Clement Courbet 8137453290 Optimize vp9_get_sub_block_energy.
Because energy scaling is non-decreasing, we can work on the variance
and scale after the loop. This avoids costly computations (in
particular, log()) within the loop.
We've measured that we spend 0.8% of our total time computing the log.

Change-Id: I302fc0ecd9fd8cf96ee9f31b8673e82de1b2b3e2
2020-03-20 13:33:37 +01:00
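The idea in 8137453290 rests on a simple property: for a non-decreasing scaling function, the maximum of the scaled values equals the scaled maximum. A minimal sketch, assuming an integer log2 as a stand-in for the real log()-based scaling (all names here are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical non-decreasing scaling: floor(log2(var)), with 0 for var <= 1. */
unsigned int scale_sketch(unsigned int var) {
  unsigned int bits = 0;
  while (var > 1) {
    var >>= 1;
    ++bits;
  }
  return bits;
}

/* Before: scale every sub-block variance inside the loop. */
unsigned int max_energy_naive_sketch(const unsigned int *var, size_t n) {
  unsigned int best;
  size_t i;
  assert(var != NULL && n > 0);
  best = scale_sketch(var[0]);
  for (i = 1; i < n; ++i) {
    const unsigned int e = scale_sketch(var[i]);
    if (e > best) best = e;
  }
  return best;
}

/* After: track the raw variance and scale once, after the loop. */
unsigned int max_energy_sketch(const unsigned int *var, size_t n) {
  unsigned int max_var;
  size_t i;
  assert(var != NULL && n > 0);
  max_var = var[0];
  for (i = 1; i < n; ++i) {
    if (var[i] > max_var) max_var = var[i];
  }
  return scale_sketch(max_var);
}
```

The same argument applies to a minimum. The second form calls the scaling function once per block instead of once per sub-block.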
Angie Chiang a0765aa9f3 Merge changes I8a14fcad,Iad7ca261,I2063c592,I9c5c74ab
* changes:
  Correct time_base of ivf header in SimpleEncode
  Add detail comments on valid_list in SimpleEncode
  Add missing Copyright to python files
  Move member functions up in simple_encode.h
2020-03-17 23:32:15 +00:00
angiebird 77c959654c Correct time_base of ivf header in SimpleEncode
Change-Id: I8a14fcad3e7b4c4689f4e7387414e59ba9c4c20a
2020-03-17 14:13:43 -07:00
angiebird 5010fa9dc3 Add detail comments on valid_list in SimpleEncode
Change-Id: Iad7ca261a99c7b5f082cf3cc6504f4af438bf409
2020-03-16 18:01:14 -07:00
angiebird f81bdda8dd Add missing Copyright to python files
BUG=webm:1655,webm:1654

Change-Id: I2063c59218e082f40958dddbdcb1c105d5440199
2020-03-16 11:59:49 -07:00
James Zern 223645aa83 vpx_codec_enc_config_default: rm unnecessary loop
quiets -Wunreachable-code-loop-increment, present since:
e57f388bc vpx_codec_enc_config_default: disable 'usage'

as g_usage was never supported for vp8/9 this was always a single
iteration. if additional usages are added in the future similar to av1
this can be restored.

Bug: b/150166387
Change-Id: Ic6f0985829e87694de8b5e0340cffa6c451ed1c2
2020-03-13 20:35:32 -07:00
angiebird 34f3fe9512 Move member functions up in simple_encode.h
Change-Id: I9c5c74ab52361bcd73aef110729c6e332066c2af
2020-03-13 16:24:26 -07:00
Johann e3979bd385 fix minor spelling errors
Change-Id: I929fec66d541705fe94365b56a5bdd8cf5ee7c37
2020-03-13 08:26:53 +09:00
Angie Chiang 323b11a99b Merge changes Ie7c70a1d,I2c5abbe2,If41a1ea6,Id6ba4664,I156308bc
* changes:
  Add unit test for ref_frame_info
  Add key frame group info to SimpleEncode
  Add ref_frame_info to encode_frame_result
  Add init/update_frame_indexes()
  Add GetVectorData()
2020-03-05 19:24:29 +00:00
Marco Paniconi 5532775efe rtc: Update svc test for resize
Add count on expected number of resizes,
and use the speed_setting_ for base layer.

Also allow AQ_MODE=3 for the tests with
dynamic layer disabling/enabling.

Change-Id: I03fb0789a2210ba00b8b153941bf79fb774d51bf
2020-03-04 16:20:49 -08:00
Marco Paniconi 1e892e63f9 vp9-svc: Allow for dynamic resize for single layer SVC
Make internal dynamic resize work for SVC mode
when single layer SVC is running (i.e, other layers
are dropped due to 0 bitrate).

Added unittest.

Change-Id: Icf03e1f276d9c4ba2734c87c927f7881c6b0a116
2020-03-03 21:42:37 -08:00
angiebird e7aa1e3630 Add unit test for ref_frame_info
Fix several bugs to make the test pass.
1) Move update_frame_indexes() out of show_frame check.
2) Init coding_indexes[i] to -1 when key frame appears
3) Fix a bug in PostUpdateRefFrameInfo()

Change-Id: Ie7c70a1d460e5b89475a1aef77416fc9a88387e1
2020-03-03 10:57:03 -08:00
angiebird 483bcb310d Add key frame group info to SimpleEncode
Change-Id: I2c5abbe23c84c6d794e06ed6429136b10fb18683
2020-03-03 10:57:03 -08:00
angiebird fc898231f1 Add ref_frame_info to encode_frame_result
Change-Id: If41a1ea6ce0a2b8db3811f2fa8efcf16f97fa0bd
2020-03-03 10:56:53 -08:00
angiebird 93834facfb Add init/update_frame_indexes()
We will init and update current_video_frame and
current_frame_coding_index in the functions.

So it's easier to keep track of when the frame indexes are updated.

Change-Id: Id6ba46643f8923348bb4f81c5dd9ace553244057
2020-03-02 20:17:09 -08:00
angiebird a1c0c95c8c Add GetVectorData()
It's necessary to get data pointer from a vector sometimes.
This function will guarantee that the data pointer is nullptr
if the vector is empty.

Change-Id: I156308bcb193fe404452d3cd3b24b3f80c3c3727
2020-03-02 20:16:56 -08:00
angiebird c2aa1520a4 Add RefFrameInfo
RefFrameInfo contains the coding_indexes and valid_list of
three reference frame types.

Note that I will add unit test in the follow-up CLs.

Change-Id: Ia055df1f8a5537b2bdd02c78991df9bbf48e951a
2020-02-26 14:25:00 -08:00
angiebird d6f7334abc Keep ref frame coding indexes in SimpleEncode
Change-Id: Id76aeb54ef93b11ca9a582f76289da0e60368e56
2020-02-25 15:59:49 -08:00
Cheng Chen 8a96ad8f47 Make external arf consistent with vp9
Add a test to ensure that encoding with the external arfs gets the
same result as long as the arfs are the same as the vp9 baseline.

Change-Id: I92c79001018f4df3bc16e9fc56c733509bebb9dc
2020-02-24 19:06:57 -08:00
Cheng Chen 08b01d5e05 Allow external arf to determine gop size
When "rate_ctrl" experiment is on, we allow the external arf
passed from outside to determine group of picture size
in define_gf_group().

Change-Id: I0b8c3e1bf3087f21a4e484354168df4967d35bba
2020-02-24 14:49:14 -08:00
Cheng Chen 7e665ec968 Add interface for external arf indexes.
Pass in external arf indexes to encode command.

Change-Id: Ifea5a7d835643760fc5effc594bb448848f6d639
2020-02-24 14:29:51 -08:00
angiebird 0484849e8e Rename values in RefFrameType and FrameType
Replace golden and altref by past and future in RefFrameType.
So that we don't get confused with FrameType and RefFrameType.

Change-Id: I1be45d49f76c68869fc4bf53ff946fee9ce7eb9d
2020-02-21 10:06:26 -08:00
angiebird bff7ecc517 Use ref_frame[0] to determine mv_count
The motion vector counts should be determined by whether this
block is using intra_mode or not.

Change-Id: If866c91fb8a3f2b3944e5b219a90154d2172690d
2020-02-20 17:14:31 -08:00
angiebird da3e3ecc7b Consistency test for GroupOfPicture
Make sure frame_type, show_idx and coding_index in GroupOfPicture
match the results in EncodeFrameInfo.

Change-Id: I3b477a03b5efd651c2d174e7146a4cd4f5551604
2020-02-20 16:35:30 -08:00
angiebird a6238a1085 Use ObserveGroupOfPicture() in EncodeFrame test
In the previous version, we assume the number of coding frames is
known.

Although the assumption is true for now with rate_ctrl flag on,
it's more proper to use ObserveGroupOfPicture() to get
the partial info about how many coding frames are in the group.

This is because we want to keep the flexibility of changing the size of
group of pictures on the fly in the future.

Change-Id: Ibbe6ab49268c468bf1cef8344efd3a3e1eab972a
2020-02-20 16:25:31 -08:00
angiebird 320fb4c34a Add kGoldenFrame and kOverlayFrame to FrameType
Add coding_index to EncodeFrameInfo
Add start_coding_index to GroupOfPicture
Add frame_coding_index_ to SimpleEncode

The definition of coding index is as follows.

Each show or no show frame is assigned with a coding index based
on its coding order (starting from zero) in the coding process of
the entire video. The coding index for each frame is unique.

Change-Id: I43e18434a0dff0d1cd6f927a693d6860e4038337
2020-02-20 14:24:22 -08:00
James Zern 5774259307 Merge "x86_simd_caps: make mask value unsigned" 2020-02-19 21:41:41 +00:00
Marco Paniconi 55f2e4a0a8 Merge "vp9-rtc: Increase partition threshold to 8x8 for high Q" 2020-02-19 02:24:39 +00:00
Jerome Jiang a338982357 Merge "Cap delta_q_uv to -15..15" 2020-02-18 23:40:47 +00:00
Marco Paniconi f9eee0cfbe vp9-rtc: Increase partition threshold to 8x8 for high Q
For low resolutions: increase the partition threshold
to split to 8x8 blocks for high Q.
Some improvement in quality for low bitrates at low resoln.

On rtc_derf speed 7: ~1.7 bdrate gain for low bitrates.

Change-Id: I1900c32497b75da4e8b882fedc8f4b440b017480
2020-02-18 15:36:30 -08:00
Jerome Jiang 2d1a7cd45a Cap delta_q_uv to -15..15
only 4 bits in bitstream

Change-Id: I338fe54475e094ee5e556467e0b66c982bb560fa
2020-02-18 10:48:55 -08:00
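The cap in 2d1a7cd45a amounts to a clamp. The sketch below assumes the 4 bits hold the magnitude and the sign is carried separately, which is what makes -15..15 the representable range; the function name is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: limits a chroma delta-q to the range that a 4-bit
 * magnitude can represent. */
int cap_delta_q_uv_sketch(int delta_q_uv) {
  int capped = delta_q_uv;
  if (capped < -15) capped = -15;
  if (capped > 15) capped = 15;
  assert(abs(capped) < (1 << 4)); /* the magnitude now fits in 4 bits */
  return capped;
}
```

Values already inside the range pass through unchanged, so the clamp only affects settings that could not have been written to the bitstream anyway.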
Marco Paniconi 939f84f865 vp9-rtc: Set enable_adaptive_subpel_force_stop to 0
Set enable_adaptive_subpel_force_stop to 0 as default
for all speeds. It's only enabled for speed >= 9.

Change-Id: I23a1c1cb9765994d2153ef401976c11a07f3fe7f
2020-02-18 10:15:09 -08:00
James Zern 0f3fe088fa vp8_decode: add missing vpx_clear_system_state
this avoids leaving the floating point unit in an inconsistent state on
error and breaking subsequent tests on x86
the test clip invalid-bug-148271109.ivf would also result in a sanitizer
error prior to:
vp8,GetSigned: silence unsigned int overflow warning

BUG=b/148271109

Change-Id: Ia254f3892ac1eeec51db5e9d42ea071545db0cd8
2020-02-14 21:03:56 -08:00
James Zern 9cfcac1cb3 vp8,GetSigned: silence unsigned int overflow warning
in non-conformant fuzzed bitstreams the calculation of br->value may
overflow. this is defined behavior and harmless in that the stream is
already corrupt.

BUG=b/148271109

Change-Id: I3668ada57e0bd68cea86b82917fb03c19ac1283d
2020-02-14 20:44:26 -08:00
James Zern c713f84616 move common attribute defs to compiler_attributes.h
BUG=b/148271109

Change-Id: I620e26ff1233fcd34ebe0723cb913e82eb58271c
2020-02-14 20:44:01 -08:00
James Zern 0d887ef514 x86_simd_caps: make mask value unsigned
fixes -fsanitize=integer warning:
runtime error: implicit conversion from type 'int' of value -1 (32-bit,
signed) to type 'unsigned int' changed the value to 4294967295 (32-bit,
unsigned)

Change-Id: I95d41aade78cea5e4f870a804d3f358c2cf618d7
2020-02-14 17:39:22 -08:00
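The sanitizer warning quoted in 0d887ef514 arises when a signed -1 is implicitly converted to unsigned int. A sketch of the pattern, assuming the mask defaults to "all bits set" and may be overridden by a numeric string; the function name caps_mask_from_string and its parameter are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: return a capability mask, defaulting to all bits set.
 * Writing the default as ~0u keeps the whole expression unsigned, so there is
 * no int -> unsigned int conversion for -fsanitize=integer to report. */
static unsigned int caps_mask_from_string(const char *s) {
  unsigned int mask = ~0u;
  if (s != NULL && *s != '\0') {
    /* Base 0 lets strtoul accept decimal, octal and 0x-prefixed input. */
    mask = (unsigned int)strtoul(s, NULL, 0);
  }
  return mask;
}
```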
Cheng Chen 6ca77eb7ca Merge "Set mv to zero if the second ref does not exist" 2020-02-11 20:30:50 +00:00
Cheng Chen b7075bc90b Merge "Add a unit test to check partition info" 2020-02-11 20:21:19 +00:00
Cheng Chen 69b30b37fd Add a unit test to check partition info
Change-Id: I397d7005961a037c9c9cb29e3ff0a3d39a501d15
2020-02-10 21:39:37 -08:00
Cheng Chen 9d5bc18b09 Set mv to zero if the second ref does not exist
Change-Id: I94b936c2642981eccdff073fc71c12e2dccb7909
2020-02-10 16:21:23 -08:00
angiebird a264f53693 Do save/restore_encode_params when rate_ctrl is on
Change-Id: I06492a4d1511869cb243477a47295d5f82608fca
2020-02-10 12:06:09 -08:00
angiebird 91f8be5045 Replace NULL by nullptr in simple_encode.c/h
Change-Id: Ib68740a02be852d03a3a2ad4d9d4a7d84d537590
2020-02-06 18:15:02 -08:00
angiebird a3171b107a Sync simple_encode.h
Change-Id: I046b8c65c96e1864813f9a82649dd6b41ba0aa1f
2020-02-06 18:10:25 -08:00
angiebird 67377d3833 Rename inverse_vpx_rational to invert_vpx_rational
Change-Id: I9139ebc22be74e9726eee157821faf22d44bd30f
2020-02-06 18:10:25 -08:00
Cheng Chen 3329c0e15f Consistency test for motion vector info
Change-Id: Ie1d77e231b973eb16f4e9c520721b47cdf86622c
2020-02-06 11:54:51 -08:00
Cheng Chen 32b6fb96b6 Pass motion vector info to encode frame result
Pass the motion vector info stored to the encode frame result
through the interface "update_encode_frame_result()".

Change-Id: I589affa0c4c4d0fd4d639edff9068e44a715beff
2020-02-06 11:53:55 -08:00
angiebird 2c465567e6 Let SimpleEncode be able to output bitstream
Add outfile_path to SimpleEncode() with default value NULL.
The encoder will only output bitstream when outfile_path is set.

Change-Id: Ic68e5358ea454358c510bb0ae214f4201cb3db39
2020-02-05 16:12:32 -08:00
angiebird 8a5cb084e6 Add coded_frame to EncodeFrameResults
This coded_frame represents the raw coded image.

Change-Id: Iea439da2f9e84c4507b082d77ebaac49bfd74fff
2020-02-05 13:43:33 -08:00
Cheng Chen 48b2d90262 Merge "Store frame motion vector info" 2020-02-05 21:21:47 +00:00
James Zern 36133b04c0 loopfilter_sse2: call unsuffixed lpf functions
this allows calls to use better versions (e.g., avx2) if available. in
most other cases the function pointer will be defined to the sse2
variant if another isn't available. this improves performance at 1080P
by ~2% on a Xeon E5-2690.

Change-Id: Ie9da3a567021f8416651a29b8c9ab9238dc4bdf1
2020-02-03 17:00:01 -08:00
Cheng Chen e582df2337 Store frame motion vector info
Allocate motion vector information for the frame, and store it
when a superblock (64x64) is encoded.

The unit size of the smallest block is 4x4.

A special requirement by the vp9 spec is that sub 8x8 blocks
of a 8x8 block must have the same reference frame.

There is no such requirement for blocks larger than or equal to 8x8.

Change-Id: Iba17c568c450361e5d059503c6fb7bc458184c31
2020-02-02 13:54:46 -08:00
Jerome Jiang 5be37810d2 Merge "Fix initialization of delta_q_uv" 2020-01-29 00:53:15 +00:00
Jerome Jiang 47321016da Fix initialization of delta_q_uv
Change-Id: If778c6534a5e68a9bcd5974f778e97e1c5cc89ee
2020-01-28 15:25:31 -08:00
Cheng Chen 22b23e4d98 Change partition_info to a vector
Change-Id: Ia59229da51671045448ea904ed65026155868993
2020-01-28 09:58:42 -08:00
Cheng Chen d53645f498 Merge "Add some description about partition info" 2020-01-28 04:41:17 +00:00
Cheng Chen a0fa2841b8 Correctly assign partition info
If partition type is horz or vert, the info of the second rectangle
block should be stored.

Change-Id: I8af5f37eb2c9140cf75d4b87a0fadcec5e4d7b28
2020-01-27 17:24:29 -08:00
Cheng Chen b32cb9876c Add some description about partition info
Change-Id: I62e45433aad7887f47e3c88fc40f046feef92ad9
2020-01-27 15:26:13 -08:00
Cheng Chen 4138443c33 Merge "l2e: cosmetic changes of multi-dimension arrays" 2020-01-27 22:50:08 +00:00
Cheng Chen 0ffa76d41d Merge "Consistency test for partition info" 2020-01-27 22:49:58 +00:00
Cheng Chen a6ff3abbe6 Merge "Pass partition info to encode frame result" 2020-01-27 22:14:57 +00:00
Cheng Chen db900e5268 Consistency test for partition info
Test that the information stored in the encoder is the same
between two encode runs.

Change-Id: I4f97fac4f212602f766aee0a6cbef566ca43b41e
2020-01-27 13:34:08 -08:00
Cheng Chen c2b2234062 Pass partition info to encode frame result
Init the memory for partition information in "EncodeFrameResult".
And pass the partition information of vp9 encoder to it through
the interface: "update_encode_frame_result()".

Change-Id: Iea049e661da79f54d41da7924b9ef28ff7cfbfa3
2020-01-27 13:05:48 -08:00
Marco Paniconi 4254ecaa07 Merge "vp9-rtc: Fix condition in regulate_q for cyclic_refresh" 2020-01-27 06:06:29 +00:00
Marco Paniconi cafbef75bf vp9-rtc: Fix condition in regulate_q for cyclic_refresh
The bits_per_mb factor from cyclic refresh does not
need to be conditioned on seg_enabled, cr->apply_cyclic_refresh
is sufficient. This is more correct for the case where
the refresh is turned off/on dynamically.

Small/neutral change in bdrate metrics.

Change-Id: Ifbeda9d3e022e6b61cdefa1482d3075f076d7253
2020-01-26 22:04:59 -08:00
Cheng Chen b4345aa689 l2e: cosmetic changes of multi-dimension arrays
Change-Id: I8c504b031cefeb8cfa4df8ca3a85c55fd1ae5a7f
2020-01-26 21:28:29 -08:00
Marco Paniconi 7885960830 vp9-svc: Fix to resetting rc flags on change_config
Condition should account for spatial layers.

Change-Id: I53ef27800d6cba1ae9d313d8f476e5137734d3d8
2020-01-26 18:13:43 -08:00
Cheng Chen eea06db178 Store frame partition info
Allocate partition information for the frame, and update it
when a superblock (64x64) is encoded.

The unit size of the smallest block is 4x4.

For each 4x4 block, store the current position (row, column),
the start position (row_start, column_start) of the partition,
and the block width and height of the partition.

Change-Id: I11c16bbca7e89a088715a1200abd23fe2f9ca1d6
2020-01-24 06:12:04 +00:00
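The per-4x4 record described in eea06db178 can be sketched as below. This is an illustration only: the struct name PartitionInfoSketch, the function store_partition, and the choice of pixel units for every field are assumptions, not the encoder's actual definitions.

```c
/* Hypothetical record for one 4x4 unit, holding the six values the commit
 * message lists. All fields are assumed to be in pixels. */
typedef struct {
  int row;          /* top edge of this 4x4 unit */
  int column;       /* left edge of this 4x4 unit */
  int row_start;    /* top edge of the partition containing it */
  int column_start; /* left edge of the partition containing it */
  int width;        /* partition width */
  int height;       /* partition height */
} PartitionInfoSketch;

/* Write the same partition description into every 4x4 unit it covers.
 * grid is a row-major array with grid_cols entries per row of 4x4 units. */
static void store_partition(PartitionInfoSketch *grid, int grid_cols,
                            int row_start, int column_start, int width,
                            int height) {
  int r, c;
  for (r = row_start; r < row_start + height; r += 4) {
    for (c = column_start; c < column_start + width; c += 4) {
      PartitionInfoSketch *p = &grid[(r / 4) * grid_cols + (c / 4)];
      p->row = r;
      p->column = c;
      p->row_start = row_start;
      p->column_start = column_start;
      p->width = width;
      p->height = height;
    }
  }
}
```

For a 64x64 superblock the grid has 16x16 entries, so any 4x4 unit can look up the partition that contains it without walking the partition tree.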
James Zern 7a63240c48 Merge "vpx_timestamp,gcd: assert params are positive" 2020-01-23 22:53:25 +00:00
Marco Paniconi eeaf70361d vp9-rtc: Lower threshold for color sensitivity for screen
For screen content: lower the threshold for setting
color sensitivity on scene change.

Reduces artifacts in color slide change content.

Change-Id: Ie9a375dee9b8a546dede8afbd241e0e46f79a7f4
2020-01-22 17:20:23 -08:00
Jerome Jiang 7763c888e0 Merge "vp9: fix control for delta qp for uv" 2020-01-22 18:03:38 +00:00
James Zern f33f2fc686 vpx_timestamp,gcd: assert params are positive
this function is currently only used with range checked timestamp
values, but this documents the function's expectations in case it's used
elsewhere

Change-Id: I9de314fc500a49f34f8a1df3598d64bc5070248e
2020-01-21 16:12:25 -08:00
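A sketch of what "assert params are positive" in f33f2fc686 can look like for a Euclidean gcd. The name gcd_positive and the int64_t signature are assumptions; the function in the tree may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical Euclidean gcd that documents its expectations with asserts.
 * Both inputs must be strictly positive in this sketch. */
static int64_t gcd_positive(int64_t a, int64_t b) {
  assert(a > 0);
  assert(b > 0);
  while (b != 0) {
    const int64_t r = a % b; /* remainder of the current step */
    a = b;
    b = r;
  }
  return a;
}
```

The asserts compile away under NDEBUG, so they serve as documentation and a debug-build check rather than as runtime validation.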
James Zern 7adcbf2f1f Merge "add static_assert.h" 2020-01-22 00:10:34 +00:00
Jerome Jiang 76ed2d5d69 vp9: fix control for delta qp for uv
It could be overwritten by other controls.

Change-Id: I86b430842d6819d3858bc65e728f7cb2bd471284
2020-01-21 14:46:47 -08:00
James Zern bc9b7f50bb add static_assert.h
unify COMPILE_TIME_ASSERT definitions and rename to VPX_STATIC_ASSERT

Change-Id: Id51150c204e0c4eaf355ee45b20915113209d524
2020-01-17 23:17:46 -08:00
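One common way to build a compile-time assert macro of the kind bc9b7f50bb unifies is sketched below. The macro name SKETCH_STATIC_ASSERT is hypothetical; the definition of VPX_STATIC_ASSERT is not shown in this log and may be different.

```c
#include <stdint.h>

/* Hypothetical macro, not the libvpx definition. With C11 it maps to
 * _Static_assert; otherwise it declares an array type whose size is negative
 * when the expression is false, which fails to compile. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#define SKETCH_STATIC_ASSERT(expr) _Static_assert(expr, #expr)
#else
#define SKETCH_CAT_(a, b) a##b
#define SKETCH_CAT(a, b) SKETCH_CAT_(a, b)
#define SKETCH_STATIC_ASSERT(expr) \
  typedef char SKETCH_CAT(sketch_static_assert_, __LINE__)[(expr) ? 1 : -1]
#endif

/* Checked at compile time; costs nothing at run time. */
SKETCH_STATIC_ASSERT(sizeof(int32_t) == 4);

/* Code below may rely on the width verified above. */
static int sketch_int32_bytes(void) { return (int)sizeof(int32_t); }
```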
James Zern b78d3b21e3 Merge "Validate data used by vpx_codec_control..." 2020-01-18 07:14:34 +00:00
Brian Foley 6efe45375f Validate data used by vpx_codec_control...
...instead of blindly derefing NULL.

Found by some additional fuzzing of the vp8/vp9 decoders to be
upstreamed soon.

Change-Id: I2ea08c2d15f689f3fac8cc73622056a82d94ec00
2020-01-17 11:41:44 -08:00
Jerome Jiang 49aa77c61f vp9: add delta q for uv channel. add tests.
Add control for delta q for uv. 0 by default.

Change-Id: Ib8ed160b1a5c9a61ba15985076f6c6f121477103
2020-01-17 09:19:23 -08:00
Cheng Chen 18e93be9f2 Add comments to frame counts.
Change-Id: I74a1ccb55af78af1153af75734ca43fa140910a7
2020-01-13 14:29:45 -08:00
Cheng Chen 087e4e0b07 Merge "Copy frame counts to the encode result." 2020-01-13 19:18:21 +00:00
James Zern e2ade251b1 Merge "vp9_encoder.c,cosmetics: fix some typos" 2020-01-11 04:10:07 +00:00
James Zern bb2d935eb8 Merge "simple_encode*.cc: add missing copyright" 2020-01-11 04:08:21 +00:00
James Zern 468157dfba vp9_encoder.c,cosmetics: fix some typos
Change-Id: Iac474fcd1937371a9ef2620110740f60fed6b083
2020-01-10 17:21:16 -08:00
James Zern 64b469f4f6 Merge "trivial: fix spelling errors" 2020-01-11 01:20:39 +00:00
James Zern 9cd933ff51 Merge "Add text to clarify the unit of variables for target bitrate" 2020-01-11 01:09:19 +00:00
James Zern b809ac3c9e simple_encode*.cc: add missing copyright
Change-Id: I58ddf13698e3892aa591af4196ca03d7c09426c6
2020-01-10 16:47:48 -08:00
Johann 652beb6ac1 trivial: fix spelling errors
Found when updating a downstream client.

Change-Id: Ibaa20d883ebfea9410d0252e7a19c7acdb78c907
2020-01-10 15:59:30 -08:00
Jerome Jiang ba7f9f38c9 Merge "Fix test failure with --size-limit" 2020-01-10 06:00:31 +00:00
Jerome Jiang 99284cb118 Fix test failure with --size-limit
The test didn't verify expected error code with invalid sizes. It
assumed VPX_CODEC_OK.

Added new Encoder class which doesn't run decoding at all. It accepts
expected error code to verify with encoder output.

The encoder behavior was changed in 94a65e8.

BUG=webm:1670

Change-Id: I6324d8f744e6c4aa82aa66913923dc140b07bfc9
2020-01-09 19:59:14 -08:00
Cheng Chen 86675d87a3 Copy frame counts to the encode result.
Explicitly copy frame counts of each frame to the encode result
struct.

Change-Id: Icc18ac83a9e2be8d7a4819f2fffcfda6568b275c
2020-01-09 14:38:31 -08:00
Clement Courbet ccb06a9fb1 Avoid reloads in vp9_read_mode_info.
The compiler cannot prove that the buffers do not alias, so it has to emit a
reload. On our internal workloads, the reloads are about 1% of the total time
spent decoding frames.

The loop before the change:

movzwl 0x8(%r15), %edx       # load ref_frame
addq $0xc, %rax
movw %dx, -0x4(%rax)         # store ref_frame
movq 0xc(%r15), %rdx         # load mv
movq %rdx, -0xc(%rax)        # store mv
cmpq %rax, %rcx
jne -0x1a

The loop after the change:

movw %r9w, 0x8(%rax)         # store cached ref_frame
addq $0xc, %rax
movq %r8, -0xc(%rax)         # store cached mv
cmpq %rax, %rdx
jne -0x12

Change-Id: Ia1e9634bcabb4d7e06ed60f470bc4cd67f5ab27e
2020-01-07 17:04:36 +01:00
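The reload issue in ccb06a9fb1 can be illustrated in C as follows. This is a sketch with hypothetical names (MvRefSketch, fill_with_reloads, fill_cached), and the struct layout is an assumption; it is not the code from vp9_read_mode_info.

```c
#include <stdint.h>

/* Hypothetical element type: two motion vectors and two reference frames. */
typedef struct {
  int32_t mv[2];
  int8_t ref_frame[2];
} MvRefSketch;

/* Reads *src on every iteration. Since dst and src may alias, the compiler
 * generally has to reload the source fields after each store. */
static void fill_with_reloads(MvRefSketch *dst, int n, const MvRefSketch *src) {
  int i;
  for (i = 0; i < n; ++i) {
    dst[i].ref_frame[0] = src->ref_frame[0];
    dst[i].ref_frame[1] = src->ref_frame[1];
    dst[i].mv[0] = src->mv[0];
    dst[i].mv[1] = src->mv[1];
  }
}

/* Copies *src into a local once. Stores to dst cannot change the local, so
 * the loop body only needs stores. */
static void fill_cached(MvRefSketch *dst, int n, const MvRefSketch *src) {
  const MvRefSketch cached = *src;
  int i;
  for (i = 0; i < n; ++i) dst[i] = cached;
}
```

Both functions write the same values when dst and src do not overlap; the second simply makes the absence of aliasing explicit to the compiler.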
Matt Oliver 5a058e395c project: Update for 1.8.2 merge. 2019-12-31 03:09:20 +11:00
Matt Oliver e96c8c1d72 Merge commit '7ec7a33a081aeeb53fed1a8d87e4cbd189152527' 2019-12-31 02:27:06 +11:00
Johann 50d1a4aa72 Merge remote-tracking branch 'origin/pekin'
Change-Id: I6f8e21696023fa4067960a7dedb6e7bbdb531ff9
2019-12-19 14:24:23 -08:00
Wonkap Jang 07177569ec Add text to clarify the unit of variables for target bitrate
ts_target_bitrate, layer_target_bitrate, and ss_target_bitrate

Change-Id: I845c4b67b5b8b546f7a185e97ad9e510bc246ce0
2019-12-18 14:31:52 -08:00
Johann Koenig d99a0bc769 Merge "vp8: move error check earlier" 2019-12-18 06:01:54 +00:00
Johann Koenig 67aa6c8c23 Merge "trivial: remove reference to error correction" 2019-12-17 20:44:58 +00:00
Johann Koenig f5d2cc293c Merge "trivial: fix 'fragment' spelling" 2019-12-17 20:22:07 +00:00
Johann d1e872e1b9 vp8: move error check earlier
This avoids assigning variables which will not be used. A
similar change was made to vpx_dsp/bitreader.c a long time
ago.

Change-Id: Ia5012091b8d85ca9bfefc7735a2aa69c5c2bf516
2019-12-17 11:44:29 -08:00
Johann Koenig 8a30a2a450 Merge "vp8 boolreader: ignore invalid input" 2019-12-17 19:41:23 +00:00
angiebird ace8ab89b7 Rename encode_frame_index
to next_encode_frame_index

Change-Id: Id9bd2a0f6c4278bf0f0c270eb937a317232dead6
2019-12-16 15:19:35 -08:00
angiebird 56f51daecb Add start_show_index/show_frame_count
to GroupOfPicture

Change-Id: I905be72686b6c0e27ea782a12f1e8a8176c8b0f5
2019-12-16 15:07:51 -08:00
angiebird 01347cd62c Cosmetic change of update_encode_frame_result()
Move output parameter to the end.

Change-Id: I579a118768d29cb1ae2e3c8995a952ef11cfeb8d
2019-12-13 11:15:12 -08:00
angiebird bfa9d015b0 Move psnr/sse computation under RATE_CTRL flag
in update_encode_frame_result()

Change-Id: Ie86d11f66744ef95dd224c7daf325750a5e5458b
2019-12-13 11:15:12 -08:00
angiebird b4a8ac3c46 Add detailed description about GroupOfPicture
Change-Id: I96a447e59bdcf156ab6fbf9e766d867633ca47f3
2019-12-13 11:15:05 -08:00
angiebird fadfea8a6a Cosmetic change of vp9_get_gop_coding_frame_count
Move the output parameter to the end.

Change-Id: I39c718b683a76cd7c5998724c3a07e88275198bf
2019-12-12 12:15:33 -08:00
angiebird a53b7e53e8 Add GetFramePixelCount to SimpleEncode
Gets the total number of pixels of YUV planes per frame.

Change-Id: Ifdf35190cdde1378de6d7e93ab4428868a5795fa
2019-12-12 12:03:28 -08:00
angiebird 204ba94f4b Cosmetic changes for RATE_CTRL related functions
Move input parameters ahead of output parameters.

Change-Id: I384f69523b6be92224535d05373ebb33467a040e
2019-12-11 15:43:39 -08:00
Angie Chiang fa370c32e7 Merge changes I54f60f62,Idbc437d3
* changes:
  Rename parameter two_pass to twopass.
  Add GetNextEncodeFrameInfo ObserveGroupOfPicture
2019-12-11 00:42:01 +00:00
angiebird 2f65eb2a00 Rename parameter two_pass to twopass.
Change-Id: I54f60f62f27f9ef96db892d5b6219c9591ce2dc9
2019-12-10 11:43:40 -08:00
angiebird 757a5e6aa9 Add GetNextEncodeFrameInfo ObserveGroupOfPicture
GetNextEncodeFrameInfo()
Gets encode_frame_info for the next coding frame.

ObserveGroupOfPicture()
Provides the group of pictures that the next coding frame is in.

Change-Id: Idbc437d32c392f25b06efb2d4e1ec01347d678f2
2019-12-10 11:41:01 -08:00
Johann 7ec7a33a08 Release v1.8.2 Pekin Duck
Fixed: webm:1661

Change-Id: Icc17635d63fbd533a084e17cc291693b9a453887
2019-12-09 15:09:20 -08:00
Angie Chiang 9c46931645 Merge changes I41ff04bb,I3d88d719
* changes:
  Set frames_since_key in vp9_get_coding_frame_num
  Add vp9_get_gop_coding_frame_count()
2019-12-09 22:15:50 +00:00
angiebird 38a4e46fd3 Set frames_since_key in vp9_get_coding_frame_num
Set frames_since_key to 0 whenever a key frame appears.

Add dependency notes to get_gop_coding_frame_num()

Change-Id: I41ff04bb1c6176e60946b05fe21c72fbb82be62a
2019-12-09 14:14:11 -08:00
angiebird c33f859615 Add vp9_get_gop_coding_frame_count()
Call this function before coding a new group of picture to get
information about it.

Change-Id: I3d88d719dd27c6d7383eb8f92307a93096b30706
2019-12-09 14:13:53 -08:00
Debargha Mukherjee b7e03724b3 Merge "Merge Timestamp TestVpxRollover tests for Vp8/Vp9" 2019-12-07 02:10:33 +00:00
Debargha Mukherjee 65e0663f06 Merge Timestamp TestVpxRollover tests for Vp8/Vp9
BUG=webm:701

Change-Id: Id0b928db3cbb6263d136d7b9eb8d9453b3c63824
2019-12-06 15:52:25 -08:00
angiebird 94fb57d3a5 Add GetKeyFrameGroupSize()
Makes vp9_get_frames_to_next_key() public.

Change-Id: I903cefbb3925d6ffc641412c6d60d95a2ff256a4
2019-12-06 15:03:23 -08:00
James Zern efa05b7cc9 configure.sh,darwin: fix asm conv w/external build
always set asm_conversion_cmd as e.g., vpx_config.asm may still be
generated with make when using --enable-external-build

BUG=webm:1535

Change-Id: I120452d4e06580b67119aee8d0a710998ac87a7a
2019-12-06 13:18:37 -08:00
Wan-Teh Chang f835ab7608 Fix argv leak on Unrecognized input file type err
Free argv (allocated by argv_dup) after the
"Unrecognized input file type" error.

Change-Id: I2b6273a1abca2ff8e51445fb15839bd993c41741
2019-12-06 10:42:49 -08:00
Debargha Mukherjee 04383393e4 Add missing typecast and re-enable timestamp test
BUG=webm:701

Change-Id: I1d8a6e263fddb9e4cc6265a313011a18d18bbf9e
2019-12-05 19:15:53 -08:00
Johann 92ce61dded trivial: fix 'fragment' spelling
Change-Id: I71b17f3dcb72d5cb2c1d7fe94dd5228433c6eef5
2019-12-04 15:10:19 -08:00
Johann 69022e9f41 trivial: remove reference to error correction
vp9 does not support error correction

Change-Id: I89517ae97abfa60833c9150495556d49c9656778
2019-12-04 15:07:12 -08:00
Johann 80e5666cdc vp8 boolreader: ignore invalid input
Do basic initialization even when the result will not be used.

BUG=chromium:1026961

Change-Id: Iaa480534b49efe1ecc66484b316f8d654e8a1245
2019-12-04 14:01:28 -08:00
Jerome Jiang 1d49040369 remove init_motion_estimation from update_initial_width
Change-Id: I04da24eb6a87425490b25e50ead7a8fd8117e7cb
2019-12-04 12:46:56 -08:00
Angie Chiang 3609eca0c4 Merge "Fix the encode inconsistency of SimpleEncode" 2019-12-04 20:26:28 +00:00
Angie Chiang 8cb34dd640 Merge "Describe ObserveFirstPassStats with more details" 2019-12-04 19:31:02 +00:00
angiebird 42916ce604 Fix the encode inconsistency of SimpleEncode
Make sure restore_coding_context() is always called at the end
of encode_with_recode_loop().

Add EncodeConsistencyTest.

Change-Id: I3c8e4c8fcff4e3f7afef9bec469beef2a5fb6eeb
2019-12-03 18:33:10 -08:00
angiebird f0878a719a Describe ObserveFirstPassStats with more details
Change-Id: I7c15aeaf0c0884b7c7b265fb03fbbb9ccc6b73be
2019-12-03 18:32:48 -08:00
Debargha Mukherjee 89375f0315 Merge "Avoid dividing by 0 in vp8 gf_group bits compute" 2019-12-03 02:29:30 +00:00
Debargha Mukherjee 5eeabfffcb Avoid dividing by 0 in vp8 gf_group bits compute
BUG=webm:1653

Change-Id: Ic59fe5e573f08dbca678d3927d4a750ae75f903c
2019-12-02 14:49:50 -08:00
Jerome Jiang d2a5e26359 Fix SVC regression in webrtc tests.
BUG=1029438

Change-Id: I4495fc7bb45e77e9d91059a5c6c4695d8da1bf34
2019-12-02 12:20:17 -08:00
James Zern 2ba45e8291 Merge "Fix mutex free in multi-thread lpf" 2019-12-02 20:16:33 +00:00
Venkatarama NG. Avadhani b4e0986391 Fix mutex free in multi-thread lpf
The mutex lf_mutex will now be allocated and destroyed, making it easier
to verify if it has been inited before destruction.

BUG=webm:1662

Change-Id: I8169bea9e117bd615d68b8d02da98aeab570b53f
2019-11-27 08:56:32 +05:30
Johann Koenig b8549ed889 Merge "fix __has_attribute in visual studio" 2019-11-26 14:44:15 +00:00
angiebird 8d211a3969 Make GetCodingFrameNum const function
Change-Id: I6a5a2400cfb6e122c77667e0950c80026c48a1f6
2019-11-25 10:38:16 -08:00
angiebird ef263a11fe Add missing includes to simple_encode.h
Change-Id: Ic3bb2450443c52ba3df1ed6729cecdab51245e76
2019-11-25 10:37:53 -08:00
angiebird 85108a55b2 Correct typo in simple_encode.h
Change-Id: Ifa858acad8b943d1579283fd1c72ff41434c0710
2019-11-25 10:37:25 -08:00
angiebird 2157d613c5 Cosmetic change of GetBitrateInKbps
Change-Id: Id4b852cdfba0f6fa1e12a05e2617df0de395be9d
2019-11-25 10:37:07 -08:00
Angie Chiang 30c669ca87 Merge "Change vp9_get_encoder_config." 2019-11-25 18:34:26 +00:00
Johann 2ed1830ca5 fix __has_attribute in visual studio
Similar to __has_feature, __has_attribute needs to be defined
away on unsupported platforms.

BUG=chromium:1020220,chromium:977230

Change-Id: I803fff0fef2b18b535604f3b7f9f8300e45f7ef8
2019-11-25 08:01:23 -08:00
Vitaly Buka 6a3ba2118c Disable -ftrivial-auto-var-init= for hot code
Improves encode_time by 10% on FullStackTest.VP9KSVC_3SL_High and other
tests when -ftrivial-auto-var-init= is used.

vp9_pick_inter_mode can be called recursively, so multiple pred_buf
instances are needed. The alternative to the attribute would be a list of
buffers in ThreadData or TileData.

Bug: 1020220, 977230
Change-Id: I939a468f88c2b5dd2ec235de7564b92bfaa356f5
2019-11-23 02:12:18 +00:00
Vitaly Buka 4529dc8483 Disable -ftrivial-auto-var-init= for hot code
This helps to improve some benchmarks by 10%, e.g. decode_time
PCFullStackTest.VP9SVC_3SL_Low

Bug: 1020220, 977230
Change-Id: Ic992f1eec369f46a08e19eb33bc3a7c15c1e7c87
2019-11-23 02:12:01 +00:00
Jerome Jiang cbec0795c6 Merge "Move buffer from extend_and_predict into TileWorkerData" 2019-11-22 22:50:30 +00:00
angiebird 16e5e849b3 Change vp9_get_encoder_config.
Add vp9_dump_encoder_config for config comparison.

This function will generate the same VP9EncoderConfig used by the
vpxenc command given below.
The configs in the vpxenc command correspond to parameters of
vp9_get_encoder_config() as follows.

WIDTH:   frame_width
HEIGHT:  frame_height
FPS:     frame_rate
BITRATE: target_bitrate

INPUT, OUTPUT, LIMIT will not affect VP9EncoderConfig

vpxenc command:
INPUT=bus_cif.y4m
OUTPUT=output.webm
WIDTH=352
HEIGHT=288
BITRATE=600
FPS=30/1
LIMIT=150
./vpxenc --limit=$LIMIT --width=$WIDTH --height=$HEIGHT --fps=$FPS \
 --lag-in-frames=25 \
 --codec=vp9 --good --cpu-used=0 --threads=0 --profile=0 \
 --min-q=0 --max-q=63 --auto-alt-ref=1 --passes=2 --kf-max-dist=150 \
 --kf-min-dist=0 --drop-frame=0 --static-thresh=0 --bias-pct=50 \
 --minsection-pct=0 --maxsection-pct=150 --arnr-maxframes=7 --psnr \
 --arnr-strength=5 --sharpness=0 --undershoot-pct=100 --overshoot-pct=100 \
 --frame-parallel=0 --tile-columns=0 --cpu-used=0 --end-usage=vbr \
 --target-bitrate=$BITRATE -o $OUTPUT $INPUT

Change-Id: If7fd635d6f3fad4e6199a4fbcd556323efc1c250
2019-11-22 10:47:28 -08:00
angiebird 67fe324ab6 Add trailing underscore to members of SimpleEncode
Change-Id: I7a1d19ed4fd60fef374392c86df69d2122c335f0
2019-11-21 10:54:23 -08:00
angiebird 22ef949667 Rename impl by EncodeImpl
Change-Id: Id182cd234c9f4f37c2854ea5ca761d8cfa113791
2019-11-21 10:54:23 -08:00
angiebird 585eb34f86 Cosmetic changes of SimpleEncode code
Change-Id: Ied06630d605a4978711070778b92bfb731c32161
2019-11-21 10:54:23 -08:00
angiebird 93e2e701a9 Fix a bug related to use_external_quantize_index
Move the break point in encode_with_recode_loop after
save_coding_context() so that restore_coding_context
can work properly.

Change-Id: I58f46928c8cae0ae542fd8343076670fb35681bf
2019-11-20 10:41:16 -08:00
angiebird fb4f013f27 Fix a bug in free_encoder()
Move vpx_free(buffer_pool) after vp9_remove_compressor()

buffer_pool needs to be freed after cpi because buffer_pool
contains allocated buffers that will be freed in
vp9_remove_compressor()

Change-Id: I8bcedae2858cfe132bde110c8f3f6b55dcbe3f36
2019-11-20 10:41:16 -08:00
angiebird f563d975c6 Use indicative mood in comments of SimpleEncode
Change-Id: I913e14994646945a7237c9ab65097647fb3a5b5c
2019-11-20 10:41:16 -08:00
angiebird 3afd6e8270 Rename pimpl by impl_ptr in SimpleEncode
Change-Id: I0071216b710544731a6f8e8c7a63c7a28f25bbac
2019-11-20 10:41:16 -08:00
angiebird cd686f727a Move pimpl to the function body of SimpleEncode
Change-Id: Id4757d61916b8348d76c99dddbe48e68f2b3ef1a
2019-11-20 10:41:16 -08:00
angiebird 89077d4505 Fix a bug in EncodeFrame test
Move key frame checks after EncodeFrame()

Change-Id: I4e3eded5dc54e757f85e846c4920cddc1ea7444b
2019-11-20 10:41:16 -08:00
angiebird 38b636094b Add namespace vp9
Change-Id: I29d05557becbfc5d55d1cd1bb709e519d27c928b
2019-11-20 10:41:16 -08:00
angiebird 9b79c51b02 Add copyright and header guard for simple_encode.h
Change-Id: Ib4502fc35202b36aa25f06c7c2bb5203673faa06
2019-11-20 10:36:00 -08:00
Angie Chen 65c7b631a2 Close the file that SimpleEncode opens in its ctor in its dtor.
Change-Id: I1e5d1be9f076c70ec1d7764d5703aeba8afd4436
2019-11-20 01:54:41 +00:00
Angie Chiang 9fbcfd159b Merge changes I32ab6829,If47867d4,I4442de01
* changes:
  Add coding_data_bit_size to EncodeFrameResult
  Pass in infile_path to SimpleEncode()
  Add SimpleEncode::EncodeFrameWithQuantizeIndex()
2019-11-20 01:04:56 +00:00
Sai Deng 831536cb3a Merge "Use a better model for tune=ssim" 2019-11-20 00:58:57 +00:00
sdeng e6bfdce30d Use a better model for tune=ssim
Compared to the baseline tune=ssim, the average gains are
PSNR -0.55, SSIM -0.30, MS-SSIM -0.98, VMAF -1.26

Details: (150f VBR)
               PSNR    SSIM  MS-SSIM    VMAF
Lowres       -1.347   0.291   -0.307  -1.291
Midres       -0.628  -0.329   -1.011  -2.173
Hdres         0.781  -0.656   -1.319   0.210
Ugc360p      -2.695  -0.972   -1.503  -4.055
Lowres_bd10   0.074   0.196   -0.623  -0.835
Midres_bd10   0.517  -0.327   -1.124   0.566

Change-Id: Ie034eaedf20e1fe843921cafbb3b7ad9a2bc89d1
2019-11-18 16:57:22 -08:00
angiebird 6a950fee92 Add coding_data_bit_size to EncodeFrameResult
Change-Id: I32ab6829083c896ab2c6234e191939a000dea6e5
2019-11-18 11:37:24 -08:00
angiebird 2b97860f97 Add quantize_index to EncodeFrameResult
Change-Id: Idfb36a8bfa264df8294eba70424fd25fa5d88cda
2019-11-18 11:37:24 -08:00
angiebird 21e93eb814 Pass in infile_path to SimpleEncode()
Change-Id: If47867d4d59a59e252bfe7eb24c940f9e089d335
2019-11-18 11:37:24 -08:00
angiebird 56735c3fdb Add psnr and sse to EncodeFrameResult
Change-Id: I33c410a14b86f95278eff8d1d0e6992f1b82a17d
2019-11-18 11:37:24 -08:00
angiebird c4f1fe4b22 Add SimpleEncode::EncodeFrameWithQuantizeIndex()
Change-Id: I4442de01dfdbf13b0b9f7830f0fb393d3b935522
2019-11-18 11:37:24 -08:00
angiebird 6956e393c7 Add frame_type and show_idx to EncodeFrameResult
Let vp9_get_compressed_data update ENCODE_FRAME_RESULT, a C
version of EncodeFrameResult.
Let the unit test check frame_type and show_idx properly.

Change-Id: Id810c26c826254fd82249f19ab855ea3b440d99c
2019-11-18 11:37:24 -08:00
angiebird f975d02d6d Add EncodeFrameResults
It contains coding_data_size and coding_data.

The EncodeFrame will allocate a buffer, write the coding data into the
buffer and give the ownership of the buffer to
encode_frame_result->coding_data

Change-Id: I6bd86aede191ade1db4a1f1bba5be601eef97d60
2019-11-18 11:37:24 -08:00
angiebird 7dd26ae967 Rename frame_stats by first_pass_stats
This is in simple_encode.cc

Change-Id: I2770e4a229b435f92e1ebe226644d8d104114d29
2019-11-18 11:37:24 -08:00
angiebird 422445a81d Add SimpleEncode::GetCodingFrameNum()
Also add unit tests for GetCodingFrameNum() and EncodeFrame()

Change-Id: I3e7b65f47226be4660409481435f8f784db72a68
2019-11-18 11:37:24 -08:00
angiebird 9330bd71a1 Add SimpleEncode::EncodeFrame()
Change-Id: I08f074b7db2011f88769bd1d9d50cb376c238fe5
2019-11-18 11:37:24 -08:00
angiebird 7ee697a5da Add ComputeFirstPassStats()
Change-Id: Iaed87a4fa35f456aec5d88d07fade636280eb211
2019-11-18 11:37:24 -08:00
angiebird 9f33d7530a Add vp9_iface_common.c
Change-Id: Iac8c31a333a0ae04c9b5f188b3e3b09c25df4046
2019-11-18 11:37:14 -08:00
Angie Chiang d579a85949 Merge changes Id42dbddd,I6dff1bda
* changes:
  Add const to oxcf of vp9_create_compressor
  Add simple_encode.cc/h
2019-11-18 19:33:55 +00:00
James Zern 76d9afc349 vp9_cx_iface: quiet unused fn warning w/CONFIG_REALTIME_ONLY
since:
71684703a Remove output_pkt_list from cpi

Change-Id: I14afae6598051680fdaf8c7509b6705d73789dd6
2019-11-15 20:30:39 -08:00
angiebird 027ead10dc Add const to oxcf of vp9_create_compressor
Change-Id: Id42dbdddae3e0a16022343c89cbc57912297398c
2019-11-15 15:10:57 -08:00
angiebird 04f50db953 Add simple_encode.cc/h
Change-Id: I6dff1bda4bea760a32c2f8e38773e5913c830204
2019-11-15 15:10:50 -08:00
angiebird b0e761f95b Add vp9_update_compressor_with_img_fmt()
Add utility functions
vpx_img_chroma_subsampling
vpx_img_use_highbitdepth

Change-Id: I7b44fdc2cf67bbb49e161fdf778917b9ec0c8832
2019-11-14 18:11:10 -08:00
angiebird ddd80abd3f Add vp9_lookahead_full/vp9_lookahead_next_show_idx
vp9_lookahead_full -  Check if lookahead is full
vp9_lookahead_next_show_idx - Return the show_idx
that will be assigned to the next frame pushed by
vp9_lookahead_push()

Keep track of the show_idx of each frame in the queue

Change-Id: If7ec2c7250f52413e6ce00c5b96f026ebf60a403
2019-11-14 18:11:10 -08:00
angiebird 71684703aa Remove output_pkt_list from cpi
Move the pkt operations to encoder_encode

Change-Id: Ibe730baab61bf7a395998641f106eb0f06d3b8ae
2019-11-14 18:10:44 -08:00
Vitaly Buka 33a8fa870c Move buffer from extend_and_predict into TileWorkerData
This avoids unneeded initializations.

extend_and_predict is called from multiple nested loops, allocates a
large buffer on the stack and uses just a portion of it.
-ftrivial-auto-var-init= inserts initializations which are performed on
multiple iterations of the loops, causing a 258.5% regression on
webrtc_perf_tests decode_time/pc_vp9svc_3sl_low_alice-video.

Bug: 1020220, 977230

Change-Id: I7e5bb3c3780adab74dd8b5c8bd2a96bf45e0c231
2019-11-14 00:02:46 -08:00
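The idea in 33a8fa870c, moving a large scratch buffer out of a hot function's stack frame and into per-worker state, can be sketched as follows. The names TileWorkerSketch, SKETCH_BUF_SIZE and sum_via_worker_scratch are hypothetical, and the buffer size is an arbitrary assumption.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_BUF_SIZE 4096 /* arbitrary size chosen for this sketch */

/* Hypothetical per-worker state. The scratch buffer lives here instead of on
 * the stack, so no automatic-variable initialization is inserted per call. */
typedef struct {
  uint8_t scratch[SKETCH_BUF_SIZE];
} TileWorkerSketch;

/* Uses only the first n bytes of the worker's scratch buffer; the remainder
 * is never touched and never needs to be initialized. */
static int sum_via_worker_scratch(TileWorkerSketch *worker, const uint8_t *src,
                                  size_t n) {
  size_t i;
  int sum = 0;
  if (n > SKETCH_BUF_SIZE) n = SKETCH_BUF_SIZE;
  memcpy(worker->scratch, src, n);
  for (i = 0; i < n; ++i) sum += worker->scratch[i];
  return sum;
}
```

A local array of the same size inside the function would be pattern- or zero-filled on every call when -ftrivial-auto-var-init= is active, which is the cost the commit avoids.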
angiebird 76cdfe2d73 Pack psnr pkt outside of vp9_get_compressed_data
Change-Id: I5549c3dbcbe1550824deaebf03178e38c1b07d54
2019-11-13 13:47:53 -08:00
angiebird 8eb69628c5 Unite vpx_psnr_pkt and PSNR_STATS
Change-Id: Ia2be91a49dfa95906fa2ce232ff9d3a69deda4ad
2019-11-13 13:47:53 -08:00
angiebird 86eb125911 Remove psnr_pkt in LAYER_CONTEXT
It's not used by any code

Change-Id: I30e86c142d4367c7b301f5b19e39c14480d4129b
2019-11-13 13:47:53 -08:00
angiebird 53a0b134a6 Remove the macro of vp9_lookahead_push
Change-Id: Iffc06e53714165cbd8e509ca6d2114e9c4d4ab96
2019-11-13 13:47:52 -08:00
angiebird b1021be915 Add g_timebase/g_timebase_in_ts to oxcf
Use get_g_timebase_in_ts() to set priv->timestamp_ratio
and oxcf->g_timebase_in_ts

Change-Id: Iea9d589cb7e5611067bcedfdf6f5becd4592d3cf
2019-11-13 13:47:52 -08:00
angiebird 733d356fa7 Add frame_rate param to vp9_get_encoder_config
Change-Id: I14a3d076d71240b4ed2436947418aa3177911fc1
2019-11-13 13:47:52 -08:00
Jerome Jiang f216dba557 Merge "example: Enable row-mt on low res and speed 7 8." 2019-11-08 23:19:48 +00:00
Johann Koenig 8d85e79d6e Merge "remove unused vp8_hex_search parameter" 2019-11-08 19:48:42 +00:00
Johann Koenig a4b2025293 Merge "remove unused cpi parameters from firstpass.c" 2019-11-08 19:48:21 +00:00
Johann c08c30348c remove unused vp8_hex_search parameter
BUG=webm:1612

Change-Id: I80765f4ed05fb5d588249e56a018bf8b9828a197
2019-11-07 23:42:02 +00:00
Johann 67e3972faf remove unused cpi parameters from firstpass.c
BUG=webm:1612

Change-Id: I77db5f9f2cb8244cca831b76c00926112c3e0dfe
2019-11-07 23:41:36 +00:00
Johann 3e2562bdb9 remove unused Pass1Encode parameters
BUG=webm:1612

Change-Id: Ifbe5bbba706311057bfc5d5fa9b63e57ac56e398
2019-11-07 23:41:04 +00:00
Jerome Jiang c7aa088ca4 example: Enable row-mt on low res and speed 7 8.
Verified row-mt works for low res and speed 7 8.

Change-Id: I1e7f260fe5cda40a2da80ca47692a5864712ec30
2019-11-06 15:36:15 -08:00
James Zern 17f2474ea4 Merge "test/vp[89]_boolcoder_test: quiet msan warnings" 2019-11-06 20:32:00 +00:00
Angie Chiang 71f9da5c78 Merge changes I341bd674,Ia9a0d71d,I71c1f906,I2e36e07c,I94ee2e85, ...
* changes:
  Refactor check_initial_width
  Move noise_sensitivity to set_encoder_config
  Remove extra function calls in check_initial_width
  Move init_ref_frame_bufs to vp9_create_compressor
  Remove bits_left update in encoder_encode()
  Add vp9_get_encoder_config / vp9_get_frame_info
  vp9_get_coding_frame_num()
  Make [min/max]_gf_interval static under rate_ctrl
  Add rate_ctrl flag
2019-11-06 20:07:49 +00:00
James Zern 7afac52ba6 test/vp[89]_boolcoder_test: quiet msan warnings
the bitreaders may fill beyond what was written to the buffer as an
optimization. the data isn't used meaningfully, but it may trigger a
msan warning.

BUG=b/140939146

Change-Id: Id03cd203b8ee7ecaf6fdfe3f3c9f2ccfec527129
2019-11-05 23:04:06 -08:00
Johann 03b779e99b remove unused vp8dx_receive_compressed_data parameters
BUG=webm:1612

Change-Id: If2dc8a77c8f8bca86ee4b8349091dd1117b42dce
2019-11-04 15:09:50 -06:00
Johann 7e5b40800b ensure ctx is used
Rather than (void)ing ctx, document the case where it might not be used.

BUG=webm:1612

Change-Id: I1f1ba9a3d52b43a6987dbe3afec96fa17101e3bf
2019-11-04 15:05:12 -06:00
Johann 88f6a8925a remove unused mbmi parameter
BUG=webm:1612

Change-Id: I0f982d8269ec50a767efc222d958d37a55d5c77f
2019-11-04 14:59:19 -06:00
Johann 16d4f15782 remove unused simple loopfilter parameters
The simple filter only processes the Y plane.

BUG=webm:1612

Change-Id: I9886ff43ea7f621d8915846cb65f609a9298566d
2019-11-04 14:52:17 -06:00
Johann a653423f23 remove unused postproc parameters
BUG=webm:1612

Change-Id: I92937417403af2c943e903ba66799609ef6ab635
2019-11-04 14:45:13 -06:00
James Zern c82282592e configure.sh,darwin: fix external_build check
disabled external_build will return an incorrect result for a value not
explicitly set on the command line; use ! enabled instead.

fixes ios build

Change-Id: I48dda3a06731bc9809c2266880797e1779e4c01c
2019-10-31 23:16:33 -07:00
angiebird 2b690d6d13 Refactor check_initial_width
1) Rename it to update_initial_width() because it's actually
changing the initial_width

2) Move alloc_raw_frame_buffers out of it.

Change-Id: I341bd6743eb0e1217bdf1bdbd7f67c4ea7d76ee2
2019-10-30 15:51:13 -07:00
angiebird 38739edaa1 Move noise_sensitivity to set_encoder_config
Change-Id: Ia9a0d71dc8a329d00ebf20a82d42cda43e13431b
2019-10-30 14:59:58 -07:00
angiebird 4d307cd7ff Remove extra function calls in check_initial_width
These functions are already called in set_frame_size()

Change-Id: I71c1f906fa4deef7bc630dcff1506f5b57c6d045
2019-10-30 14:40:53 -07:00
angiebird b66b33b524 Move init_ref_frame_bufs to vp9_create_compressor
Change-Id: I2e36e07c273692a08a9c3ebba814882d32d32f8c
2019-10-30 14:36:54 -07:00
Johann 70d8d72ffe darwin: disable compiler checks
When configuring with --enable-external-build the .mk files
are not expected to work. This avoids some spurious warnings
when configuring for darwin targets on other platforms.

Fixed: webm:1535
Change-Id: Idac2b397db1b595ba7ea9231c4eb835b6013abdc
2019-10-30 16:05:26 +00:00
Johann Koenig 553c68dca1 Merge changes I7dd2b487,I6db5b053
* changes:
  support visual studio 2019 (vs16)
  remove old visual studio remnants
2019-10-30 13:07:55 +00:00
angiebird 2bfcc7179b Remove bits_left update in encoder_encode()
It's already updated properly in vp9_init_second_pass()

Change-Id: I94ee2e8536387c94a2abf9a7686011c76489c2f9
2019-10-29 18:52:40 -07:00
angiebird 90a65c2064 Add vp9_get_encoder_config / vp9_get_frame_info
Change-Id: Id5c8b2d69a36d218ec04cd504868ce0efebf6b69
2019-10-29 17:37:48 -07:00
angiebird 65f9ded395 vp9_get_coding_frame_num()
Change-Id: I36fa92d9acfc272fc9a2f700bcd1466e95f1443c
2019-10-29 16:43:34 -07:00
angiebird 46fa3d6b5b Make [min/max]_gf_interval static under rate_ctrl
Change-Id: I0624c4b44a35c760bb00e4d1a07bb0ac2640ea0b
2019-10-29 16:34:18 -07:00
angiebird d693439bed Add rate_ctrl flag
BDRate Changes (negative means improvement)
lowres: 0.565%
midres: 0.361%
lowres: 0.233%
ugc360: -0.242%

Make gop size independent from coding results

Change-Id: I1f54c48b12dc45ee5162ca2527a877c1610528bd
2019-10-29 16:34:18 -07:00
Angie Chiang 7d349b2cab Merge changes Ibde94f52,Iae804fcc,I94f3b93a
* changes:
  Add get_arf_layers()
  Use RANGE in get_gop_coding_frame_num
  Add get_gf_interval_active_range()
2019-10-29 23:14:38 +00:00
Johann e6123709dc always use lf for shell scripts
Ensure scripts do not get crlf endings when checking out
on Windows.

Fixed: webm:1651

Change-Id: I7cb6039c6d600bb57e7fbdb2fdbb84f4040803f5
2019-10-29 14:54:56 -04:00
Johann 59d95e1e71 remove .gitattributes
None of these file patterns match any existing files.

Change-Id: I069bab91fe43887b094d02e6328b00da62706d94
2019-10-29 17:24:59 +00:00
Johann 6316fbdc5e remove .gitattributes filters
These only appear to exist in this repository. Based
on the name they may have been intended to manage
tabs vs spaces.

Change-Id: I2ac1a858f75cb0e5714964cb68e49082c4eb3ca5
2019-10-29 17:24:48 +00:00
Johann daefbf2a6f support visual studio 2019 (vs16)
Fixed: 1633
Change-Id: I7dd2b4873aeb548c7f9ebf7025baf15a8e65c68f
2019-10-29 13:19:43 -04:00
Johann 154da5ba5a remove old visual studio remnants
The oldest supported Visual Studio version has been vs14
since 539dc7649f.

Clean up scripts and remove dead code.

Change-Id: I6db5b053a55d7656275d3d48e35d672c8ce22067
2019-10-29 13:08:06 -04:00
angiebird 7cd76b428a Add get_arf_layers()
Change-Id: Ibde94f52235a37e122e6a548d71cb230e7b28368
2019-10-29 09:56:12 -07:00
angiebird c75cb7beb2 Use RANGE in get_gop_coding_frame_num
Change-Id: Iae804fccd7cca180eef9e6664de70f0930ee2e94
2019-10-29 09:56:12 -07:00
angiebird a82f32d59b Add get_gf_interval_active_range()
Change-Id: I94f3b93a932f351b6c9743932238d7ede2938462
2019-10-29 09:56:12 -07:00
Hien Ho 9b73e21c0d Merge "remove clang flag for integer sanitizer testing" 2019-10-25 16:24:23 +00:00
Angie Chiang a57aa0545d Merge changes I309357fd,I0d170956,I5c7fc771,I6ebb023a,I1f6ef8c6, ...
* changes:
  Make gop size independent from kf_zeromotion_pct
  Add get_frames_to_next_key()
  Rename i by frames_to_key in find_next_key_frame
  Remove input_stats when decide frames_to_key
  Remove twopass param from test_candidate_kf
  Pass first_pass_info/show_idx to test_candidate_kf
  Refactor test_candidate_kf()
  Decide the key frame directly when auto_key is off
  Remove detect_transition_to_still()
  Change the interface of find_next_key_frame
2019-10-24 20:25:05 +00:00
Hien Ho 47a6c557af remove clang flag for integer sanitizer testing
BUG=webm:1615

Change-Id: Idfc86722e744d0c71ad47e284afb9cf9b8474473
2019-10-24 11:02:41 -07:00
Hien Ho d18c0bbe97 Merge "vpx_dsp/x86/avg_intrin_sse2: fix int sanitizer warnings" 2019-10-24 16:39:52 +00:00
James Zern cfe6fa98f7 Merge "vpx_int_pro_col_sse2: use unaligned loads" 2019-10-24 06:36:38 +00:00
Hien Ho 10b475ec67 vpx_dsp/x86/avg_intrin_sse2: fix int sanitizer warnings
Unit Test: VP9/AqSegmentTest, VP9/CpuSpeedTest, AVX2/Loop8Test6Param

implicit conversion from type 'int' of value 59741 (32-bit, signed) to
type 'int16_t' (aka 'short') changed the value to -5795 (16-bit, signed)

BUG=webm:1615

Change-Id: I2e5b688a97c3caa29d4b8a817b95a4986b81a562
2019-10-23 15:55:56 -07:00
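The warning quoted in this commit is reported when a 32-bit int that does not fit in int16_t is narrowed implicitly: the value is reduced modulo 2^16. The following is a minimal sketch of that arithmetic with the narrowing made explicit. The helper name wrap_to_int16 is hypothetical and is not taken from libvpx; the actual fix in the commit may differ.

```c
#include <stdint.h>

/* Hypothetical helper (not from libvpx): keep the low 16 bits of v and
 * reinterpret them as a signed 16-bit value, with every step explicit. */
static int16_t wrap_to_int16(int32_t v) {
  uint16_t low = (uint16_t)((uint32_t)v & 0xFFFFu);
  /* Bit patterns 0x8000..0xFFFF represent the negative half of int16_t. */
  return (low >= 0x8000u) ? (int16_t)((int32_t)low - 0x10000) : (int16_t)low;
}
```

With this helper, wrap_to_int16(59741) yields -5795, the same value the sanitizer message reports.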
Johann Koenig aa86438a0d Merge "simplify darwin autodetection" 2019-10-23 20:22:11 +00:00
Johann Koenig fda86393b8 Merge "add darwin18 target" 2019-10-23 19:23:55 +00:00
Johann 533702b370 simplify darwin autodetection
Use sed to extract tgt_os

Change-Id: I2f7cd290102a2b591c6ae6e40766918b55abff10
2019-10-23 14:17:57 -04:00
Johann 5ed1801f10 add darwin18 target
Fix autodetection on macOS 10.14. Without this it defaults
to generic-gnu.

Change-Id: I19cd4a9f2fb106dff16ab5e38821a5f374add59c
2019-10-23 13:50:36 -04:00
Johann d7822442d4 use a compile time constant for kDataAlignment
const or constexpr should be sufficient for this use but older
versions of gcc fail to expand DECLARE_ALIGNED correctly. Work
around this by using an enum.

Fixed: webm:1660
Change-Id: Ifa4f7585417760f90f9fb28332152019de9f8169
2019-10-23 10:52:38 -04:00
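The enum workaround in this commit relies on an enumerator being a true compile-time constant. The commit concerns C++ test code; the sketch below shows the same idea in C11, where an alignment specifier also requires a constant expression. The value 16, the buffer name, and the helper are assumptions for illustration, and _Alignas stands in for the DECLARE_ALIGNED macro.

```c
#include <stdint.h>

/* An enumerator is an integer constant expression, so it is accepted
 * wherever the compiler needs a compile-time value. The value 16 is an
 * assumption for this sketch. */
enum { kDataAlignment = 16 };

/* C11 alignment specifier standing in for an alignment macro. */
static _Alignas(kDataAlignment) uint8_t aligned_data[64];

/* Returns 1 when the address p is a multiple of kDataAlignment. */
static int is_data_aligned(const void *p) {
  return ((uintptr_t)p % kDataAlignment) == 0;
}
```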
James Zern 849b63ffe1 vpx_int_pro_col_sse2: use unaligned loads
this fixes a segfault when scaling is enabled; in some cases, depending
on the ratio, offsets may become odd.

vpx_int_pro_row_sse2 was updated previously, though the reason wasn't
listed:
54eda13f8 Apply fast motion search to golden reference frame

BUG=webm:1600

Change-Id: I8d5e105d876d8cf917919da301fce362adffab95
2019-10-22 19:58:29 -07:00
angiebird 42ad1ac7a0 Make gop size independent from kf_zeromotion_pct
Change-Id: I309357fd0e008d10b974c9d2603d0712e1aa0bcd
2019-10-22 17:26:52 -07:00
angiebird a263098e3b Add get_frames_to_next_key()
Change-Id: I0d1709562bf96648fbaf2a0dce2dc23b9d2b81f1
2019-10-22 17:26:50 -07:00
angiebird 19f3b9c2ca Rename i by frames_to_key in find_next_key_frame
Change-Id: I5c7fc771f0852d3b9e8b30be34097b13dfbc2513
2019-10-22 17:25:57 -07:00
Hien Ho 2a9698c111 Merge "vpx_dsp/inv_txfm: fix int sanitizer warnings" 2019-10-23 00:22:43 +00:00
angiebird c68e75b940 Remove input_stats when decide frames_to_key
Also remove the corresponding reset_fpf_position

Change-Id: I6ebb023a38627785ff19e161bfe7bbef797fc710
2019-10-22 12:10:36 -07:00
Angie Chiang 96d584e1b8 Merge changes I00697e9a,I9dfc2ba3,I73619051,Ib4d37667
* changes:
  Refactor kf_group_err in find_next_key_frame
  Simplify the logics in find_next_key_frame
  Add get_gop_coding_frame_num()
  Localize zero_motion_accumulator
2019-10-22 18:17:51 +00:00
angiebird ce023d5413 Remove twopass param from test_candidate_kf
Change-Id: I1f6ef8c6d453177e3b48c95434b66480ee19f91d
2019-10-21 18:18:16 -07:00
angiebird 793ae58853 Pass first_pass_info/show_idx to test_candidate_kf
Change-Id: I5c18de464be9981236f95c62391258c4963e469b
2019-10-21 18:04:49 -07:00
angiebird d24a72ef73 Refactor test_candidate_kf()
Replace detect_flash() by detect_flash_from_frame_stats()

Change-Id: Ia4eca1ca553fdb2f4f63ff6f683c79d92fc52556
2019-10-21 17:56:54 -07:00
angiebird d1832eab19 Decide the key frame directly when auto_key is off
Change-Id: I41d107558a8b1d31ef3b263ecc0ec1e1d91c8f7e
2019-10-21 17:20:17 -07:00
angiebird 55ee1fca03 Remove detect_transition_to_still()
Change-Id: I877f55355fc85d67f46bb76e521a19d35d76df09
2019-10-21 17:09:11 -07:00
angiebird 1bd337b1d8 Change the interface of find_next_key_frame
Change-Id: I9c25cbac2953755efa9fd72f59149f26513d1977
2019-10-21 16:51:27 -07:00
angiebird b2e498000f Refactor kf_group_err in find_next_key_frame
Move the computation out of the while loop.

Change-Id: I00697e9a16d5d597c63e5d9895e4ae00efc7a2df
2019-10-21 16:07:28 -07:00
angiebird f2d91e2c24 Simplify the logics in find_next_key_frame
Since the while loop's condition already checks
rc->frames_to_key < cpi->oxcf.key_freq,
it is impossible to have "frames_to_key >= 2 * cpi->oxcf.key_freq"
or "frames_to_key > cpi->oxcf.key_freq".

Hence, this logic is removed.

Change-Id: I9dfc2ba36e1012718c857fc710036e2d30acd3b8
2019-10-21 15:23:46 -07:00
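The reasoning in this commit can be checked with a small stand-alone sketch. It assumes the removed comparisons sat inside the loop body, before the counter is incremented; the function name and loop shape are hypothetical and are not the libvpx source.

```c
/* Hypothetical sketch (not the libvpx source): count how often the
 * removed comparisons would have been true inside the loop. */
static int count_unreachable_hits(int key_freq) {
  int frames_to_key = 1;
  int hits = 0;
  while (frames_to_key < key_freq) {
    /* Inside the body frames_to_key < key_freq holds, so neither
     * comparison below can be true. */
    if (frames_to_key >= 2 * key_freq || frames_to_key > key_freq) ++hits;
    ++frames_to_key;
  }
  return hits;
}
```

Under these assumptions the function returns 0 for any key_freq small enough that 2 * key_freq does not overflow, which is why the branches could be dropped without changing behavior.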
angiebird 31193de1cc Add get_gop_coding_frame_num()
This function will decide number of coding frames and whether to
use altref

Change-Id: I736190512ea92ce5387600712bd0e250ad7cb44c
2019-10-21 12:03:10 -07:00
Johann Koenig 062a43acb6 Merge "Fix AVX-512 capability detection" 2019-10-21 18:24:31 +00:00
Angie Chiang 454877d874 Merge changes I2acc7d6b,I560dccfc,I3fb23f5c,Ifa24a501
* changes:
  Rename num_show_frames by num_coding_frames
  Use compute_arf_boost() in define_gf_group()
  Localize av_err mean_mod_score in define_gf_group
  Move code of deciding gop size into brackets
2019-10-18 19:19:06 +00:00
Birk Magnussen 602d25ec43 Fix AVX-512 capability detection
When checking for AVX support, only the CPU's capabilities and YMM
register support by the OS were queried. For AVX-512 that is
insufficient: ZMM register support by the OS must also be queried,
otherwise the OS will raise an illegal operation exception if the CPU
is capable of AVX-512 but the OS is not.

Change-Id: I3444b19156d5743841de96cecbdaac19cc3f2b3f
2019-10-17 13:16:54 +02:00
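The OS-support part of the check described in this commit can be sketched as a test on the XCR0 register value, which a caller would obtain with the XGETBV instruction. The bit positions follow the Intel architecture manuals; the macro and function names are hypothetical and are not the ones used in libvpx. The sketch takes the XCR0 value as a parameter so that it stays portable.

```c
#include <stdint.h>

/* XCR0 state-component bits; names are hypothetical. */
#define XCR0_SSE_STATE (UINT64_C(1) << 1)       /* XMM registers */
#define XCR0_AVX_STATE (UINT64_C(1) << 2)       /* upper halves of YMM */
#define XCR0_OPMASK_STATE (UINT64_C(1) << 5)    /* k0-k7 mask registers */
#define XCR0_ZMM_HI256_STATE (UINT64_C(1) << 6) /* upper halves of ZMM0-15 */
#define XCR0_HI16_ZMM_STATE (UINT64_C(1) << 7)  /* ZMM16-31 */

/* Returns 1 if the OS saves and restores YMM state (enough for AVX/AVX2). */
static int os_saves_ymm(uint64_t xcr0) {
  const uint64_t need = XCR0_SSE_STATE | XCR0_AVX_STATE;
  return (xcr0 & need) == need;
}

/* Returns 1 if the OS also saves opmask and ZMM state (needed for AVX-512). */
static int os_saves_zmm(uint64_t xcr0) {
  const uint64_t need = XCR0_SSE_STATE | XCR0_AVX_STATE | XCR0_OPMASK_STATE |
                        XCR0_ZMM_HI256_STATE | XCR0_HI16_ZMM_STATE;
  return (xcr0 & need) == need;
}
```

A CPU that reports AVX-512 through CPUID would only be treated as AVX-512 capable when os_saves_zmm() also returns 1 for the current XCR0 value.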
angiebird d1124adef0 Localize zero_motion_accumulator
Change-Id: Ib4d37667c217cb06e6941de7b3204ba71b880396
2019-10-16 13:50:02 -07:00
angiebird ab7974d36d Rename num_show_frames by num_coding_frames
Change-Id: I2acc7d6bde2ec2fae4460869663db1e8f6c576fe
2019-10-16 11:38:53 -07:00
angiebird 10eefccdc3 Use compute_arf_boost() in define_gf_group()
Remove reset_fpf_position() because
compute_arf_boost() does not depend on twopass->stats_in

Change-Id: I560dccfcc4a2cbaa8e78a493a070a416465db4a9
2019-10-16 11:38:45 -07:00
Angie Chiang 2be0ee5f05 Merge changes I1d71908a,Id1b41c3b,I07722c81,I31cf7889
* changes:
  Localize last_loop_decay_rate
  Make get_zero_mtion_factor avoid using cpi
  Add check_transition_to_still()
  Add compute_arf_boost()
2019-10-16 18:32:19 +00:00
angiebird 3c055612c6 Localize av_err mean_mod_score in define_gf_group
Change-Id: I3fb23f5c8df1c3276b663a32556ca800b7ba2ade
2019-10-15 15:13:33 -07:00
angiebird 9d377ff930 Localize last_loop_decay_rate
Change-Id: I1d71908a79ff494c4fb32dab0dc881f7a70bd519
2019-10-15 15:13:33 -07:00
angiebird 8f730cede0 Move code of deciding gop size into brackets
Identify the internal params used for deciding gop size

Change-Id: Ifa24a501952e06e5779a4fd2050dd486083cfa4c
2019-10-15 15:13:33 -07:00
angiebird 1ef50d643b Make get_zero_mtion_factor avoid using cpi
Change-Id: Id1b41c3b77a7eae6c2934efbff2608094ee7b3c5
2019-10-15 15:13:33 -07:00
angiebird 98390f2d4e Add check_transition_to_still()
The behavior is the same as that of detect_transition_to_still(),
except that we avoid using cpi and twopass->stats_in

Change-Id: I07722c817d98d8e4991a0a883235a582db8b5c3c
2019-10-15 15:13:28 -07:00
angiebird 9f291d85b6 Add compute_arf_boost()
Its behavior is the same as that of calc_arf_boost(),
but we avoid using cpi and twopass->stats_in

Change-Id: I31cf7889abf43effcca9004a9d55f4b424ce388a
2019-10-15 12:55:36 -07:00
Hien Ho b285035727 Merge "vp8/decoder/decodeframe: fix int sanitizer warnings" 2019-10-14 17:18:08 +00:00
angiebird 9dadf3189a Correct the num_frams of fps_init_first_pass_info
Note that the last packet holds the cumulative first pass stats,
so the number of frames is the packet count minus one.

Change-Id: I5f617e7eeb63d17204beaaeb6422902ec076caeb
2019-10-11 18:49:27 -07:00
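The off-by-one described in this commit can be written as a one-line helper. This is a sketch only: the function name is hypothetical, and clamping to zero for an empty stats buffer is an assumption that the commit message does not confirm.

```c
/* Hypothetical helper (not from libvpx): the final first-pass packet
 * carries cumulative stats rather than a frame, so it is not counted. */
static int first_pass_num_frames(int num_packets) {
  /* Clamping at zero for an empty buffer is an assumption. */
  return (num_packets > 0) ? num_packets - 1 : 0;
}
```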
angiebird b255d47775 Simplify the logics of computing gf_group_err etc
Move the logics of computing
gf_group_err, gf_group_raw_error, gf_group_noise,
gf_group_skip_pct, gf_group_inactive_zone_rows,
gf_group_inter, gf_group_motion
into one for loop

The behavior stays the same.

Change-Id: Idbc338a88469bf7a2786c831880e8aba8ed4feb5
2019-10-11 17:13:34 -07:00
angiebird 821db08a43 Add calc_norm_frame_score()
The behavior is the same as calculate_norm_frame_score(),
but we avoid using cpi.

Change-Id: I3400abcdd02e041eb3b1ebf402b40b97df00d6f4
2019-10-11 16:43:47 -07:00
angiebird a2f62fec69 Remove mod_frame_err in define_gf_group
Change-Id: I3cefcc797e8756c9d3256321679784a356fc1946
2019-10-11 16:25:33 -07:00
angiebird 4398e01511 Simplify the if clause in define_gf_group
Change-Id: I70a06a4f48c5a215831d8b6e918eebc3041ef65a
2019-10-11 16:11:23 -07:00
angiebird 412547ad4b Refactor calc_frame_boost()
Replace detect_flash() by detect_flash_from_frame_stats()

Change-Id: I31862820926b5167ff70cebe2009c04aa745a019
2019-10-10 15:45:58 -07:00
angiebird df0ec9f0b6 Add first_pass_info in TWO_PASS
This is part of the change that aims to replace
stats_in/stats_in_start/stats_in_end with first_pass_info.

Change-Id: Ibcd2a08e57cb749fe68996f33fe3a5e7f92b1758
2019-10-10 15:44:51 -07:00
angiebird aa49cd4ad4 Refactor get_prediction_decay_rate()
Replace cpi by frame_info
Rename next_frame by frame_stats

Change-Id: I909f01ce724aac13030931970fba8b7b3f4d0080
2019-10-10 15:44:51 -07:00
angiebird 4748129151 Replace cpi by frame_info in get_sr_decay_rate()
Change-Id: I8ed925edb12345042cf3e446095b4ad4acfa11c4
2019-10-10 15:44:51 -07:00
angiebird 16cf728c53 Change the interface of calc_frame_boost
Replace cpi by frame_info and avg_frame_qindex

Change-Id: Ie63526ac9942acf75cc416fcaa0a169838b23322
2019-10-10 15:44:51 -07:00
angiebird 4a014dfdce Use frame_info in calculate_active_area
Change-Id: I16049bef4aee54c915dc5cf181111c5a334b5eaf
2019-10-10 15:44:51 -07:00
angiebird 6b7d221ca6 Add FRAME_INFO into VP9_COMP
Change-Id: Ibc804f2420113010013c04dc005b02dfebdfda8a
2019-10-10 15:44:51 -07:00
angiebird 3a210cae43 Add detect_flash_from_frame_stats()
Change-Id: I06a64b45045334cf9563d27e154a3b8095ad80a3
2019-10-10 15:44:46 -07:00
Jerome Jiang 65c4395237 vp9: fix non bitexact when reuse_inter_pred is 0.
when the best filter selected is not EIGHTTAP_SMOOTH, and
reuse_inter_pred is 0, pred buffer was not pointing to the right place.

Change-Id: I5b519fedd2d892bf140879faa74b463a161e253b
2019-10-04 15:17:36 -07:00
Hien Ho fdbd18a419 vpx_dsp/inv_txfm: fix int sanitizer warnings
with vp9-highbitdepth off.
Unit Test:  SSE2/Trans16x16DCT , VP9/LevelTest.TestTargetLevel20Large, VP9/CpuSpeedTest

implicit conversion from type 'int32_t' (aka 'int') of value -32851
(32-bit, signed) to type 'tran_low_t' (aka 'short') changed the value to
32685 (16-bit, signed)

BUG=webm:1615
BUG=webm:1647

Change-Id: I9ef064dc9ac734379628565ff6505b0876984123
2019-10-04 20:07:58 +00:00
Hien Ho 08da66d5df Merge "vp9/common/vp9_reconinter: fix int sanitizer warnings" 2019-10-04 15:53:02 +00:00
Hien Ho ca42eebf62 vp8/decoder/decodeframe: fix int sanitizer warnings
Unit test: VP8/InvalidFileTest
implicit conversion from type 'int' of value -45844 (32-bit, signed) to
type 'short' changed the value to 19692 (16-bit, signed)

BUG=webm:1615
BUG=webm:1644

Change-Id: Id5d470f706d68e24f7a1e689526c9ecd3a8e8db8
2019-10-03 23:34:50 +00:00
Hien Ho 891c4b3ce6 Merge "vpx_dsp/quantize: fix int sanitizer warnings" 2019-10-03 23:34:20 +00:00
Hien Ho e14acc942b vp9/common/vp9_reconinter: fix int sanitizer warnings
Unit Test: VP9/InvalidFileTest
implicit conversion from type 'int' of value -65536 (32-bit, signed) to
type 'int16_t' (aka 'short') changed the value to 0 (16-bit, signed)

BUG=webm:1615
BUG=webm:1645

Change-Id: I4ce0c6abf8b5bf43ee43e958ad75d9fa28b23eee
2019-10-03 11:56:27 -07:00
Hien Ho 37102e55ec vpx_dsp/quantize: fix int sanitizer warnings
From unit test: AVX/VP9QuantizeTest; SSSE3/VP9QuantizeTest ...
implicit conversion from type 'int' of value -139812 (32-bit, signed)
to type 'tran_low_t' (aka 'short') changed the value to -8740 (16-bit,
signed)

BUG=webm:1615

Change-Id: I730946ac6c7a250dcbcfd8a2712c0f1150ddb4fd
2019-10-03 18:55:56 +00:00
Hien Ho f70f5dbae2 vp9/decoder/vp9_detokenize: fix int sanitizer warnings
From unit test: VP9MultiThreaded/InvalidFileTest
implicit conversion from type 'int' of value 83144 (32-bit, signed) to
type 'tran_low_t' (aka 'short') changed the value to 17608 (16-bit,
signed)

BUG=webm:1615
BUG=webm:1648

Change-Id: I4170494c328596ace66432c8563c55f31745cf76
2019-10-03 18:52:57 +00:00
James Zern 4e2cfb63de Merge "namespace ARCH_* defines" 2019-09-30 21:53:34 +00:00
James Zern fad865c54a namespace ARCH_* defines
this prevents redefinition warnings if a toolchain sets one

BUG=b/117240165

Change-Id: Ib5d8c303cd05b4dbcc8d42c71ecfcba8f6d7b90c
2019-09-30 11:13:29 -07:00
Jerome Jiang 97cd0bd5db Fix python3 format issue
Change-Id: If0fed04c8682baee82efbdf5b4f90bcc8e8ac102
2019-09-27 19:07:20 -07:00
Marco Paniconi 4f69df9969 Merge "vp9-rtc: Fix to speed 4 for real-time mode" 2019-09-24 15:18:42 +00:00
Marco Paniconi a7515c0877 vp9-rtc: Fix to speed 4 for real-time mode
Fix some speed feature settings for speed 4
in real-time mode.

Use rd pickmode (i.e.,nonrd_pick_mode=0), but
use variance partitioning. Allow aq-mode=3 to
work at speed 4 and modify some other speed settings.
This makes it much faster than the current speed 4,
and still better quality than speed 5.

Change-Id: I94ec43ccac022030a75b5a528703be0c37f9a35c
2019-09-23 16:43:52 -07:00
Angie Chiang 8fa9281b78 Merge "Remove USE_PQSORT and CHANGE_MV_SEARCH_ORDER" 2019-09-20 22:52:11 +00:00
Angie Chiang 93c95bf3e1 Remove USE_PQSORT and CHANGE_MV_SEARCH_ORDER
Remove the feature_score related code to simplify the code.
The feature_score is incorporated in get_local_structure and will
be integrated later.
The current non_greedy_mv performances are
lowres: -0.239% midres: -0.569% hdres: -0.365%

Change-Id: Ida28bb1baff6932f1c28b24d371a35a1546fa7e9
2019-09-20 11:33:15 -07:00
Marco Paniconi b8d86733e9 vp9-svc: Fix to forced key frame for spatial layers
Condition to disallow key frames on spatial
enhancement layers should be based on the
first_spatial_layer_to_encode, which need not be
layer 0.

Change-Id: If6bc67568151c38c9c98290e5838a23b3ab18e8a
2019-09-20 08:56:47 -07:00
Angie Chiang 616f02c170 Merge "Remove redundant comment" 2019-09-19 17:47:07 +00:00
Angie Chiang 55564969be Remove redundant comment
Change-Id: I2020d21701ec7a7b018c4063918232098124d033
2019-09-18 15:02:52 -07:00
Angie Chiang 4c79c10416 Merge "Move vp9_alloc_motion_field_info" 2019-09-18 21:52:06 +00:00
Angie Chiang b3a42feb0c Move vp9_alloc_motion_field_info
Move vp9_alloc_motion_field_info out of init_tpl_buffer, so that
vp9_alloc_motion_field_info will be called even when there is
no alternate reference frame.

This fixes the crash with shields_720p50 at bitrate 2000

Change-Id: If2877e8d0b8a834556be12d239b7b58ad1fc8c73
2019-09-18 12:20:51 -07:00
Jerome Jiang f9ffc19ecb Fix msan on svc tests.
BUG=b/140939146

Change-Id: Ib3e714f01c58fc0452c7e1adfc8fd3f1d9f8e0a0
2019-09-16 11:04:32 -07:00
James Zern 8025696407 Merge "vp9_quantize_sse2: quiet clang-7 integer sanitizer warning" 2019-09-11 18:19:07 +00:00
Jerome Jiang c094391e95 vpx_clear_system_state after drop due to overshoot
BUG=999780

Change-Id: I096fdc22812eab22a38a33135c0cbe60a6e64add
2019-09-10 19:31:08 -07:00
James Zern c098cfdce7 vp9_quantize_sse2: quiet clang-7 integer sanitizer warning
nzflag is used as a boolean, it doesn't need to be a sized type, int is
enough (and _mm_movemask_epi8 returns one)

fixes:
vp9_quantize_sse2.c:136:16: implicit conversion from type
'int' of value 65535 (32-bit, signed) to type 'int16_t' (aka 'short')
changed the value to -1 (16-bit, signed)

BUG=webm:1649

Change-Id: I0e3f5278af49d84760f3dfb607f28099cf02f21d
2019-09-10 15:50:22 -07:00
Matt Oliver 540af22e9a project: Add appveyor support for VS 2019. 2019-09-08 21:48:22 +10:00
Marco Paniconi 5a0242ba5c vp9-svc: Add new frame drop mode for SVC
add SVC framedrop mode: Lower spatial layers
are constrained to drop if current spatial layer
needs to drop.

No change in behavior to other existing modes.

Change-Id: I2d37959caf8c4b453b405904831b550367f716ba
2019-09-06 10:16:57 -07:00
James Zern bacb32aef5 Merge "Don't generate mv refs that won't be used" 2019-09-06 08:00:46 +00:00
Angie Chiang 2601db8f14 Merge "Upload Motion Field Estimation Unit Test Files" 2019-09-03 18:23:23 +00:00
Angie Chiang 101f370a42 Report failure of vp9_alloc_motion_field_info
Change-Id: I87f2a8dbf4e89b1cc8526307e82812aea6ac137e
2019-08-30 16:20:36 -07:00
Dan Zhu a3ec6c7fb3 Upload Motion Field Estimation Unit Test Files
Change-Id: Ia8f9e9dca562183ff188cd29dfc7ba3435d77900
2019-08-30 13:21:18 -07:00
Angie Chiang dbe5a1a111 Free set_mv properly
Change-Id: I9b1830dc16189678121c860e0493ed8b04c512a8
2019-08-29 18:38:53 -07:00
Alex Converse 5cfedb745a Don't generate mv refs that won't be used
Spends 25% less time in dec_find_mv_refs for
grass_1_1280X768_fr30_bd8_sub8X8_l31.webm saving 0.7% overall.

Change-Id: I658bb5d6dd8ac82a568c7823dea3f4947ad7ed73
2019-08-28 21:46:15 -07:00
Angie Chiang aa4a3f5759 Merge "Move motion field from TplDepFrame to MotionField" 2019-08-29 00:35:27 +00:00
Jerome Jiang 4baffb97e3 Merge "Add resize test for smaller width bigger size." 2019-08-28 23:41:13 +00:00
Hien Ho bbb7f55d8c Merge "vpx_dsp/x86/highbd_idct4x4_add_sse2: fix int sanitizer warnings" 2019-08-28 20:41:16 +00:00
Angie Chiang f6251cc7a8 Move motion field from TplDepFrame to MotionField
Replace get_pyramid_mv by vp9_motion_field_mi_get_mv.

The goal is to modularize motion field related operations.

Change-Id: I33084e680567ab106659ba9389cc4b507b893c69
2019-08-28 13:39:33 -07:00
Angie Chiang f40c00b206 Merge changes I0fad9437,I79fcb1fd,I93660044
* changes:
  Add MACRO MAX_INTER_REF_FRAMES
  Add motion_filed_info in VP9_COMP
  Add free_tpl_buffer
2019-08-28 20:38:20 +00:00
Jerome Jiang 1b6f2e3f99 Add resize test for smaller width bigger size.
Stack trace is the same as that in the bug.

BUG=webm:1642

Change-Id: I9d88c18a40af8df4a679727620070b13f1606f14
2019-08-28 11:30:20 -07:00
Hien Ho 4973c57fe1 vpx_dsp/x86/highbd_idct4x4_add_sse2: fix int sanitizer warnings
implicit conversion from type 'int' of value 49161 (32-bit, signed) to
type 'int16_t' (aka 'short') changed the value to -16375 (16-bit,
signed)

BUG=webm:1615

Change-Id: I3f18283609ac2ce365202a63ef61a47eb00c155b
2019-08-28 11:20:05 -07:00
Hien Ho c5f298b71f Merge "vp8/encoder/vp8_quantize: fix int sanitizer warnings" 2019-08-28 16:28:41 +00:00
Angie Chiang efd02817da Add MACRO MAX_INTER_REF_FRAMES
Use MAX_INTER_REF_FRAMES wherever it's suitable

Change-Id: I0fad94371a6600099313685cbe38faebb44178c4
2019-08-27 15:57:25 -07:00
Angie Chiang 2981cfac00 Add motion_filed_info in VP9_COMP
Call vp9_alloc_motion_field_info and
vp9_free_motion_field_info properly

Change-Id: I79fcb1fd37ee5e95bf7febb728480583ebd5a065
2019-08-27 15:57:21 -07:00
Angie Chiang d0e5b82084 Add free_tpl_buffer
Change-Id: I93660044880ec08dc716138d492c757d510e0952
2019-08-27 14:46:51 -07:00
Angie Chiang fa2509d729 Merge "Cosmetic changes to vp9_alloc_motion_field_info" 2019-08-27 21:44:26 +00:00
Hien Ho ccc5a6c29a vp8/encoder/vp8_quantize: fix int sanitizer warnings
implicit conversion from type 'int' of value 65536
(32-bit, signed) to type 'short' changed the value to 0 (16-bit, signed)

BUG=webm:1615

Change-Id: I6a04e57bd3272934de9c75fab60a1620ff6c3636
2019-08-27 19:05:53 +00:00
Hien Ho 70041462c8 Merge "test/acm_random.h: int sanitizer warning" 2019-08-27 19:05:19 +00:00
Angie Chiang 305a5283c5 Cosmetic changes to vp9_alloc_motion_field_info
Change-Id: I6808ac11a9a0f2137b30ae66773f6e3dcccef77d
2019-08-26 16:30:57 -07:00
Hien Ho ebadd5287a test/acm_random.h: int sanitizer warning
runtime error: implicit conversion from type 'int' of value
-61240 (32-bit, signed) to type 'int16_t' (aka 'short') changed the
value to 4296 (16-bit, signed)

BUG=webm:1615

Change-Id: I213fc153f0df9ea46737a7fb98d909e670125724
2019-08-26 13:46:08 -07:00
Dan Zhu 1d66ec91da Merge "Add Search Smooth Models[Adapt/Fix]" 2019-08-23 19:01:00 +00:00
Dan Zhu 9931730340 Merge "Add Anandan model" 2019-08-23 19:00:54 +00:00
Dan Zhu 2c62f4002b Merge "Fix some bugs of python code" 2019-08-23 19:00:47 +00:00
Dan Zhu 7fd6a8f186 Merge changes I13f59f52,I7441e041,I7441e041
* changes:
  add unit test for local structure computation
  add unit test for smooth motion field
  modify smooth model(float type mv + normalization)
2019-08-23 19:00:32 +00:00
Angie Chiang 9e2229ec08 Merge "Let do_motion_search process one ref at at time" 2019-08-23 18:30:58 +00:00
Hien Ho d180c3b477 Merge "vpx_dsp/loopfilter.c: fix int sanitizer warnings" 2019-08-23 18:20:21 +00:00
Dan Zhu e436930c39 Add Search Smooth Models[Adapt/Fix]
Change-Id: Ia88d16a14b0525d880ac17a133700431949ece31
2019-08-22 18:21:27 -07:00
Dan Zhu 7f73dee0e5 Add Anandan model
Change-Id: Ic3450125c83b41e7e4a953093b4d8177f04d220a
2019-08-22 18:20:38 -07:00
Dan Zhu e2d0e7fe01 Fix some bugs of python code
Change-Id: I509cbda24d7d0c8dac75209efa40e24c09a107c5

Exhaust: add exhaust search with neighbor constraint
GroundTruth: be able to import motion field variable
MotionEST: use new function names
Util: be able to set the size of image
Change-Id: I36cfdf4b1f28b8190b3ad2be61c241da1347cfc3
2019-08-22 18:15:18 -07:00
Hien Ho 2f52ae2384 test/vp9_quantize_test: fix int sanitizer warning
implicit conversion from type 'int' of value 42126 (32-bit, signed)
to type 'tran_low_t' (aka 'short') changed the value to -23410 (16-bit, signed)

BUG=webm:1615

Change-Id: I339c640fce81e9f2dd73ef9c9bee084b6a5638dc
2019-08-22 23:15:17 +00:00
Hien Ho e6aa05171e vpx_dsp/loopfilter.c: fix int sanitizer warnings
implicit conversion from type 'int' of value -139 (32-bit, signed)
to type 'int8_t' (aka 'signed char') changed the value to 117 (8-bit, signed)

BUG=webm:1615

Change-Id: Ic64959759f4a188087aa24bedbae5f9fa60674ad
2019-08-22 23:13:59 +00:00
Hien Ho 232ff9361e Merge "vpx_dsp/x86/fwd_txfm_sse2: fix int sanitizer warnings" 2019-08-22 23:11:37 +00:00
Dan Zhu 0ad301e5b0 add unit test for local structure computation
Change-Id: I13f59f529204070faf076144124069c3b1180633
2019-08-22 15:07:28 -07:00
Dan Zhu 08012eceef add unit test for smooth motion field
Change-Id: I7441e04190b8a797f3863166e95b3b6c9924ab51
2019-08-22 14:58:55 -07:00
Dan Zhu 0859579d4d modify smooth model(float type mv + normalization)
Change-Id: I7441e04190b8a797f3863166e95b3b6c9924ab50
2019-08-22 14:51:29 -07:00
James Zern f8f496fc1f update libwebm to libwebm-1.0.0.27-363-g37d9b86
clears a compiler warning

changelog:
https://chromium.googlesource.com/webm/libwebm/+log/81de00c..37d9b86

Change-Id: I7a9cce81cb193305220059b12071019d46155be2
2019-08-22 12:37:40 -07:00
Hien Ho da7c5beded Merge "vp8/encoder/bitstream: fix int sanitizer warnings" 2019-08-21 22:14:33 +00:00
Hien Ho 0d4453c8fe Merge "test/lpf_test: fix int sanitizer warning" 2019-08-21 22:13:35 +00:00
Hien Ho 569503fbc9 vpx_dsp/x86/fwd_txfm_sse2: fix int sanitizer warnings
implicit conversion from type 'int' of value 32768 (32-bit, signed)
to type 'short' changed the value to -32768 (16-bit, signed)

BUG=webm:1615

Change-Id: I7cdba7f7e550f62fd3ac31574e49b1909b6ab054
2019-08-21 13:34:14 -07:00
elliottk 9b8acbfee4 Update documentation: CRF works with VPX_Q mode
Tested that if VPX_Q is set, this variable will still be used
to pull the CRF value.

Change-Id: I065a219a7acd18b50478d4d0d3dc7ba5e1c90901
2019-08-20 17:29:21 -07:00
Angie Chiang 7080fc4604 Merge "Add [full/sub]_pixel_motion_search" 2019-08-20 18:04:42 +00:00
Yue Chen 40ea310277 Merge "Add 6:1:1 weighted PSNR to opsnr.stt" 2019-08-20 17:38:25 +00:00
Angie Chiang 14c41cd0b3 Let do_motion_search process one ref at at time
Change-Id: Iee3f2d1fbeddeee27400edb6fe1519c39352901d
2019-08-19 18:15:24 -07:00
Angie Chiang 3e56934883 Add [full/sub]_pixel_motion_search
Change-Id: Idcd3c3178f583b8584e2b34ca2fbe96337feaadd
2019-08-16 15:40:35 -07:00
Angie Chiang f9b9371476 Add MotionField and MotionFieldInfo
Also add related buffer alloc/free functions.

Change-Id: I77dde3dd991f6b21b5c2c1ffa72300ce7738fd50
2019-08-16 14:34:58 -07:00
Angie Chiang b0c89c99ce Add temporary motion_compensated_prediction_new
Temporarily add motion_compensated_prediction_new() to
decouple non_greedy_mv's motion search from baseline.

We need to decouple non_greedy_mv's full pixel motion search and
sub pixel motion search

Change-Id: I1a0e4a170c19b5b718e9d19b62268b520105a0ef
2019-08-16 11:01:28 -07:00
Dan Zhu ff90269431 Merge "estimate local variation of reference frame" 2019-08-16 17:52:34 +00:00
Dan Zhu 3572f39586 Merge "smooth motion field" 2019-08-16 17:52:10 +00:00
Dan Zhu 6e122b6f82 estimate local variation of reference frame
Change-Id: I4218057403ad4f565ee2dcb5403ecaae17af7e26
2019-08-16 09:37:13 -07:00
Dan Zhu 795c9188f2 smooth motion field
Change-Id: I1e8273fa65f7655e49f626863fe457efef23fb54
2019-08-15 16:49:11 -07:00
Angie Chiang 7c1f5208e9 Add test_non_greedy_mv.cc
Change-Id: I7862d39ae52ab016bf6c3ba3aa4b8b1d9760cf27
2019-08-15 16:01:10 -07:00
Yue Chen dc3ded3679 Add 6:1:1 weighted PSNR to opsnr.stt
Change-Id: I6f519ff99bacbe6968d9271a224cc2cbc0958cd8
2019-08-14 14:35:36 -07:00
Hien Ho 0665e9b4ae vp8/encoder/bitstream: fix int sanitizer warnings
implicit conversion from type 'unsigned int' of value 256
(32-bit, unsigned) to type 'unsigned char' changed the value to
0 (8-bit, unsigned)

BUG=webm:1615

Change-Id: I2b630bf22cad28b5a7a8a37f6938e6ebe12bc64e
2019-08-13 23:38:24 +00:00
Hien Ho a7059eca33 test/lpf_test: fix int sanitizer warning
runtime error: implicit conversion from type 'int' of value 65594 (32-bit, signed)
to type 'uint16_t' (aka 'unsigned short') changed the value to 58 (16-bit, unsigned)

BUG=webm:1615

Change-Id: I6046a4a4fc0a108c337153f2c59d5cef5c8dcbd6
2019-08-13 16:50:47 +00:00
Hien Ho 0af40df746 Merge "vp9_rdopt: fix integer sanitizer warnings" 2019-08-12 18:28:11 +00:00
Jerome Jiang e46820c693 Merge "Fix vp9_quantize_fp(_32x32)_neon for HBD" 2019-08-06 04:07:16 +00:00
Jerome Jiang e1f127c4b5 Fix vp9_quantize_fp(_32x32)_neon for HBD
In high bitdepth builds, the Neon code could go out of range because it
uses int16x8_t and vmulq_s16.
The C code always truncates out-of-range values.

Change-Id: I33a968b8d812e3c8477f3a61d84482758a3f8b21
2019-08-05 16:06:46 -07:00
Hien Ho 84d4199077 Merge "vp8/encoder/boolhuff: fix integer sanitizer warnings" 2019-08-02 20:34:03 +00:00
Jerome Jiang e034854618 Merge "Fix saturation issue in vp9_quantize_fp_neon" 2019-08-02 17:04:44 +00:00
Hien Ho c1d630d5c8 Merge "vp9_svc_layercontext.c: fix integer sanitizer warnings" 2019-08-02 15:41:17 +00:00
Hien Ho 9dbbbd1812 Merge "vpx_dsp/bitwriter.h: fix clang integer sanitizer warning" 2019-08-02 15:40:47 +00:00
Jerome Jiang 8894c766c6 Fix saturation issue in vp9_quantize_fp_neon
Change-Id: I7850a5c5aea3633e50e9a2efc8116b9e16383a8f
2019-08-01 14:57:28 -07:00
Angie Chiang 0971d3204d Reduce call num of exhaustive search
The encoding time difference between non_greedy_mv and baseline
is reduced from 51% to 13%

However, there is also a performance impact.

non_greedy_mv performance:
Before this CL
lowres 0.395% midres 0.716% hdres 0.533%
After this CL
lowres 0.242% midres 0.429% hdres 0.305%

Change-Id: I047d6509df504b264981c0b903c0cc955f45b273
2019-07-31 15:07:56 -07:00
Angie Chiang 119dad38e6 Merge "Cosmetic changes of vp9_nb_mvs_inconsistency" 2019-07-31 22:03:27 +00:00
Angie Chiang 4b0bfe8f7d Merge "Change the child classes methods names to align with parent's" 2019-07-31 19:55:29 +00:00
Hien Ho 12cbfbb807 vpx_dsp/bitwriter.h: fix clang integer sanitizer warning
implicit conversion from type 'unsigned int' of value 256 (32-bit, unsigned)
to type 'uint8_t' (aka 'unsigned char') changed the value to 0 (8-bit, unsigned)


BUG=webm:1615

Change-Id: Ia9ac3772021ae492368c650a73846e7d22c8fdfc
2019-07-31 16:05:36 +00:00
Hien Ho 1979d41540 vp9_svc_layercontext.c: fix integer sanitizer warnings
implicit conversion from type 'int' of value -1
(32-bit, signed) to type 'uint8_t' (aka 'unsigned char') changed the
value to 255 (8-bit, unsigned)

BUG=webm:1615

Change-Id: If507e73aea4dccd3914b6470f8d15db3b67300ce
2019-07-30 18:34:11 +00:00
Hien Ho 95ce7c6452 vp9_rdopt: fix integer sanitizer warnings
implicit conversion from type 'int' of value -9 (32-bit, signed) to type
'uint8_t' (aka 'unsigned char') changed the value to 247 (8-bit, unsigned)

BUG=webm:1615

Change-Id: Ic2254ef4312f349ee38ec6e12a56b2cd5714b101
2019-07-30 17:41:42 +00:00
James Zern 0fc1b220a4 sad_test: align exp_sad[]
fixes a crash on win32 in SSE4_1/SAD*

BUG=webm:1637

Change-Id: I9838915dccf8ed435d1326bc43465edd89687c18
2019-07-27 10:40:11 -07:00
Angie Chiang f8c416f7bb Cosmetic changes of vp9_nb_mvs_inconsistency
Change-Id: I41022a2dca996657b64ffb0ede4df3ab6a466ab6
2019-07-24 14:04:08 -07:00
Angie Chiang 59b7f2f36f Merge "Add vp9_non_greedy_mv.c/h" 2019-07-24 21:01:42 +00:00
Marco Paniconi 18d309c127 Merge "vp9-rtc: Add intra speed feature for speed >= 8" 2019-07-23 18:29:48 +00:00
Dan Zhu e7548da489 Merge "Add Horn & Schunck Estimator" 2019-07-23 18:16:37 +00:00
Dan Zhu 5db8ff42db Merge "Add Exhaust Search (Neighbor Constrain) Estimator" 2019-07-23 18:16:24 +00:00
Dan Zhu 8c5896e8dd Merge "Add Ground Truth Estimator" 2019-07-23 18:16:11 +00:00
Dan Zhu 4c89f06b3c Change the child classes methods names to align with parent's
Add comments to explain the coordinate system

Change-Id: Ib87ae479e08b4e3c3e7d9a3d1b4ab30718b42cfd
2019-07-23 11:14:18 -07:00
Dan Zhu 05b0b204a7 Merge "Based Class of Motion Field Estimators" 2019-07-23 18:09:56 +00:00
Dan Zhu c7093c18b0 Add Horn & Schunck Estimator
Add Matrix solver
Fix a little bug in MotionEST

Change-Id: I8513475646f4f02df31b245fa750483449de9407
2019-07-23 09:55:03 -07:00
Dan Zhu d7a2451d48 Add Exhaust Search (Neighbor Constrain) Estimator
Change-Id: I1e306979a0d308285155c152837125fb2036091a
2019-07-23 09:55:03 -07:00
Dan Zhu e4096639ff Add Ground Truth Estimator
Change-Id: Iec6c7e49a64610e33a77c7d5d772e6b063a0f1e0
2019-07-23 09:55:03 -07:00
Dan Zhu 3244f4738d Based Class of Motion Field Estimators
Change-Id: Id01ce15273c0cab0cd61d064099d200708360265
2019-07-23 09:49:59 -07:00
Marco Paniconi c096a249c9 vp9-rtc: Add intra speed feature for speed >= 8
Add intra speed feature to force DC only under intra mode
testing when source sad for superblock is not high.
Feature is only enabled at speed >=8. With this feature
enabled at speed 8 we now allow for H/V intra check as
well for speed 8.

This helps to reduce artifacts for speed 8, by allowing H/V mode
to be checked for blocks when the superblock has high
source sad/content change.

Change-Id: I0495ce96b4cc844e8c625b5183eef180dbaaaa72
2019-07-23 09:43:39 -07:00
Matt Oliver 41039ba75a project: Update for 1.8.1 merge. 2019-07-21 23:02:44 +10:00
Matt Oliver 1329ffa705 Merge commit '8ae686757b708cd8df1d10c71586aff5355cfe1e' 2019-07-21 22:37:08 +10:00
Wan-Teh Chang ad8f8150c0 Merge "Remove unused fb_cb related fields from VP9_COMMON" 2019-07-19 15:45:55 +00:00
Angie Chiang 5362ce31ca Add vp9_non_greedy_mv.c/h
Move vp9_nb_mvs_inconsistency to vp9_non_greedy_mv.c
This is to facilitate following SIMD optimizations.

Change-Id: I8eb8f820368928e0c4fb287e557cddf0bd2c763e
2019-07-18 15:22:44 -07:00
Angie Chiang 6b5db3f9da Merge changes I3216c984,I70d40060
* changes:
  Make vp9_prepare_nb_full_mvs only return valid mvs
  Let vp9_nb_mvs_inconsistency call log2 just once
2019-07-18 22:12:12 +00:00
Wan-Teh Chang 546c273f1a Remove unused fb_cb related fields from VP9_COMMON
Remove the cb_priv, get_fb_cb, release_fb_cb, and int_frame_buffers
fields from the VP9_COMMON struct. They are not being used.

Change-Id: I235194aa8b315cd8ec9405bbba5feb3bee69f7e0
2019-07-18 14:37:32 -07:00
Angie Chiang 706f1f10e0 Make vp9_prepare_nb_full_mvs only return valid mvs
In this case, vp9_nb_mvs_inconsistency doesn't need to check
whether each neighbor mv is valid or not.

non_greedy_mv encoding time is reduced by 1.5%

Change-Id: I3216c98481e777d5e0b917ea20ee39b7ca9c9d23
2019-07-17 13:14:10 -07:00
Angie Chiang ee554c8ceb Let vp9_nb_mvs_inconsistency call log2 just once
The behavior of this function is to compute log2 of mv difference,
i.e. min log2(1 + row_diff * row_diff + col_diff * col_diff)
against available neighbor mvs.
Since log2 is monotonically increasing, we can compute
min row_diff * row_diff + col_diff * col_diff first
then apply log2 in the end

non_greedy_mv encoding time is reduced by 1.5%

Change-Id: I70d40060e2621daec27229f1f6d9fea0286aa04e
2019-07-17 13:14:01 -07:00
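The reordering described in this commit can be sketched as below. This is an illustration under stated assumptions, not the libvpx code: MvSketch and both function names are hypothetical, and an integer floor(log2) stands in for log2 so the example needs no math library. The stand-in is monotonically non-decreasing, so the same argument applies: the minimum of the logs equals the log of the minimum.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical motion vector type for this sketch. */
typedef struct {
  int row;
  int col;
} MvSketch;

/* floor(log2(x)) for x >= 1; integer stand-in for log2. */
static int floor_log2_u64(uint64_t x) {
  int n = 0;
  while (x >>= 1) ++n;
  return n;
}

/* 1 + row_diff^2 + col_diff^2 between two motion vectors. */
static uint64_t sq_dist_plus_one(MvSketch a, MvSketch b) {
  const int64_t rd = (int64_t)a.row - b.row;
  const int64_t cd = (int64_t)a.col - b.col;
  return 1 + (uint64_t)(rd * rd + cd * cd);
}

/* Before: take the log for every neighbor, then keep the minimum. */
int inconsistency_log_per_neighbor(MvSketch mv, const MvSketch *nb, int n) {
  assert(n > 0);
  int best = floor_log2_u64(sq_dist_plus_one(mv, nb[0]));
  for (int i = 1; i < n; ++i) {
    const int v = floor_log2_u64(sq_dist_plus_one(mv, nb[i]));
    if (v < best) best = v;
  }
  return best;
}

/* After: keep the minimum squared distance, then take the log once. */
int inconsistency_log_once(MvSketch mv, const MvSketch *nb, int n) {
  assert(n > 0);
  uint64_t min_d = sq_dist_plus_one(mv, nb[0]);
  for (int i = 1; i < n; ++i) {
    const uint64_t d = sq_dist_plus_one(mv, nb[i]);
    if (d < min_d) min_d = d;
  }
  return floor_log2_u64(min_d);
}
```

Both functions return the same value for any non-empty neighbor list; the second one replaces n log evaluations with one.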
Wan-Teh Chang 548974b293 Merge "Fix comment typos." 2019-07-17 20:13:10 +00:00
Wan-Teh Chang ce13498b2a Fix comment typos.
Fix comment typos in transpose_s16_4x4q() and transpose_u16_4x4q().

Change-Id: I21bcc1fb3fb880798e5a3927c3dbe81dd518c83b
2019-07-17 11:09:55 -07:00
Angie Chiang 291055812b Add vpx_sad32x32x8_c/avx2
Change-Id: I4dbb7b6c8979c39eb6ffb97750e3cca0f4b7921f
2019-07-16 16:46:59 -07:00
Angie Chiang a662247070 Add unit test for vpx_sadMxNx8
Change-Id: Ica85e3738708e2a6cc7388fd2cbf6a8840a540d5
2019-07-16 16:46:55 -07:00
Johann 53dc2d9d96 Merge remote-tracking branch 'origin/orpington'
BUG=webm:1624

Change-Id: I62e7154d95b3361d6184f0448430bed951f15044
2019-07-16 11:35:06 -07:00
Paul Wilkins 3ab1a4aadb Merge "Limit active best quality of layered ARF frames" 2019-07-16 15:06:05 +00:00
Johann 8ae686757b Release v1.8.1 Orpington Duck
BUG=webm:1624

Change-Id: Ibd63b64058e52448e0916939a3f85eb23c8161b6
2019-07-15 14:55:33 -07:00
Angie Chiang d749bc7b33 Merge changes I9288c88d,Ib1ac6f57,I02fac56a,Id6a8b117
* changes:
  Use sdx8f in exhaustive_mesh_search_single_step
  Sync the behavior of exhaustive_mesh_search
  Refactor exhaustive_mesh_search_new
  Simplify code in exhaustive_mesh_search_new
2019-07-15 18:40:10 +00:00
Yunqing Wang bb407a27b2 Merge "Revert "Set up frame contexts based on frame type"" 2019-07-15 18:31:09 +00:00
Yunqing Wang ed5a5a06bd Revert "Set up frame contexts based on frame type"
This reverts commit affd9921e4.

Reason for revert:  Quality regression
(VP9/EndToEndTestLarge.EndtoEndPSNRTest/195 failed)

BUG=webm:1635

Original change's description:
> Set up frame contexts based on frame type
>
> In single layer ARF case, use different frame
> contexts for KF, ARF/GF, LF, OVERLAY update types.
>
> Change-Id: Iebb7f9bb430e483dea1e75fc122b9b67645ce804

Change-Id: I98a4eaa6ec0ae6616ea5ad35d1580501b7422e1b
2019-07-15 17:16:37 +00:00
Angie Chiang 037d67f684 Use sdx8f in exhaustive_mesh_search_single_step
This speeds up non_greedy_mv by 4%

Change-Id: I9288c88db56ea4201a7ec4493ca5c567d76af0f1
2019-07-14 19:09:56 -07:00
Angie Chiang 719fe0bc5f Sync the behavior of exhaustive_mesh_search
Change-Id: Ib1ac6f57519eb4da93e7c75b0c26a372ffc5d524
2019-07-14 19:09:56 -07:00
Angie Chiang ad0f0cbc0c Refactor exhaustive_mesh_search_new
Add the following two functions:
exhaustive_mesh_search_multi_step
exhaustive_mesh_search_single_step

Change-Id: I02fac56a815b091beab2203afce560d7d29aad44
2019-07-14 19:09:27 -07:00
Angie Chiang 43cab8f4e8 Simplify code in exhaustive_mesh_search_new
Change-Id: Id6a8b117b066a56e9312f528ec8f417dd4b2a2d8
2019-07-12 15:10:40 -07:00
Yunqing Wang 468e77b9ea Merge "Adjust the quality of boosted frames" 2019-07-11 15:07:55 +00:00
Yunqing Wang 5372f1e41e Merge "Set up frame contexts based on frame type" 2019-07-11 15:07:35 +00:00
Yunqing Wang 1c875f11f2 Merge "Modify frame context index" 2019-07-11 15:07:17 +00:00
Marco Paniconi 2d63dacfa0 vp9-rtc: Reduce color artifact for speed 8
Push the reduced chroma check to speed > 8.

Change-Id: I92dd0aa9933bb5417b1dc5eef8f805ee51e04ac9
2019-07-10 09:49:41 -07:00
Jerome Jiang 741f5dc77c vp9: Use mb_rows/cols from VP9_COMMON in postproc.
When frame height is not divisible by 16, the calculation of mb_rows in
postproc was wrong.

Change-Id: I69d108f1b8facdd5650b5b7928a0033b268530d2
2019-07-09 16:46:24 -07:00
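The commit above does not say what the wrong calculation in postproc was. One plausible way a macroblock row count goes wrong for heights that are not a multiple of 16 is truncating division, which drops the partial last row. The sketch below contrasts that with rounding up; both function names are hypothetical and the truncating variant is an assumption, not a quote of the old code.

```c
/* Hypothetical helpers: number of 16-pixel macroblock rows (or columns)
 * needed to cover `pixels` pixels. */

/* Truncating division: drops a partial last row. */
int mb_count_truncating(int pixels) { return pixels / 16; }

/* Rounding up: a partial last row still counts as one macroblock row. */
int mb_count_rounded_up(int pixels) { return (pixels + 15) >> 4; }
```

For a 1080-pixel-high frame the truncating form gives 67 rows while 68 are needed to cover the frame; for a height of 720 both forms agree on 45.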
James Zern 19bda215d0 Merge "Remove android_tools deps" 2019-07-02 18:46:30 +00:00
Yun Liu 4029b8cd40 Remove android_tools deps
Bug: 428426
Change-Id: Ia3c31fe2b513ac995baad15c8376c590fd1104f7
2019-07-02 18:29:58 +00:00
James Zern 01fd66d726 vp9_cx_iface,encoder_encode: fix -Wclobbered for pts
Change-Id: Ia7fd4fedb0dcbb626d0e7f4951360e2462b518e2
(cherry picked from commit ae3c6e9ec7)
2019-07-01 15:18:49 -07:00
James Zern cd9f1763c8 Merge "vp9_cx_iface,encoder_encode: fix -Wclobbered for pts" 2019-07-01 22:17:46 +00:00
Marco Paniconi 18acb6ab8e vp9-rtc: Fix color artifacts for speed >= 8
Fix to avoid color artifacts observed for speed >= 8.
In model_rd_large in non_rd pickmode: always do the
transform skipping test for UV plane.

BUG=b/136198713

Change-Id: Idd91322fb898fe731846d8581b21010096f87680
(cherry picked from commit c33c7ca85f)
2019-07-01 17:54:01 +00:00
Marco Paniconi c33c7ca85f vp9-rtc: Fix color artifacts for speed >= 8
Fix to avoid color artifacts observed for speed >= 8.
In model_rd_large in non_rd pickmode: always do the
transform skipping test for UV plane.

BUG=b/136198713

Change-Id: Idd91322fb898fe731846d8581b21010096f87680
2019-07-01 09:52:16 -07:00
Ravi Chaudhary 16de4d7366 Adjust the quality of boosted frames
As the boosted frames, early in key frame interval,
are used as reference by many subsequent boosted frames,
boosted frames that are closer to the reference key frame
should be allocated with more target bits than the rest.
Similarly, the active best quality should be lower for
boosted frames early in the key interval and vice versa.
Hence, the bits allocation and active best quality are varied
based on their temporal position in the key frame interval.

Change-Id: I1362248560d074b9e209657a23ae73dda0b01d52
2019-07-01 18:08:26 +05:30
James Zern ae3c6e9ec7 vp9_cx_iface,encoder_encode: fix -Wclobbered for pts
Change-Id: Ia7fd4fedb0dcbb626d0e7f4951360e2462b518e2
2019-06-29 18:18:15 -07:00
Matt Oliver d28e00f390 project: Add patch showing changes made in this repo. 2019-06-30 04:11:33 +10:00
Dan Zhu ee097c75ce add flags for empty blocks
Change-Id: Iedf3bdd87d203db5163d3cc47fcbef1fd002218f
2019-06-28 14:08:45 -07:00
Hien Ho 7cb611d221 vp8/encoder/boolhuff: fix integer sanitizer warnings
from sanitizer run:
runtime error: implicit conversion from type 'unsigned int' of value 256
(32-bit, unsigned) to type 'unsigned char' changed the value to
 0 (8-bit, unsigned)

BUG=webm:1615

Change-Id: I9321bbd58a305419bc8669ecd7594adc47e8b116
2019-06-28 20:27:38 +00:00
Angie Chiang d69584261d Merge changes I833c82fb,I05a39165,Ie044bb01,I565f477f
* changes:
  Integerize vp9_full_pixel_diamond_new
  Integerize vp9_refining_search_sad_new
  Integerize diamond_search_sad_new()
  Refactor vp9_full_pixel_diamond_new
2019-06-28 17:50:28 +00:00
James Zern 01cac9f374 Merge "vp9_encodeframe: quiet a few integer sanitizer warnings" 2019-06-28 02:52:20 +00:00
James Zern 951e60c6fc vp9_encodeframe: quiet a few integer sanitizer warnings
implicit conversion from type 'int' of value -2 (32-bit, signed) to type
'uint8_t' (aka 'unsigned char') changed the value to 254 (8-bit,
unsigned)

BUG=webm:1615

Change-Id: I9b8f5a9df3211e344e91d67a45d321e7115f5d4a
2019-06-27 16:01:11 -07:00
James Zern d310bc12b7 timestamp_test: enable TestMicrosecondTimebase
this doesn't cause any overflow issues after:
11de1b838 Fix timestamp overflow issues

BUG=webm:701,webm:1614

Change-Id: I7e1cbfa4264d1661eb9a5baa2b2111a0899360f2
2019-06-27 15:21:23 -07:00
Sai Deng 15511892a3 Merge "Change parameters for highbd tune=ssim" 2019-06-27 16:57:05 +00:00
Angie Chiang 184071fe9f Integerize vp9_full_pixel_diamond_new
Change-Id: I833c82fb910c8274b5a237e26fe0dcda7def9796
2019-06-26 16:28:46 -07:00
Angie Chiang 29382c1a06 Integerize vp9_refining_search_sad_new
Change-Id: I05a39165b9910262eca8fdf644ae982b80d309b4
2019-06-26 16:18:01 -07:00
Angie Chiang 6421d1359b Integerize diamond_search_sad_new()
Change-Id: Ie044bb01e26d871bace309ae1f45aa880ea1de62
2019-06-26 16:02:00 -07:00
Angie Chiang 295fd37912 Refactor vp9_full_pixel_diamond_new
Remove redundant bestsme assignments

Change-Id: I565f477f51c2a13369ebd1532eed05115e774238
2019-06-26 15:46:43 -07:00
Angie Chiang 30e7f9d856 Remove mv_dist/mv_cost from new mv search funcs
The functions are
diamond_search_sad_new()
vp9_full_pixel_diamond_new()
vp9_refining_search_sad_new()

Change-Id: Ied6fe98b8a1401c95f0488faf781c5cd5e8e0db6
2019-06-26 15:09:59 -07:00
Angie Chiang 1a363a8cae Speed up diamond_search_sad_new
The percentage of encoding time spent on diamond_search_sad_new
reduces from 8% to 6%

Change-Id: I1be55b957475d780974cc2e721f8c2d4d266e916
2019-06-26 15:09:59 -07:00
Angie Chiang 5ab6039987 Let full_pixel_exhaustive_new return int64_t
Change-Id: I2c7cd7363a1b61b7aa7c35fd9f4e6b926b67418f
2019-06-26 15:09:52 -07:00
Dan Zhu 7e4b8a86b2 script to compact frames to y4m video
Change-Id: I2d8c3ccf49c172a54181aeb2e2b8169bf5402456
2019-06-24 15:34:44 -07:00
Dan Zhu 729da3df8d add output of frame info
Change-Id: I70d750be13d9a654d1f21d7809d8d44c491ae477
2019-06-24 14:46:28 -07:00
Dan Zhu a709b69443 Add Ray Tracing
Add braces

Change-Id: I5355ccd8f745dfbd4fe3923a81aa3c9f8fda07b3
2019-06-24 14:44:10 -07:00
sdeng cb27e6ad05 Change parameters for highbd tune=ssim
With this CL:
             PSNR   SSIM    MS-SSIM
lowres_10bd  2.8    -5.6    -6.5
midres_10bd  2.6    -5.6    -6.3

Before this CL:
             PSNR   SSIM    MS-SSIM
lowres_10bd  6.1    -6.5    -7.7
midres_10bd  6.2    -6.0    -7.2

Change-Id: Iad0ad96d55ad140db00ce86c34ab85461cd963eb
2019-06-24 09:21:28 -07:00
Deepa K G affd9921e4 Set up frame contexts based on frame type
In single layer ARF case, use different frame
contexts for KF, ARF/GF, LF, OVERLAY update types.

Change-Id: Iebb7f9bb430e483dea1e75fc122b9b67645ce804
2019-06-24 18:16:53 +05:30
Deepa K G bd775bfda9 Modify frame context index
Used separate frame contexts for non-boosted frames.
Adjusted the frame context index grouping for boosted
frames.

Change-Id: I7f6f83f53d46f66a83a6806c2b568bd833ce940d
2019-06-24 17:56:15 +05:30
Dan Zhu 20103ec897 Add Scene module to manage other objects
and calculation

Add interpolation in the Scene

Delete Color interpolation

Build triangle mesh

Reconstruct the code of depth interpolation

Add new data structure Node for back linking

Change-Id: Ibb1e896a2e3623d4549d628539d81d79827ba684
2019-06-21 23:01:32 -07:00
Angie Chiang 99be4b1476 Integerize exhaustive_mesh_search_new()
Change-Id: Ia87ed60f46384e7bb7c5f55e9e28c406562a6f19
2019-06-21 11:36:36 -07:00
Angie Chiang af52e4657c Make vp9_nb_mvs_inconsistency return int64_t
Change-Id: I925156ed45e13a06c449c2fbff8a3c26baf8d835
2019-06-21 11:36:36 -07:00
Angie Chiang 0113fc3516 Make type of lambda int in TplDepFrame
Change-Id: I8fdf1ad4790201b1624c8408d92983aeb0b08302
2019-06-21 11:36:05 -07:00
Angie Chiang e728637b8b Integerize log2_approximation()
Change-Id: If645bf6a90f4bfb5a51ca0a78b88d1eb5bedbec2
2019-06-20 19:44:43 -07:00
Johann 7d9288f5f8 vsx: disable on all builds
The previous change to disable some vsx functions did not clear
the test failures. Disable vsx by default until it is investigated
and fixed.

BUG=webm:1522

Change-Id: I8ba2e7261ea3eee5022832da7e4a22bf8daa0996
2019-06-20 10:43:09 -07:00
Jerome Jiang c94dea71ed Merge "vp8: Allow higher resolution to get periodic keyframe." 2019-06-20 17:20:54 +00:00
Angie Chiang cbf0b1688a Merge "Change log2_fast to log2_approximation" 2019-06-20 17:14:15 +00:00
Ravi Chaudhary 087f131851 Merge "Start with q=active_best_quality for non-forced key frames" 2019-06-20 08:57:43 +00:00
Ravi Chaudhary 7f8ff8e377 Start with q=active_best_quality for non-forced key frames
Change-Id: I435d247ab4d1d160f12f5a3710e6cafb5cfd6610
2019-06-20 10:03:36 +05:30
Jerome Jiang 2da0a7ecb1 vp8: Allow higher resolution to get periodic keyframe.
BUG=webm:1632

Change-Id: Ib05a010245e77f9d502c3e7b8f488fca280ea544
2019-06-19 15:26:54 -07:00
Angie Chiang 9f8ee48611 Change log2_fast to log2_approximation
This reduces non_greedy_mv encoding time by 8.9%

Use linear approximation for value >= 1024

BDRate increases slightly on hdres
lowres: -0.002
midres: 0.007
hdres: 0.057

Change-Id: I55fd5e0bf0ab2206a286e11974f701cc48084be8
2019-06-19 12:10:58 -07:00
Dan Zhu aecad5a313 Merge "3D reconstruction tool build by Processing" 2019-06-19 17:13:50 +00:00
Angie Chiang 9d55a8f679 Merge "Implement log2_fast for vp9_nb_mvs_inconsistency" 2019-06-19 16:45:15 +00:00
Yue Chen ac5e269b94 Merge "Fix timestamp overflow issues" 2019-06-18 23:39:09 +00:00
Angie Chiang 7fb0c5cce9 Implement log2_fast for vp9_nb_mvs_inconsistency
This speeds up non_greedy_mv by 8.7%

Change-Id: Ia46e3e7c4d32ec364091fad26cc953c62963e526
2019-06-18 15:13:04 -07:00
Yue Chen 11de1b8381 Fix timestamp overflow issues
- Save the initial user-specified timestamp and rebase all further
timestamps by this value. This makes libvpx internal timestamps to
always start from zero, regardless of the user's timestamps.
- Calculate reduced timestamp conversion ratio and use it to convert
user's timestamps to libvpx internal timestamps and back. The effect
of this is that integer overflow due to multiplication doesn't
happen for a much longer time.

BUG=webm:701

Change-Id: Ic6f5eacd9a7c21b95707d31ee2da77dc8ac7dccf
2019-06-18 10:41:10 -07:00
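The two steps described in this commit, rebasing timestamps to the first user value and converting through a reduced ratio, can be sketched as follows. This is a minimal illustration, not the libvpx implementation: Ratio64, the function names, and the internal tick rate of 10,000,000 per second are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed internal tick rate for this sketch. */
#define SKETCH_TICKS_PER_SEC 10000000

/* Hypothetical reduced conversion ratio: ticks = pts * num / den. */
typedef struct {
  int64_t num;
  int64_t den;
} Ratio64;

static int64_t gcd64(int64_t a, int64_t b) {
  while (b != 0) {
    const int64_t t = a % b;
    a = b;
    b = t;
  }
  return a;
}

/* Build (timebase_num * ticks_per_sec) / timebase_den and divide out the
 * greatest common divisor so later multiplications use small factors. */
Ratio64 make_reduced_ratio(int timebase_num, int timebase_den) {
  assert(timebase_num > 0 && timebase_den > 0);
  Ratio64 r = { (int64_t)timebase_num * SKETCH_TICKS_PER_SEC, timebase_den };
  const int64_t g = gcd64(r.num, r.den);
  r.num /= g;
  r.den /= g;
  return r;
}

/* Rebase by the first user timestamp, then convert to internal ticks. */
int64_t user_pts_to_ticks(Ratio64 r, int64_t first_pts, int64_t pts) {
  return (pts - first_pts) * r.num / r.den;
}

/* Convert back, rounding up so a truncated forward conversion still maps
 * to the original timestamp, then undo the rebase. */
int64_t ticks_to_user_pts(Ratio64 r, int64_t first_pts, int64_t ticks) {
  return first_pts + (ticks * r.den + r.num - 1) / r.num;
}
```

With a microsecond timebase (1/1000000) the ratio reduces from 10000000/1000000 to 10/1, so each timestamp is multiplied by 10 rather than by 10,000,000. Combined with the rebase, a stream whose first timestamp is already very large no longer overflows a 64-bit intermediate on the first frame.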
Jerome Jiang bb9511684f Merge "Fix memory leak for vp8 multi-res encoder." 2019-06-15 00:34:40 +00:00
Harish Mahendrakar 75b0abefd8 vpx_dec_fuzzer: Remove fmemopen dependency
fmemopen is not preferred during fuzzing.
Removed all file operations.

Removed need for allocating a different input buffer.
data buffer is appropriately incremented and passed directly to decoder
This will also test input being sent in an unaligned buffer to the library.

Removed read_frame function and did the required parsing inline.

Change-Id: I32829b0149dba9339f2e8bb4c0249a4987a630c7
2019-06-14 11:08:36 -07:00
Dan Zhu 618a7d68f8 3D reconstruction tool build by Processing
(a java based language for data visualization)

add MotionField module

reformat the code by using newest clang-format version

add necessary comments

add new functions

move basic settings to setup

Change-Id: I64a6b2daec06037daa9e54c6b8d1eebe58aa6de0
2019-06-14 10:23:02 -07:00
Jerome Jiang c06a917501 Fix memory leak for vp8 multi-res encoder.
BUG=webm:1630

Change-Id: I03e74e78aa0ead66eda7506e921b1774b5442ed5
2019-06-14 10:13:26 -07:00
Deepa K G cb2edce5e2 Merge "Use previous ARF as GOLDEN frame for the next GOP" 2019-06-14 06:49:43 +00:00
Johann Koenig 6ae06316c7 Merge "ppc: disable vsx for small predictors" 2019-06-13 22:06:22 +00:00
Johann daf6c47f4e ppc: disable vsx for small predictors
These functions cause test failures when running the entire suite.

BUG=webm:1522

Change-Id: I2c1dc4923e9f149464f365ef63dc59621cfabf5a
2019-06-13 11:45:20 -07:00
Harish Mahendrakar 429d50740e Merge "vpx_dec_fuzzer: Add -fsanitize=fuzzer-no-link" 2019-06-13 18:45:02 +00:00
Johann Koenig 0b1ef32b98 Merge "ppc: disable vsx optimizations with hbd" 2019-06-13 18:40:42 +00:00
Harish Mahendrakar 3ff6417a6e vpx_dec_fuzzer: Add -fsanitize=fuzzer-no-link
Updated build instructions for vpx_dec_fuzzer to include
-fsanitize=fuzzer-no-link while configuring library

Change-Id: Id158256aa1cfe3d847720e8558cb5998ad4fd777
2019-06-12 15:18:50 -07:00
Deepa K G 24e38521a8 Use previous ARF as GOLDEN frame for the next GOP
This patch uses ARF itself as the GOLDEN frame for the
next gf group instead of replacing it with the overlay
frame. By doing so, bits consumed by the overlay frame
will be reduced.

Change-Id: I909ceaa6d501c267d315614075913d45ad426c15
2019-06-11 19:06:00 +05:30
Johann d9f0106d9d sse: remove unused HAVE_SSE files
There are no sse functions which use these files. Cleans up spurious
warnings when building with --disable-sse2

Change-Id: I04d84b8b7ecfe6da7d5d4df63840796c7b04c085
2019-06-10 15:08:40 -07:00
Johann 7bf48f92e8 ppc: disable vsx optimizations with hbd
vsx optimizations do not support 32 bit tran_low_t values.

BUG=webm:1563

Change-Id: I9e6348078f6e4855acfd381133eb840a435b7f81
2019-06-10 14:53:16 -07:00
James Zern 6a7c84a244 update libwebm to libwebm-1.0.0.27-361-g81de00c
81de00c Check there is only one settings per ContentCompression
5623013 Fixes a double free in ContentEncoding
93b2ba0 mkvparser: quiet static analysis warnings

Change-Id: Ieaa562ef2f10075381bd856388e6b29f97ca2746
2019-06-07 15:06:29 -07:00
Jerome Jiang 28cc5f3646 vp8: fix leak in vp8e_mr_alloc_mem
BUG=webm:1596

Change-Id: I09ba00a7b7ad331671a7a285a2ac5630d8b62199
2019-06-06 15:38:13 -07:00
Sai Deng 1140d8f2e2 Merge "Update performance test results for tune=SSIM" 2019-06-06 15:50:14 +00:00
Sai Deng f04764cfe8 Merge "Fix a bug in best RD cost updating" 2019-06-06 06:13:08 +00:00
sdeng ed8e473581 Update performance test results for tune=SSIM
I made a mistake (used the outdated baseline) in the CL I
submitted earlier this week:
https://chromium-review.googlesource.com/c/webm/libvpx/+/1638854

The corrected results are as follows:
The additional gains/loss on top of the tune=ssim are:
Data Set   Overall PSNR   SSIM       MS-SSIM
 Lowres       3.490      -3.164      -2.267
 Midres       2.245      -2.270      -2.287
 HDres        2.562      -1.804      -1.681
Lowres_10bd   3.477      -2.399      -2.689
Midres_10bd   3.467      -1.534      -1.636

The overall gains/loss comparing to tune=psnr are:
Data Set   Overall PSNR   SSIM       MS-SSIM
 Lowres       6.127      -5.818      -4.783
 Midres       4.574      -5.383      -6.242
 HDres        4.908      -6.218      -7.106
Lowres_10bd   6.115      -6.212      -7.790
Midres_10bd   6.238      -6.064      -7.249

Change-Id: Iae72482f7b30f200e5021a98c920eed841d0972a
2019-06-06 03:51:52 +00:00
sdeng 1f11352c07 Fix a bug in best RD cost updating
This CL fixes a bug where we sometimes calculate the best rd cost using an
uninitialized rd_div. This CL also includes a small refactoring of
rd_pick_partition().

Speed change: (the smaller the better)
 Performance counter stats for './vpxenc park_joy_480p.y4m --limit=50
 -o output.webm':

with this CL:       297,086,181,136      instructions:u
without this CL:    299,285,835,104      instructions:u

Quality change: (negative is better)
          avg_psnr ovr_psnr  ssim
(low_res)  0.007     0.005  -0.002
(mid_res)  0.022     0.028   0.007
(hd_res)  -0.008    -0.003  -0.014

Change-Id: I8924d8426364304212bcef3aba13346783e6f1a8
2019-06-05 20:39:03 -07:00
James Zern cb704b95e3 configure: test -Wno-* flags used with libyuv
with g++ this avoids:
command line option ‘-Wno-missing-prototypes’ is valid for C/ObjC but
not for C++

the flag is necessary with clang.

BUG=webm:1584

Change-Id: I250c76483302d913999e5f9e0d09ee6449b052df
2019-06-04 21:57:47 +00:00
James Zern 54abc9bade Merge changes Ib73136b2,Ie514f663
* changes:
  configure: enable -Wmissing-declarations for more files
  vp9_thread_test: quiet -Wmissing-prototypes
2019-06-04 21:12:50 +00:00
Marco Paniconi 85da20aab7 Merge "vp9-rtc: Use speed 5 for postencode drop tests." 2019-06-04 18:35:15 +00:00
Marco Paniconi 3307b009c7 vp9-rtc: Use speed 5 for postencode drop tests.
Test was running at speed 4, which is not used for real-time.
With this change all Datarate tests are now running at
(speed >= 5, 1 pass, real-time mode), which is what they
were intended for.

BUG=webm:1512

Change-Id: I47a721dadd24b73df722c44419df7cfc06c44226
2019-06-04 10:11:59 -07:00
Sai Deng 1d13a6c321 Merge "Hierarchical rdmult scaling when tune=ssim" 2019-06-04 16:10:51 +00:00
James Zern 5bdf43407a configure: enable -Wmissing-declarations for more files
avoid using it with third_party/libyuv as that still requires some work.

BUG=webm:1584

Change-Id: Ib73136b22c89d927b112364e19d725c51768bbb7
2019-06-03 17:20:35 -07:00
sdeng b1072a793d Hierarchical rdmult scaling when tune=ssim
Use different lagrangian multiplier scaling factor for different block
size. The blocks whose sizes are less than 16x16 share the same multiplier
of their parent block.

The additional gains/loss on top of the tune=ssim are:
Data Set   Overall PSNR   SSIM    MS-SSIM
Lowres         2.918     -3.691   -2.596
Midres         1.708     -2.656   -2.624
HDres          1.619     -2.496   -2.391
Midres_10bd    1.518     -3.263   -3.561

The overall gains/loss comparing to tune=psnr are:
Data Set   Overall PSNR   SSIM    MS-SSIM
Lowres         5.583     -6.208   -4.978
Midres         4.024     -5.610   -6.411
HDres          4.102     -6.614   -7.457
Midres_10bd    4.647     -7.181   -8.614

Change-Id: I0e6c5008488734e979b2dacde9fc2a17f3aa620f
2019-06-03 17:15:02 -07:00
James Zern ca440c0a5f vp9_thread_test: quiet -Wmissing-prototypes
BUG=webm:1584

Change-Id: Ie514f6630acfb018a3ac4a05758c8b4119ae28fa
2019-06-03 17:08:55 -07:00
Jerome Jiang 13dc6d671c Merge "Remove unused func for CONFIG_REALTIME_ONLY" 2019-06-03 19:54:28 +00:00
Jerome Jiang 42a83da94c Remove unused func for CONFIG_REALTIME_ONLY
Change-Id: I503e147e20e5b69b910c425d169e59821874f627
2019-06-03 10:11:34 -07:00
Sai Deng 96a024a12a Merge changes I7b1d1482,I01588758,I6f17864e
* changes:
  Update rdcost using the rd_mult in current block
  Use distortion and rate of best_rd as the params
  Use distortion and rate recursively in rd_pick_partition()
2019-06-03 16:19:06 +00:00
sdeng 6cf62dcf8a Update rdcost using the rd_mult in current block
This CL is a preparation for implementing hierarchical SSIM rdmult scaling.
There is very little impact on metrics and speed:
       avg_psnr ovr_psnr  ssim
midres   0.009   0.009   0.015

perf stat -e instructions:u ./vpxenc park_joy_480p.y4m --limit=50
with this cl: 317,722,808,461
before:       317,700,108,619

Change-Id: I7b1d1482ac69f7bc87065a93223a0274bcbe8ce3
2019-06-01 15:12:39 -07:00
sdeng 5cfaf561d9 Use distortion and rate of best_rd as the params
Also added rd calculation for negative rates and distortions.
This CL is a preparation for implementing hierarchical SSIM rdmult scaling.

Little impact on quality and speed:
            avg_psnr  ovr_psnr   ssim
(mid_res)    -0.015    -0.009   -0.018

perf stat -e instructions:u ./vpxenc park_joy_480p.y4m --limit=50
with this cl: 317,700,108,619
before:       317,669,279,763

Change-Id: I01588758b7be2aab32236440ec0e57d7af56e920
2019-06-01 15:11:24 -07:00
Jerome Jiang f0ff0600d0 Merge "Remove RD code for CONFIG_REALTIME_ONLY in vp9." 2019-06-01 00:07:27 +00:00
Jerome Jiang da24d35132 Remove RD code for CONFIG_REALTIME_ONLY in vp9.
This reduces vp9 only binary size by ~5.7%.

Change-Id: I57e46baf591d68b0a0cecbc9319a1190df8b0457
2019-05-31 14:57:01 -07:00
sdeng 7d670d88fe Use distortion and rate recursively in rd_pick_partition()
This CL is a preparation for implementing hierarchical SSIM rdmult scaling,
There is very little impact on metrics and speed:
       avg_psnr ovr_psnr  ssim
midres   -0.04   0.005    0.012

perf stat -e instructions:u ./vpxenc park_joy_480p.y4m --limit=50
with this cl: 317,669,279,763
before:       317,717,562,045

Change-Id: I6f17864e7b17aad06a04ae4f470f75e975549db9
2019-05-31 14:24:11 -07:00
James Zern 3359c1e5dd vp8: restrict 1st pass cpu_used range
< 4 isn't meaningful in the first pass; additional analysis will be
done, but thrown out, unnecessarily increasing the runtime.

Change-Id: Ic3de77e3eaa7a8a3371f76f84693e9655c60fdba
2019-05-30 20:16:46 -07:00
James Zern 78dabb0557 libvpx,vp9_datarate_test: drop one-pass vod mode
this test is only useful for realtime mode testing given the number of
frames and that one-pass vod has never been a primary focus for
development.

BUG=webm:1512

Change-Id: I23208393a5fcc5bcf9b267fab4b0d1aad500918a
2019-05-30 20:15:11 -07:00
Aidan Welch 8c89570488 Merge "added error logging to video_writer.c similar to video_reader.c" 2019-05-30 01:50:10 +00:00
Johann Koenig f7a07b60ad Merge "remove unused svc exports" 2019-05-29 22:16:48 +00:00
Aidan Welch c3312cb6ca added error logging to video_writer.c similar to video_reader.c
Change-Id: Ib56b3e309113574a69ae09db1ee5b0fcc14ebe88
2019-05-29 21:34:48 +00:00
Johann 8198750c15 remove unused svc exports
The spatial svc implementation has moved outside the library:
commit ed8f189ccc
  Refactor: move svc example files to from vpx/ to  examples/

BUG=webm:1629

Change-Id: I31c3ae7b20a6bd50615d1d6e48d4f93beca939e6
2019-05-29 21:14:39 +00:00
Deepa K G f2aaf64bfd Merge "Fix calculations in GF only group case" 2019-05-29 08:02:07 +00:00
Deepa K G 75e31e41d4 Merge "Increase the bits allocated to key frame" 2019-05-29 08:01:23 +00:00
Marco Paniconi 0308a9a132 Merge "vp9-rtc: Update overshoot_detection speed feature" 2019-05-28 15:32:10 +00:00
Paul Wilkins baa2fa7fe7 Merge "Fix section intra rating for first ARF interval" 2019-05-28 10:40:48 +00:00
Marco Paniconi 93bb9d8328 vp9-rtc: Update overshoot_detection speed feature
Keep the overshoot_detection_cbr_rt to the fast mode
(FAST_DETECTION_MAXQ), except for low-resoln at speed 5,
for non-screen content.

The increase in encode time (from using the more accurate
RE_ENCODE_MAXQ) is acceptable for speed 5 at low resoln.

Change-Id: I3089d1505553154ef046056465bc18130f7bd55a
2019-05-27 09:13:20 -07:00
Deepa K G e3a061200e Fix calculations in GF only group case
- Fix the number of frames considered in calculation of
  twopass active worst quality. For GF only group, frames
  considered should be one less than baseline gf interval
  accounting for the golden frame.
- Fix in calculation of normal_frames. As baseline gf
  interval includes the golden frame, the number of
  normal frames should be one less than baseline gf
  interval.

Change-Id: Ic752f7d13d23772687e2fa407698766b3fdf5c67
2019-05-27 18:02:00 +05:30
James Zern 73ab008d26 Merge "Revert "Fix calculations in GF only group case"" 2019-05-25 23:41:30 +00:00
James Zern b4dcfedde5 Revert "Fix calculations in GF only group case"
This reverts commit c87ff4a09d.

Reason for revert: causes division by zero

Original change's description:
> Fix calculations in GF only group case
> 
> - Fix the number of frames considered in calculation of
>   twopass active worst quality. For GF only group, frames
>   considered should be one less than baseline gf interval
>   accounting for the golden frame.
> - Fix in calculation of normal_frames. As baseline gf
>   interval includes the golden frame, the number of
>   normal frames should be one less than baseline gf
>   interval.
> 
> Change-Id: I6c0cd0a39db23586fc390a6fba5d7aebc0dfce08

Change-Id: I522da652587ae7ca4177f6d4bb9f72abcff35637
2019-05-25 19:55:35 +00:00
Jingning Han 1541211f7e Merge "Increase active best quality linearly" 2019-05-24 16:35:27 +00:00
Paul Wilkins f95f687e22 Merge "Fix calculations in GF only group case" 2019-05-24 15:38:56 +00:00
Venkatarama Avadhani e9a30997d9 Merge "Exclude VP9 files from vpx_dsp.mk for VP8 build" 2019-05-24 05:22:16 +00:00
Jingning Han d17961f0fe Merge "Clamp for min_frame_target" 2019-05-22 16:42:54 +00:00
Venkatarama NG. Avadhani cab60a0234 Exclude VP9 files from vpx_dsp.mk for VP8 build
Change-Id: Ifab64a783c205cc79b841a3f77fb77b156b23b23
2019-05-22 16:41:56 +00:00
Jingning Han 85493c1a0a Clamp for min_frame_target
Apply the minimum frame size clamp for all applicable frames. This
avoids bit-rate undershooting issue as reported in

BUG=b/133260125

Change-Id: I59ec028eee999ad5238602adf96465af7c4f4514
2019-05-21 16:47:24 -07:00
Deepa K G 869d82d656 Increase the bits allocated to key frame
Based on the spatial complexity, increase the
bits allocated to key frame.

Change-Id: I4f96990a13bcc3bdb7a22d50e67e2bd622f1ff7b
2019-05-21 12:15:43 +05:30
Marco Paniconi 197827edb8 vp8: Disallow copy flag behavior under forced refresh
Don't allow the setting of copy_buffer_to_arf when the
application/user sets the refresh/update flags. Add new flag
(ext_refresh_frame_flags_pending) to indicate user sets the flags.

Change-Id: I482098c0f2552b04885132a728629ab3e207f08b
2019-05-17 14:58:16 -07:00
Marco Paniconi a8fa1bde72 vp9-rtc: Increase qp thresh for overshoot detection
For video mode (non-screen) in CBR real-time mode:
increase the qp thresh to trigger setting to active_worst
on scene changes. Avoid big overshoots in content with
scene changes.

Change-Id: I74721b07b0d7b742cbef468ece70cca7da0f89eb
2019-05-17 10:55:20 -07:00
Deepa K G a612e4b493 Limit active best quality of layered ARF frames
For higher layer ARF frames, limit active best
quality to the qindex of the lower layer ARF
frame.

Change-Id: I957cbd8ae02313cbc94eda2175e63a26d788459a
2019-05-17 14:23:47 +05:30
Ravi Chaudhary b0509e8868 Increase active best quality linearly
The ARF frames in last few gf intervals, would be
used as a reference by fewer ARF frames in the same
kf interval. Also, the ARF frames in the last GF
group would not be used as a reference in future.
Hence the active best quality for these ARF frames
is increased based on their temporal distance from
the next key frame.

Change-Id: Ice7eaa8a25384104b1d9cc021eec588c03053fc2
2019-05-16 17:00:55 +05:30
Deepa K G c87ff4a09d Fix calculations in GF only group case
- Fix the number of frames considered in calculation of
  twopass active worst quality. For GF only group, frames
  considered should be one less than baseline gf interval
  accounting for the golden frame.
- Fix in calculation of normal_frames. As baseline gf
  interval includes the golden frame, the number of
  normal frames should be one less than baseline gf
  interval.

Change-Id: I6c0cd0a39db23586fc390a6fba5d7aebc0dfce08
2019-05-16 11:52:52 +05:30
Deepa K G 4dce2d0f7d Fix section intra rating for first ARF interval
The section intra rating used for the frames in the
first ARF interval was based on entire key frame
interval. However, for subsequent ARF intervals it was
based on that ARF interval. This discrepancy is fixed.

Change-Id: I3df358861d720e536c9c6f15da1cbd78f2dfffbc
2019-05-16 11:43:14 +05:30
Johann Koenig 3a1d99b3ef Merge "Revert "disable row mt test"" 2019-05-15 18:40:02 +00:00
Johann Koenig 88dd7f97f2 Revert "disable row mt test"
This reverts commit 6d6cc17dc8.

Reason for revert:
This has not been reproduced on hardware. There is a strange
libc bug which may account for the behavior on arm because
the environment qemu is using is somewhat old. See discussion
on the webm bug.

To work around the failures in the nightly test the jenkins
job has been switched to use the hardfloat compiler and qemu
environment. Even though this is the same version, it has
not shown the hanging behavior.

Original change's description:
> disable row mt test
> 
> deadlock is being investigated in attached bug.
> 
> BUG=webm:1626
> 
> Change-Id: Ia6d7020b8b1d274433aa89f36c9ed5b9facc5808

Bug: webm:1626
Change-Id: I104a82696a4c90bfbadfd39407c073adce73af0d
2019-05-15 17:06:22 +00:00
Deepa K G 74dfa752ea Merge "Increase the active best quality in CQ" 2019-05-15 04:26:02 +00:00
Jerome Jiang 78c44e2dc2 Reland "vp9: Enable ml based partition for speed>=8 low res."
Disable in high bitdepth build.

This reverts commit 152358da77.

Change-Id: I9996d0963915ed4db0fde80c6290d91b3ce63719
2019-05-13 17:10:50 -07:00
Paul Wilkins b4d593381b Merge "Fix update of mb_smooth_pct and mb_av_energy" 2019-05-13 08:08:37 +00:00
Jerome Jiang 4d0fe85c19 Merge "Revert "vp9: Enable ml based partition for speed>=8 low res."" 2019-05-10 23:15:04 +00:00
Johann Koenig 7877b911a4 Merge "disable row mt test" 2019-05-10 22:57:50 +00:00
Jerome Jiang 0259778a0f Merge "Cast buffer offset to int64_t" 2019-05-10 21:38:16 +00:00
Jerome Jiang 152358da77 Revert "vp9: Enable ml based partition for speed>=8 low res."
This reverts commit eed8d47769.

BUG=chromium:946409

Change-Id: Iaf9929de841445f63e93792d1fee06d9a1035ef4
2019-05-10 14:29:12 -07:00
Jingning Han 9adb2a4945 Merge "Assign perceptual AQ mode as 5" 2019-05-10 21:28:47 +00:00
Jingning Han ba11607161 Assign perceptual AQ mode as 5
Change-Id: I8f301fab3bedcd71588c57ccd6e49dcb7042e220
2019-05-10 13:25:32 -07:00
Jerome Jiang b90eef04ef Cast buffer offset to int64_t
To prevent integer overflow with extreme frame sizes.

Change-Id: Ib77f1c11f0264257d9e6c162f474d637592e7b09
2019-05-10 11:46:19 -07:00
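As a rough illustration of the overflow the commit above guards against, here is a minimal Python sketch. The helper `wrap_int32` and the row and stride values are hypothetical and are not taken from the libvpx source. The sketch models a 32-bit product as two's-complement wraparound (in C, signed overflow is formally undefined behavior) and compares it with the same product held in a 64-bit value.

```python
def wrap_int32(value):
    """Model two's-complement wraparound of a 32-bit signed int."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# Hypothetical extreme frame: row index and stride chosen only for illustration.
row, stride = 40000, 65536

offset_32 = wrap_int32(row * stride)  # what a 32-bit product would hold
offset_64 = row * stride              # what an int64_t product would hold

print(offset_32, offset_64)
```

With these assumed values the 32-bit result is negative, so a buffer pointer advanced by it would move backwards instead of forwards.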
Jerome Jiang dd9ded767b Reland "vp9-rtc: tx_size selection for intra mode in nonrd"
Reland this change since tsan failure is fixed.

Change-Id: I20e3d3d23e34befabb43a36d491d27dfc2a908b6
2019-05-10 09:47:52 -07:00
Jerome Jiang e742d5ac09 Fix tsan failure in webrtc test.
plane block size is used when computing model rd for uv.
However, it iterates thru sub-blocks based on tx size on uv planes
and plane block size could be bigger than that, which leads to reading
beyond tile boundary when the block is on it.

BUG=b/131414589

Change-Id: I362091484b1325b89d2175039323b235a06ebffc
2019-05-09 21:04:11 -07:00
Johann 6d6cc17dc8 disable row mt test
deadlock is being investigated in attached bug.

BUG=webm:1626

Change-Id: Ia6d7020b8b1d274433aa89f36c9ed5b9facc5808
2019-05-09 11:57:30 -07:00
Jingning Han 14cc2c4709 Merge "Fix key frame detection" 2019-05-08 21:29:07 +00:00
Jingning Han 0df9a18edf Fix key frame detection
This solves the regression issue seen in certain animation clips.

BUG=b/132108583

Change-Id: Ib28413c95160a5f15fbcf9ea6a322fd4f69a57ce
2019-05-08 13:16:10 -07:00
Johann Koenig 8e9a41b30c Merge "android: clarify RTCD usage" 2019-05-08 16:21:24 +00:00
Paul Wilkins 362ca74ef4 Merge "Avoid two GF only groups just before a kf" 2019-05-08 10:55:25 +00:00
Jingning Han 1e2cfa3f0f Merge "Cap arf boost in perceptual quality mode" 2019-05-07 05:29:01 +00:00
Jingning Han fcefd4e1d1 Merge "Increase min arf boost from 240 to 250" 2019-05-07 05:28:43 +00:00
Jerome Jiang da4c8bd04f Merge "vp8: Remove duplicated code in datarate tests." 2019-05-07 04:01:44 +00:00
Jerome Jiang c7dcd6f5d9 vp8: Remove duplicated code in datarate tests.
Duplicated code between *Large and other tests.

Change-Id: I0cea7472c3520175339bc921dfd8a090b5d5484d
2019-05-06 20:05:33 -07:00
Angie Chiang 950fecd01c Add mismatch_debug tool
Change-Id: I045b4cf625d428109688303ced5433d824df2790
2019-05-06 16:09:10 -07:00
Jerome Jiang 1cbcb820ac Merge "Revert "vp9-rtc: tx_size selection for intra mode in nonrd"" 2019-05-06 22:14:58 +00:00
Jingning Han c3689f7ad8 Cap arf boost in perceptual quality mode
When the perceptual AQ mode is enabled, cap the ARF boost to 2.5x
of the regular frame. This allows more consistent frame quality
across consecutive frames and sufficient bit rate allocation at
frame level for AQ mode.

Change-Id: I10f5e2860a3e4b412efe25cca635405bae293ebf
2019-05-06 14:20:24 -07:00
Jingning Han a099d24707 Increase min arf boost from 240 to 250
This imposes nearly zero change in low/mid/hd res test sets.

Change-Id: I121716b96263f2a382d35e7ff05ed8b72e5e6bc7
2019-05-06 14:20:21 -07:00
Johann Koenig e8b07651ab Merge "android: do not attempt standalone builds" 2019-05-06 20:38:04 +00:00
Harish Mahendrakar 152d97e67b Merge "Exclude VP9 assemblies from VP8 builds" 2019-05-06 19:49:56 +00:00
Johann 6522c56c8c android: clarify RTCD usage
Note that when using --disable-runtime-cpu-detect the developer
must keep in mind what devices the library will be run on.

BUG=webm:1623

Change-Id: I0359e226bb678f8e5145bb30cd1cefc7e30c6c79
2019-05-06 12:48:50 -07:00
Jerome Jiang 9acae8a7a3 Revert "vp9-rtc: tx_size selection for intra mode in nonrd"
This reverts commit cdd40d1cd0.

Causes tsan failure in webrtc tests.

BUG=b/131414589

Change-Id: I04f98153bc1f9d013d3d1eb8d06df312fe12f8b4
2019-05-06 12:29:02 -07:00
Johann f30c419d4a android: do not attempt standalone builds
arm builds require too many tweaks to keep up with changes
to the ndk. Recommend ndk-build instead.

Update documentation and drop --sdk-path references. If
--enable-external-build is used instead we do not need the compiler
path.

BUG=webm:1622

Change-Id: Id024345afd7af988321f8f97ebab19c425cb0493
2019-05-06 11:48:45 -07:00
Ravi Chaudhary dddbe22632 Fix update of mb_smooth_pct and mb_av_energy
Values of mb_smooth_pct and mb_av_energy have been updated
correctly in vp9_rc_get_second_pass_params for higher layer
ARF frames.

Change-Id: Ic176e393eb8cc5f418235fee9accee84e9809607
2019-05-06 10:11:02 +00:00
Venkatarama NG. Avadhani 7ef71a46f9 Exclude VP9 assemblies from VP8 builds
Add a macro to exclude VP9 specific assembly files from build if VP9
is not configured. This would otherwise cause a linking error for VP8
only builds.

BUG=webm:1625

Change-Id: I6d892b7c2837a2574538d18b776fd2b6d706da96
2019-05-06 15:14:08 +05:30
Jerome Jiang 3fd96f7d7d Merge "vp8: clamp uv mv after calculation." 2019-05-03 18:13:46 +00:00
Deepa K G 2b8e273105 Avoid two GF only groups just before a kf
Trap the case where we end up with two short GF only groups just
before a key frame. For example, if the KF is 22 frames away
we are better doing one ARF group of size 16 followed by a GF
only group of 6 than two GF only groups of size 11 (when
min_gf_interval is 12).

Change-Id: Ie598a8a21c6e104cbe381b4792e77fd92d047725
2019-05-03 11:01:52 +05:30
Jerome Jiang 8a800b52ea vp8: clamp uv mv after calculation.
BUG=oss-fuzz:14478

Change-Id: Ia978a1e7829bf486681385cd715ed0b50fe3b072
2019-05-02 16:24:27 -07:00
Johann Koenig 14e1b23528 Merge "vp8: quiet conversion warnings when packing bits" 2019-05-02 19:04:47 +00:00
Angie Chiang 545c4d75d8 Merge "Fix the use of uninitialized value in qsort" 2019-05-02 18:44:13 +00:00
James Zern 8743cdd446 Merge "make vpx_debug_util.c inclusion conditional" 2019-05-01 19:51:41 +00:00
Johann d79ef289f1 vp8: quiet conversion warnings when packing bits
Mask the values to show that we only want to store 1 byte. Switch
to lowercase ff since it's more prevalent in the file.

BUG=webm:1615

Change-Id: Ia8ede79cb3a4a39c868198ae207d606e30cfb1cb
2019-05-01 12:11:46 -07:00
Jingning Han 7a4703e8a9 Rework the wiener variance buffer
Support the potential frame scaling use case. The operation flow
now allows the codec to allocate the memory buffer only when
perceptual AQ mode is enabled.

Change-Id: I7529e63131276dbe3a29f910d3a227f20dbc94a2
2019-05-01 03:40:03 +00:00
Jingning Han 37c8030a2a Deprecate stack_rank_buffer usage
This large buffer is no longer needed.

Change-Id: I9f2b3b28663d299649208f6172bba136103342ad
2019-05-01 03:39:43 +00:00
Jingning Han 1fa82a9117 Refactor perceptual aq control
Move the activation control to vpxenc interface using aq-mode.

Change-Id: Iae406d4f7e74bdc7bfd3b149f0811093454f879e
2019-05-01 03:39:31 +00:00
Jingning Han e5f2e06a20 Merge "Add PERCEPTUAL_AQ tag" 2019-05-01 03:37:04 +00:00
James Zern 204b72c98c make vpx_debug_util.c inclusion conditional
on CONFIG_BITSTREAM_DEBUG. this avoids an object file containing no
symbols which may cause warnings on some platforms.

Change-Id: I02af97d6970de949466c29f50d272733d97ee8d2
2019-04-30 17:45:07 -07:00
James Zern aef0b8808c Merge "vp8cx.h,vpxenc: add note about alt ref ranges" 2019-04-30 22:09:39 +00:00
Jingning Han bf30ae3ad6 Fix the use of uninitialized value in qsort
Search within the effective transform coefficient window.

Change-Id: If432eaab5ffca1cdfe57ee23052bf5dc60a2f893
2019-04-30 20:22:45 +00:00
Johann abb13c0d8e cast ambiguous _mm_set1_epiNN() constants
clang 7 integer sanitizer warns on unsigned->signed conversions when
the highest bit is 1.

BUG=webm:1615

Change-Id: I6381efaff9233254b40cb78f7bcf87090e0ad353
2019-04-30 12:44:51 -07:00
Johann Koenig 1d0dedf759 Merge "vp8: quiet conversion warning when packing sizes" 2019-04-30 16:46:59 +00:00
Deepa K G 9012ebc269 Merge "Refine active best quality of layered ARF frames" 2019-04-30 07:59:01 +00:00
Sai Deng 2c38592ec6 Merge "Call set_error_per_bit in SSIM rdmult update" 2019-04-29 23:38:30 +00:00
Johann 9a81785e42 vp8: quiet conversion warning when packing sizes
The values are or'd together and then stored 8 bits at a time:
9.1. Uncompressed Data Chunk
* 16 bits: (2 bits Horizontal Scale << 14) | Width (14 bits)
* 16 bits: (2 bits Vertical Scale << 14) | Height (14 bits)

BUG=webm:1615

Change-Id: Id2eb3deaccec299a0619990d3a6f1eb4f71e50e2
2019-04-29 16:15:50 -07:00
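The layout quoted in the commit above can be sketched as follows. This is a minimal Python model, not the libvpx code: the function name `pack_dimension` is hypothetical, and writing the low byte first is an assumption about byte order. The masks mirror the idea of making it explicit that only 8 bits are stored at a time.

```python
def pack_dimension(scale, size):
    """Pack a 2-bit scale and a 14-bit size into two bytes, low byte first."""
    value = ((scale & 0x3) << 14) | (size & 0x3fff)
    # Mask each stored piece to a single byte (byte order is an assumption).
    return bytes([value & 0xff, (value >> 8) & 0xff])


print(pack_dimension(0, 640).hex())
print(pack_dimension(1, 640).hex())
```

Because each piece is masked before it is stored, an integer sanitizer has no implicit narrowing to complain about.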
Johann Koenig 49794235cd Merge "vp8 quantize: silence conversion warning" 2019-04-29 22:53:22 +00:00
Johann Koenig 197ef67eeb Merge "vp8 quantize: use native abs/sign implementations" 2019-04-29 22:32:02 +00:00
James Zern ca18ac8e4f vp8cx.h,vpxenc: add note about alt ref ranges
BUG=webm:1597

Change-Id: I56345ec621a06dfe1eae7f205874f34bfb40e6e5
2019-04-29 15:23:34 -07:00
Angie Chiang 6171b7f342 Merge "Add bistream_debug tool" 2019-04-29 22:22:04 +00:00
Johann 1e01d13fb8 vp8 quantize: silence conversion warning
clang 7 integer sanitizer warns about storing any int16_t value
where the high bit is 1. Treated as an int, such number would
be positive. Treated as an int16_t, it is negative.

BUG=webm:1615

Change-Id: Idf655cd92d26b7c1180910159be3f64164577eca
2019-04-29 14:38:58 -07:00
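The distinction described in the commit above, the same bit pattern read as an `int` or as an `int16_t`, can be shown with a short Python sketch. The helper `as_int16` is hypothetical and only models the reinterpretation; it is not part of libvpx.

```python
import struct


def as_int16(value):
    """Reinterpret the low 16 bits of value as a signed 16-bit integer."""
    return struct.unpack("<h", struct.pack("<H", value & 0xffff))[0]


pattern = 0x8000  # high bit of the 16-bit value is 1

print(pattern)            # read as a plain int: positive
print(as_int16(pattern))  # read as int16_t: negative
```

Any value with the high bit set behaves this way, which is why the sanitizer flags the implicit conversion unless the intent is made explicit with a cast.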
Johann e3f12b520f vp8 quantize: use native abs/sign implementations
~4% improvement with a very rudimentary speed test

Change-Id: Iad8868327e3276dbead783a79849295b0e4b135c
2019-04-29 14:00:35 -07:00
Jingning Han 5a97d750b4 Add PERCEPTUAL_AQ tag
Refactor the perceptual AQ mode control.

Change-Id: I9c00c32139ec98fd6aebc1d5086e042730f3616f
2019-04-29 11:42:54 -07:00
Sai Deng dff4d376ea Merge "Refactor the SSIM based rdmult update function" 2019-04-29 18:35:59 +00:00
sdeng 524ee737ea Call set_error_per_bit in SSIM rdmult update
This CL improves objective metrics: (midres)
avg_psnr ovr_psnr ssim    ms_ssim
-0.149   -0.038   -0.108  -0.129

Change-Id: I21f3e478f81ead5a3bcce6041f32fbceb53828f3
2019-04-29 09:34:46 -07:00
sdeng e779b2f381 Refactor the SSIM based rdmult update function
Change-Id: I335103689659d9a2b291c4da54f07cdd9c2b1a6d
2019-04-29 09:31:13 -07:00
Paul Wilkins 2a26032bd2 Merge "Fix in key frame detection" 2019-04-29 13:50:37 +00:00
Angie Chiang 483b71e8e7 Add bistream_debug tool
Change-Id: I339899cff65c7ef563f9411f2d7af9a32a08a705
2019-04-26 16:58:50 -07:00
Marco Paniconi e50f4e4112 vp9-rtc: Adjust thresh for 4x4 tx selection
For screen content nonrd_pickmode: reduce
threshold to select 4x4 tx_size, under certain
conditions.

Change-Id: If68c30172272868033f0e3011e53c76b4e7c48b6
2019-04-25 18:44:33 -07:00
Marco Paniconi f836d8ba87 vp9-rtc: Fix int conversion error in nonrd_pickmode.
Change-Id: I1be775d8c11f530ff26121f1ffaf1dae100b2510
2019-04-25 09:47:23 -07:00
Marco Paniconi 8dc05779eb vp9-rtc: Pass source variance and mode to select tx
For nonrd-pickmode: pass the source variance and the
mode (intra/inter) to select tx_size, for better tuning.

Neutral change for video mode, speed 7.
Some quality improvement for screen content.

Change-Id: I53336f23fa4f14076aa1cdf8036e9af73c43060a
2019-04-24 14:37:53 -07:00
Marco Paniconi 44449db02f Merge "vp9-rtc: tx_size selection for intra mode in nonrd" 2019-04-24 06:15:25 +00:00
Angie Chiang d2d5174440 Merge changes I1af88144,I9eaf9563,I58c1bc0f,I8d173add
* changes:
  Remove mv_dist and mv_cost from TplDepStats
  Remove inter_cost_arr and recon_error_arr
  Remove RE_COMPUTE_MV_INCONSISTENCY
  Remove unused mv_[dist/cost]_sum
2019-04-23 23:36:50 +00:00
Sai Deng ed478a977c Merge "Revert "Add VPX_TUNE_SSIM and VPX_TUNE_PSNR enums"" 2019-04-23 23:11:54 +00:00
Sai Deng b4da0a527e Revert "Add VPX_TUNE_SSIM and VPX_TUNE_PSNR enums"
This reverts commit 1d8d8f562b.

Reason for revert: changing the API names will break existing code.

Original change's description:
> Add VPX_TUNE_SSIM and VPX_TUNE_PSNR enums
>
> Change-Id: I3df5af2c60b774e6d395062077542c52db868236

TBR=jingning@google.com,builds@webmproject.org,sdeng@google.com

Change-Id: Ic94c19739f595f4544e8b68892ab9d9c1bbccd79
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2019-04-23 23:02:55 +00:00
Marco Paniconi cdd40d1cd0 vp9-rtc: tx_size selection for intra mode in nonrd
In nonrd_pickmode for intra modes: add tx_size selection
based on Y prediction signal for the bsize.

The tx selection is done in model_rd, same as inter-modes.

Existing code for intra mode was first setting a tx_size based
only on the bsize, and then in some cases in block_yrd
(during the loop over bsize in units of tx_size) the tx_size
may be set again if model_rd is called in block_yrd.

This CL separates out the tx_size setting (based on Y channel
prediction via model_rd), and then block_yrd is called once
for whole bsize. This allows for better tuning of the tx
selection for intra modes in future change.

Adjust threshold in svc datarate test.

Negligible/neutral change in psnr/ssim metrics
for speed 7 and 8, 1 layer and SVC mode.

Change-Id: I33bc8447afdc3785482e13aac5c3636e13c59644
2019-04-23 15:18:29 -07:00
Sai Deng 9b57a02926 Merge "Add VPX_TUNE_SSIM and VPX_TUNE_PSNR enums" 2019-04-23 21:50:56 +00:00
Johann Koenig 9c64914990 Merge changes I6e837e6f,Ibec70e66
* changes:
  remove WIDE_REFERENCE definition
  remove ARCHITECTURE definition
2019-04-23 21:40:14 +00:00
sdeng 1d8d8f562b Add VPX_TUNE_SSIM and VPX_TUNE_PSNR enums
Change-Id: I3df5af2c60b774e6d395062077542c52db868236
2019-04-23 11:09:32 -07:00
Johann eb9d819ba1 remove WIDE_REFERENCE definition
The last usage was removed in 2011:
https://chromium.googlesource.com/webm/libvpx/+/cbf923b12cec2fe7ceea0b94091d64953e56b1fe%5E%21/#F33

Change-Id: I6e837e6f1e55eeea6bbeb3159ce6ddf861bcbd72
2019-04-23 10:02:49 -07:00
Johann 4bc723fabc remove ARCHITECTURE definition
In the distant past this was used to distinguish between
armv5/6/7 targets when building the assembly files. The
project has not supported armv5/6 for a long time.

BUG=webm:1623

Change-Id: Ibec70e6624b651df0fa6f882ab6f201dc73e92e2
2019-04-23 09:48:35 -07:00
sdeng 8cbea4a049 Add vpx_clear_system_state() in SSIM based rdmult adjustments
Change-Id: I2a0cdec3bfce864e975aaa408cfdcb855db8680f
2019-04-22 16:13:28 -07:00
Harish Mahendrakar cc55568fef [vp9] Fix handling of skip in row_mt=1
For row_mt=1, when mi->skip is set to 1 after parse based on
eobtotal for that partition, dqcoeff and eob need to be restored
as recon_partition doesn't increment these pointers for skip cases

Change-Id: I79711b0c175937aa6da3bba3b3bc053f91a8ce35
2019-04-22 15:55:24 +00:00
Harish Mahendrakar 9a5d6a9d9a Merge "test_vector_test: Add row-mt and lpf-opt tests for vp9 decoder" 2019-04-22 15:54:53 +00:00
Jingning Han 184af622dd Merge "Refine interval for key frame boost calculation" 2019-04-22 04:59:22 +00:00
Jingning Han 19d0153341 Merge "Fix issues with bits allocated and consumed" 2019-04-22 04:59:04 +00:00
Angie Chiang 814cf325d2 Remove mv_dist and mv_cost from TplDepStats
Change-Id: I1af8814449a187e900df9c930dc174f0832b0212
2019-04-21 15:41:58 -07:00
Angie Chiang a40e63f956 Remove inter_cost_arr and recon_error_arr
Change-Id: I9eaf9563f2ee92fcfbe38d0f5e36c82632af468f
2019-04-21 15:37:45 -07:00
Angie Chiang c579f9dab9 Remove RE_COMPUTE_MV_INCONSISTENCY
Change-Id: I58c1bc0f285271ccff163791d35c8c0c6cc8460b
2019-04-21 15:25:20 -07:00
Angie Chiang 21ac7decfa Remove unused mv_[dist/cost]_sum
Change-Id: I8d173add2d1fc599a7915a3c9668870f18a0c59f
2019-04-21 15:19:21 -07:00
Matt Oliver ac1c34083a project: Add VS2019 support. 2019-04-20 17:00:46 +10:00
Harish Mahendrakar 4747e9fd50 test_vector_test: Add row-mt and lpf-opt tests for vp9 decoder
BUG=webm:1619

Change-Id: I4e835a6375523da04a2c4febb2fb441a5f2d56c5
2019-04-19 23:59:40 +00:00
Harish Mahendrakar 4d56e0969e Merge "[CFI] Remove function pointer cast of row_decode_worker_hook" 2019-04-19 23:59:30 +00:00
Jerome Jiang 692789b42e Merge "vp9: Use model rd for large blocks in filter search." 2019-04-19 16:26:31 +00:00
Harish Mahendrakar 19564b9ddc [CFI] Remove function pointer cast of row_decode_worker_hook
This fixes CFI error flagged for this function when row-mt=1

Change-Id: Ic5b427a6b621228280ebe829d00b540b18e2c087
2019-04-19 15:43:20 +00:00
James Zern bd45c003bb Merge "Fix PSNRHVS computation" 2019-04-19 06:10:01 +00:00
Jerome Jiang 82b8a25194 vp9: Use model rd for large blocks in filter search.
Neutral on speed and quality.

Change-Id: Ia1d3929124bb57e31bbab516a994734d2fd51891
2019-04-18 18:09:30 -07:00
Marco Paniconi f96b7117b1 Merge "vp9-rtc: Use correct plane for UV in estimate_intra" 2019-04-19 00:58:16 +00:00
Marco Paniconi 12acbb1552 vp9-rtc: Use correct plane for UV in estimate_intra
For nonrd-pickmode: some PSNR increase observed on
screen content/scroll clips.

Change-Id: Idf1bce9dd434e33d7c35dbeb59e02e2e58ea1aaa
2019-04-18 15:43:46 -07:00
Marco Paniconi 6b1954ee9a vp9-rtc: Move setting for use_model_rd_large
Move the setting to just before the inter-mode loop,
as for screen content the value may change due
to reset of segment.

No change in behavior except for screen-content.

Change-Id: I256795b581ceda352e57b88eba2e86aa18b0fdc4
2019-04-18 10:01:05 -07:00
Deepa K G 66772d2b11 Fix issues with bits allocated and consumed
For show existing frames, set the variables
this_frame_target and projected_frame_size
correctly.

Change-Id: Id5f06eb4ac195f6b63c0199d9d761eaaaea79bbd
2019-04-18 07:39:23 +00:00
Deepa K G 1051e71ea6 Refine interval for key frame boost calculation
In the calculation of boost for key frames, increase number
of frames to be scanned based on the content nature.

Change-Id: Ia4533966a00055d0bec712e073d82d4bd1dc715a
2019-04-18 07:36:13 +00:00
Deepa K G 74cf247d31 Increase the active best quality in CQ
For boosted frames, active best quality is increased.

Change-Id: I282fbefaf16b4216f5d22d344f098e6a5766c4a5
2019-04-18 06:03:08 +00:00
Ravi Chaudhary 4bb51416e1 Refine active best quality of layered ARF frames
Change-Id: If630af68fc3793d579a947d5955c2001c0cf0a8d
2019-04-18 06:02:33 +00:00
Deepa K G 640735025e Fix in key frame detection
The frame next to scene cut frame does not usually have
a high second ref usage. Thus the sec ref usage of the
frame next to scene cut frame is tested against a
threshold for scene cut detection.

With this change scene cut detection is improved for
contents where genuine scene cuts were being missed.

Change-Id: I11190d848fa1c1dcd63aab81da799354371e2a30
2019-04-18 04:47:17 +00:00
James Zern e718cade1d Merge "Revert "Refactor tile boundary condition for intra prediction"" 2019-04-17 14:52:59 +00:00
Jerome Jiang 7ef20bf051 Merge "vp9: refactor condtions for model rd for large blocks." 2019-04-17 01:51:41 +00:00
Angie Chiang 6f594bb8c8 Merge "Refine vp9_kmeans()" 2019-04-17 01:49:56 +00:00
James Zern e232dc86e5 Revert "Refactor tile boundary condition for intra prediction"
This reverts commit 14208ab41e.

This causes test vectors failures with --row-mt=1.

BUG=webm:1617

Change-Id: Icb14bbbb6f38608a73dde0370ad874c0b1b0af8a
2019-04-16 18:18:34 -07:00
Jerome Jiang a9d70ee2fa vp9: refactor condtions for model rd for large blocks.
Change-Id: If474273642b68e29abacb7b87cbb6e3c91bb93a4
2019-04-16 15:37:47 -07:00
Marco Paniconi d310225c19 Merge "vp9-rtc: Add speed feature to force SMOOTH filter" 2019-04-16 21:44:34 +00:00
Angie Chiang da373475ed Refine vp9_kmeans()
Reduce the number of group_idx initialization.
Initialize the center to the median of the data group.

Change-Id: Ie16150610480bf54a6b5e2bc048ba1e940bef10f
2019-04-16 12:07:45 -07:00
Marco Paniconi 972d31480f vp9-rtc: Add speed feature to force SMOOTH filter
Add speed feature for real-time to always force
SMOOTH filter for subpel motion. Can be useful in some
cases for noisy content or high motion at low bitrate.
Also some speedup in avoiding the checking of two filters.

Keep it off always for now.

Change-Id: I843d79aaddef75f9c6ded60906cc75c279a6e37a
2019-04-16 10:19:51 -07:00
Yaowu Xu 6382a95fee Fix PSNRHVS computation
Cherry-pick libaom #1362e15990eccab8101305be418528824fb32245 to fix
PSNRHVS calculation.

BUG=webm:1620

Change-Id: Ife7bd8056fcaed06992ad76e8c1ab734c3956ad7
2019-04-16 10:11:52 -07:00
Jingning Han 16ddf82dc0 Merge "Use uniform sampling as initial centers for k-means" 2019-04-16 16:15:52 +00:00
sdeng da5be113f3 Fix libvpx__nightly_optimization-levels - Build Failing
This CL removes the extra floating math in tune=psnr, I will add
clear_system_state calls in tune=ssim in the next cl.

Change-Id: I7cdd4854b2b8e7e7f872f097c5535f10c80cfe0d
2019-04-15 19:36:47 -07:00
James Zern a708bf5e0f Merge "loop_filter_rows_mt: unify worker count calculation" 2019-04-14 00:21:50 +00:00
sdeng c88b592dc7 Add Tune for SSIM for high bitdepth encoding
midres_bd10 test results:
avg_psnr  ssim     ms_ssim
3.189     -4.083   -5.258

Change-Id: I9faccc02f34692fc304d82241390f92267f5a72c
2019-04-13 11:02:02 -07:00
Sai Deng a9b722e00c Merge "Add Tune for SSIM" 2019-04-13 17:47:46 +00:00
James Zern e8b2750904 loop_filter_rows_mt: unify worker count calculation
fixes a deadlock with an odd number of threads that go from < number of
tiles to >. the previous calculations were out of sync so going from
e.g., 8 tiles to 2 with 3 threads would result in scheduling only 2
workers, but thread_loop_filter_rows() would expect 3.

BUG=webm:1618

Change-Id: I78c967a8c3c927d929e13c949808a5ef443ebacb
2019-04-12 23:50:08 -07:00
Jingning Han 4d14b55ee7 Use uniform sampling as initial centers for k-means
The Wiener variance output has been sorted prior to the clustering,
which allows uniform sampling to be used directly for the initial
center points. It avoids empty cluster situations when the samples
are heavily distributed at the two far ends and leave the middle empty.

Change-Id: I159fbfa6bbb4aafd19411fd005666d144cca30fc
2019-04-12 15:59:51 -07:00
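One way to picture the initialization described in the commit above is the Python sketch below. It is an assumption-laden illustration, not the libvpx implementation: the function name `uniform_initial_centers` and the exact index rule (the midpoint of each of `k` equal index ranges) are hypothetical. The input must be sorted and non-empty.

```python
def uniform_initial_centers(sorted_values, k):
    """Pick k initial centers at evenly spaced positions of a sorted list."""
    n = len(sorted_values)
    # Take the midpoint of each of k equal-width index ranges (an assumed rule).
    return [sorted_values[(2 * i + 1) * n // (2 * k)] for i in range(k)]


# Samples piled up at the two far ends with nothing in the middle.
samples = [1, 1, 2, 2, 90, 95, 98, 99]
print(uniform_initial_centers(samples, 4))
```

Because every center is taken from an actual sample position, no center starts out in the empty middle of the range, which is the situation the commit message says leads to empty clusters.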
sdeng 7dedf4cbc7 Add Tune for SSIM
Implementation with some tuning of the paper:
C. Yeo, H. L. Tan, and Y. H. Tan, "On rate distortion optimization using
SSIM," Circuits and Systems for Video Technology, IEEE Transactions on,
vol. 23, no. 7, pp. 1170-1181, 2013.

Test results:
           avg_psnr      ssim      ms-ssim
lowres      2.516       -2.622     -2.450
midres      2.312       -3.062     -3.882
hdres       2.292       -4.293     -5.246

The encoding time is about the same as the baseline.

Change-Id: Ida2c380ade79b6c15cf12b88bf090069da8765d8
2019-04-12 15:02:05 -07:00
Jingning Han b758ba795a Use qsort to find median value
The list is short enough to use qsort.

Change-Id: I5bb1f2c43eec508bafaf4d1ad7c8a92441f066ce
2019-04-12 10:57:56 -07:00
James Zern 1fa6e2912c Merge "update libwebm to libwebm-1.0.0.27-358-gdbf1d10" 2019-04-12 05:58:12 +00:00
kyslov 4ba3098ecb Fix static analysis warnings
With switching to clang-7.0.1 we got new warnings. With this change the
warnings are back to 0 for all configurations (excluding warnings in
third_party)

BUG=webm:1616

Change-Id: I25ceb592c425394e8f14d333fb5680144f892213
2019-04-11 17:37:15 -07:00
Jerome Jiang e8bfbf5317 Merge "vp9 svc test: test KSVC and other inter layer pred mode." 2019-04-11 20:00:55 +00:00
Jerome Jiang d9c1496bff vp9 svc test: test KSVC and other inter layer pred mode.
Change-Id: I6214eb63737f67bf41753f0705047e0682f3dc70
2019-04-11 11:57:52 -07:00
Angie Chiang 5bcc0017df Merge "Use log-based sse in eval_mv_mode" 2019-04-11 17:25:40 +00:00
Marco Paniconi c46694c1d9 vp9-rtc: Fix to re-eval zero-mv for denoising
Change-Id: I3bb0646661efa06c8d1d688c746e41855c99f408
2019-04-10 21:39:19 -07:00
Angie Chiang b0ab51dc2b Merge changes I9d315e35,Id48f2b65,I4e5ed327
* changes:
  Print mv_mode counts
  Adjust the probs in get_mv_mode_cost
  Add build_inter_mode_cost()
2019-04-11 01:31:41 +00:00
Angie Chiang 7f4387f7f7 Use log-based sse in eval_mv_mode
Up to this point, non_greedy_mv's BDRate change
lowres -0.397%
midres -0.776%
 hdres -0.637%

Change-Id: I5eb2f03d067d172350dad6ba9a9f4dffef5143cd
2019-04-10 18:11:20 -07:00
Jerome Jiang 1669bde761 Merge "vp9 rtc: change PSNR thresh." 2019-04-10 21:13:56 +00:00
Jerome Jiang ed19362e77 Merge "Revert "Disable mismatch check on vp9 svc examples."" 2019-04-10 21:13:46 +00:00
James Zern 34d54b04e9 update libwebm to libwebm-1.0.0.27-358-gdbf1d10
changelog:
https://chromium.googlesource.com/webm/libwebm/+log/libwebm-1.0.0.27-351-g9f23fbc..libwebm-1.0.0.27-358-gdbf1d10

Change-Id: I28a6b3ae02a53fb1f2029eee11e9449afb94c8e3
2019-04-10 19:08:56 +00:00
Jerome Jiang 1b9c527c62 Revert "Disable mismatch check on vp9 svc examples."
This reverts commit a1857812ea.

Change-Id: Ib33f49af7631c9a6917539a58c447624df325f7f
2019-04-10 10:48:03 -07:00
Jerome Jiang d310d2fc8a vp9 rtc: change PSNR thresh.
Change-Id: I07ccc48c76d9871ae01b56ce432f9a6661fb47b9
2019-04-10 10:04:15 -07:00
Marco Paniconi c1b4e5290e vp9-rtc: Adjust cb_pred_filter_search on speed & resoln
Avoid some increase in encode time for higher resoln.

Change-Id: I2b3b745f914f986df18fcde570cdc5bc99806f97
2019-04-10 09:48:53 -07:00
Marco Paniconi 1032aa98de vp9-rtc: Speed feature for filter_search in nonrd_pickmode.
Use chessboard search only for certain speeds/resolns
(speed >= 8) for real-time speed features.

Disable chessboard search for speeds <= 7.
~2.5 gain on rtc set for speed 7.
~1% slowdown.

Change-Id: Ic6898aa475817e128154f691413c73f65306e2a8
2019-04-09 20:57:25 -07:00
Marco Paniconi ef30616a30 Merge "vp9-rtc: Fix to rate cost for switchable filter" 2019-04-10 03:56:04 +00:00
Marco Paniconi da3b24c832 vp9-rtc: Fix to rate cost for switchable filter
Add consistent switchable rate cost, which should be only
when non-integer motion mode is tested.

Neutral/negligible change in metrics.

Also disable the re-evaluation of ZEROMV mode after denoising
feature, as this rate cost fix exposed an existing issue
with this feature.

Change-Id: I9e5479281810a392b9a409e238c564b2def8e546
2019-04-09 20:01:57 -07:00
Angie Chiang 62d33725ec Print mv_mode counts
Change-Id: I9d315e359e384dc0295c3471d8179bd828fddc1b
2019-04-09 17:28:08 -07:00
Angie Chiang 020d1ea32f Adjust the probs in get_mv_mode_cost
midres's performance is improved by 0.08%
hdres's performance is improved by 0.04%

Change-Id: Id48f2b654d8ae1909fcb6d21eda8bfb69087a18a
2019-04-09 15:13:27 -07:00
Angie Chiang c71cbb8f28 Add build_inter_mode_cost()
Change-Id: I4e5ed327e39cef1dd8c1a8f55fcbe90a5181814d
2019-04-09 15:13:27 -07:00
“Michael 32d633ba17 Add fast-adapt mechanism for VNR
Change-Id: Ia1d9cde418cc981ee08dc94a2e375d6cb4542250
2019-04-09 16:30:32 -05:00
Marco Paniconi b2f29a9a71 vp9-rtc: Fix to active_best for non-SVC
For 1 pass CBR non-SVC encoding, on golden refresh:
condition lower/boosted active_best_quality setting
only if gf_cbr_boost_pct is set.

Reduces overshoot for hard clips.

Neutral change on rtc metrics.

Change-Id: I10f7e27767a3f80d63958a7e137155f7bc20504b
2019-04-09 10:06:20 -07:00
Paul Wilkins af3630eef4 Merge "Film Mode: Bias based on sub block energy." 2019-04-09 13:37:33 +00:00
Paul Wilkins e6ad1e4681 Merge "Film mode: Raise threshold for intra modes" 2019-04-09 13:37:09 +00:00
Paul Wilkins 8cf656e3f2 Film Mode: Bias based on sub block energy.
Apply a bias for film mode against intra coding
(especially DC_PRED) and compound modes if the sub
blocks of the current block have significantly different
variance in the source.

Change-Id: Iac1fc0510141be5c472a0ec57567bab3d2fc4164
2019-04-08 11:11:10 +01:00
Jingning Han 6fd9a17c92 Merge "Allow macroblock_plane to have its own rounding buffer" 2019-04-06 16:36:32 +00:00
sdeng 287d1d64ea tiny_ssim: Fix an 'Uninitialized argument value' bug
found by clang-7.0.1 static analysis.
BUG=webm:1616

Change-Id: I2f7d1376e82e35227ad96d34417014ce5680ad96
2019-04-06 02:55:06 +00:00
Sai Deng ed595c406a Merge "tiny_ssim: Fix an 'Uninitialized argument value' bug" 2019-04-06 02:54:25 +00:00
James Zern e8822a6459 Merge "vp8cx.h: s/VP8E_(SET_MAX_INTER_BITRATE_PCT)/VP9E_$1/" 2019-04-06 00:32:46 +00:00
James Zern e9499074d3 Merge "svc_encodeframe: check strdup return" 2019-04-06 00:32:11 +00:00
James Zern e1a7f8456b svc_encodeframe: check strdup return
BUG=webm:1616

Change-Id: Ic9de589154485ad2de30b0b044991e1f9b852d74
2019-04-05 15:03:35 -07:00
sdeng 39ce1c0ab1 tiny_ssim: Fix an 'Uninitialized argument value' bug
found by clang-7.0.1 static analysis.
BUG=webm:1616

Change-Id: I7fb318aa7d4c8dd0a96bb20c6f8706ca1a632696
2019-04-05 21:51:41 +00:00
James Zern 1545bbcccb vp8_rd_pick_inter_mode: clear static analysis warning
uv_intra_rate is undefined by default; it is safe to use if
uv_intra_done is true.

BUG=webm:1616

Change-Id: I02e5f6c9e5cc6ed0b41619b4a59e55ea398bad41
2019-04-04 20:35:09 -07:00
Frank Galligan 65e5ba89b3 Add a test to test rollover of int64 in encoder interface.
The current libvpx encoder interface can potentially rollover an int64_t
value used to calculate the current timestamp. If the timebase was set
to microseconds and first timestamp was 0, then the rollover would
occur in about 10.675 days.

BUG=webm:701

Change-Id: I8d5aab46f8dcf250c1d4d43d5f3d27363c19cd54
2019-04-04 17:53:42 +00:00
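
The 10.675-day figure in the commit above can be reproduced with a little arithmetic. The sketch below assumes the encoder scales each timestamp by an internal rate of 10,000,000 ticks per second before dividing by the timebase denominator, so that the intermediate product is what overflows. That constant and the function name are assumptions for illustration; they are not taken from the commit.

```python
INT64_MAX = 2**63 - 1
# Assumption: internal tick rate used when converting timestamps.
ASSUMED_TICKS_PER_SEC = 10_000_000


def days_until_rollover(timebase_num, timebase_den,
                        ticks_per_sec=ASSUMED_TICKS_PER_SEC):
    """Days of stream time before timestamp * num * ticks_per_sec exceeds int64."""
    # Largest timestamp (in timebase units) whose scaled product still fits.
    largest_timestamp = INT64_MAX // (timebase_num * ticks_per_sec)
    # Convert that timestamp from timebase units to seconds, then to days.
    seconds = largest_timestamp * timebase_num / timebase_den
    return seconds / 86_400


# Microsecond timebase (1/1000000), first timestamp 0.
print(round(days_until_rollover(1, 1_000_000), 3))  # 10.675
```

Under the same assumption, a coarser timebase such as 1/1000 pushes the rollover out by a factor of 1000.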
Jerome Jiang 4117995a8e Merge "vp9 svc: point ref to last when not used in KSVC." 2019-04-04 16:53:23 +00:00
Jingning Han 9dcc57e4a5 Bypass skip check in tune for sharpness mode
The sharpness mode is enabled for hvc visual quality. Bypass the
skip block check that could potentially force all zero block in
sharpness mode. This resolves the patchy blockiness issue raised
in the 4K SDR HVC encode.

Change-Id: I0538a1b774b80c6b0899c921e80edecd4a440d5c
2019-04-03 21:08:27 -07:00
James Zern ddbbba41d9 vp8cx.h: s/VP8E_(SET_MAX_INTER_BITRATE_PCT)/VP9E_$1/
this was renamed in:
268f10669 Provide information on codec controls
but the corresponding type checked control call was missed.

Change-Id: I151cb42516b10e551b31273327de4ec1bac3c81b
2019-04-03 20:06:04 -07:00
Jerome Jiang 7b90a88f54 vp9 svc: point ref to last when not used in KSVC.
Change-Id: Ia648a2221890dae5357ec8e64a8431fb0f77f2fc
2019-04-03 15:49:58 -07:00
Michael Horowitz dd2942e80f Merge "Histogram-based noise estimation algorithm" 2019-04-03 22:18:55 +00:00
Jerome Jiang d73a5a437c Merge "use 64bit integer for memory offset." 2019-04-03 21:27:48 +00:00
Michael Horowitz f698ea61a7 Histogram-based noise estimation algorithm
The histogram-based noise estimation algorithm leverages the fact that low-noise sequences
tend to populate lower-valued histogram bins and high-noise sequences tend to
populate higher-valued histogram bins in a predictable/repeatable manner. The
algorithm compensates for histogram flattening and skewing toward zero as the
scene darkens.

Change-Id: Ia5acb611f0cc6d726280bd5ea5f45d42ff0dc2dd
2019-04-03 15:55:44 -05:00
Marco Paniconi 39ea3d72f5 Merge "vp9-rtc: Move noise estimation to after scene change detection" 2019-04-02 18:01:26 +00:00
Jerome Jiang 8f95c82653 use 64bit integer for memory offset.
Change-Id: I3d27286202e26ceecf4e551732b7d536d224d920
2019-04-02 10:06:20 -07:00
Marco Paniconi ce9e52b13c vp9-rtc: Move noise estimation to after scene change detection
This allows using the result from scene change detection to exclude
the current frame from noise estimation analysis if the frame has a
scene/big content change (i.e., high_source_sad flag is set).

The behavior change for noise estimation may be small in practice,
since in the current code, a scene change would have blocks excluded
due to thresh_sum_diff, and the subsequent frames would also be mostly
excluded due to (past) non-zero motion vectors (until the
consec_zeromv > thresh_consec_zeromv is satisfied again).
But it's better to completely exclude the current frame if it's a scene change.

Change-Id: Icd08bab7a8e1b994c7accced89697e0b2d7f50c5
2019-04-02 09:14:17 -07:00
Marco Paniconi f67b2b8929 vp9-rtc: Speed feature changes for speed 9.
Add threshold multiplier for variance partitioning
as speed feature, and increase it by 2x for speed >= 9
for resoln >= VGA. Also only allow simple_interpol
filter when avg_low_motion is below threshold.

Better tradeoff of speed/quality compared to speed 8.

Change-Id: I6bd29ad3cced470b32d04f60771120531112a5d9
2019-04-02 09:13:23 -07:00
Paul Wilkins 4d1637ff67 Film mode: Raise threshold for intra modes
Substantially increase the threshold for applying variance
adjustment in rd_variance_adjustment() for intra modes
only, especially for DC_PRED.

Change-Id: Idb3f0c5aca5ab58c9b79c3e993247719054d79c9
2019-04-02 16:32:41 +01:00
Paul Wilkins 1c0857d648 Merge changes Ied3992f4,I83ca669b,I8f1c1a29
* changes:
  Further Adjustments to film mode bias.
  Add GF group noise weighting in rd_variance_adjustment()
  Adjustment to low variance block bias in rd_variance_adjustment()
2019-04-02 12:00:03 +00:00
Paul Wilkins 68fbf82ef0 Merge "Change to thresholding in rd_variance_adjustment()" 2019-04-02 09:39:59 +00:00
Jingning Han 5359ae810c Allow macroblock_plane to have its own rounding buffer
Add 8 bytes buffer to macroblock_plane to support rounding factor.

Change-Id: I3751689e4449c0caea28d3acf6cd17d7f39508ed
2019-04-01 23:06:07 +00:00
James Zern e59f78b484 tool/set_analyzer_env: disable implicit-integer-truncation
with clang-7 this causes additional warnings in x86 intrinsics and
elsewhere. disabling for now to unblock new changes.

BUG=webm:1615

Change-Id: Ide9cacee5547ed432f980f6804e1414f32639121
2019-04-01 12:40:53 -07:00
Paul Wilkins efab8f3d2e Merge "Use reconstructed variance in rd_variance_adjustment" 2019-04-01 11:57:13 +00:00
James Zern 46923558df Merge "update .clang-format for version clang-7.0.1 update." 2019-03-29 18:59:40 +00:00
Hien Ho 073b326565 update .clang-format for version clang-7.0.1 update.
added files that are affected by clang-format version 7.

BUG=b/120815481

Change-Id: I40662ce962e4f4b1fcdf183b700f85cc5c0f9f82
2019-03-29 18:25:26 +00:00
Jerome Jiang ecae7f8f81 Merge "Revert "Wrap macro definition in do-while(0)"" 2019-03-28 18:00:57 +00:00
Jerome Jiang 26ef25a364 Revert "Wrap macro definition in do-while(0)"
This reverts commit aa04b6f9a7.

It caused big regression on webrtc VP8 tests.

Change-Id: I937e769d133abeca62ba063e59a58b5c461f5b5e
2019-03-28 10:05:11 -07:00
Jerome Jiang bb117189e0 Merge "Disable mismatch check on vp9 svc examples." 2019-03-28 16:26:49 +00:00
Paul Wilkins 0ef5016878 Further Adjustments to film mode bias.
Stronger bias against variance below source than above.

Change-Id: Ied3992f4204e14433c6841d51c192118be954f0a
2019-03-28 14:09:04 +00:00
Paul Wilkins 247a28c280 Add GF group noise weighting in rd_variance_adjustment()
For film mode add a weighting to the thresholds used
in rd_variance_adjustment() based on noise measured in the
first pass.

Change-Id: I83ca669bb55aa52f1d34f03a2268b79fba890770
2019-03-28 13:39:54 +00:00
Paul Wilkins d6cd3b7ec0 Adjustment to low variance block bias in rd_variance_adjustment()
Adjust the extra bias applied for very low variance blocks to focus
mainly on DC_PRED.

Change-Id: I8f1c1a29932f319535807046846b604b5b8827c1
2019-03-28 13:30:11 +00:00
Paul Wilkins dc6e6fbdcc Change to thresholding in rd_variance_adjustment()
Always test thresholds using a scaled block variance value.

Source pixel variance no longer used so delete it as a parameter
to the function

Change-Id: I9e251edac6ebb15da98e40dcfa43333fe8b6ba55
2019-03-28 12:46:03 +00:00
sdeng 19941fcc11 Use reconstructed variance in rd_variance_adjustment
to replace the variance from .dst which is the prediction buffer in
inter mode.  Only enable it in tune-content-film mode at the moment.

Change-Id: I647b4a524a0849fda42541887ebc34091f152073
2019-03-28 10:59:28 +00:00
Paul Wilkins 6753efd235 Merge "Allow more Intra choices in film mode." 2019-03-28 10:49:55 +00:00
Jerome Jiang a1857812ea Disable mismatch check on vp9 svc examples.
Change-Id: I49902a750758ba0ffe733be9b1efd0cdea44f936
2019-03-27 17:02:44 -07:00
Marco Paniconi e031f49f89 vp9-rtc: Fix noise estimation thresh for 1080p
Use the same thresholds as for 720p for now,
leads to better noise estimation on test clips.

Change-Id: I55e11346a747fe149f521315a38d75e28b3e774e
2019-03-27 14:21:35 -07:00
Paul Wilkins b09f628df8 Allow more Intra choices in film mode.
Disable part of a speed feature that blocks all intra modes
except DC_PRED when the source variance is low.

Change-Id: I2956951fd05933a39f7225d4dfe14e019410fee3
2019-03-27 18:21:29 +00:00
Marco Paniconi c09e95fc7d vp9-rtc: Disable cyclic refresh under some conditions
cyclic refresh does not work for speeds <= 4, so disable
it for this case. And dynamically disable it when
average_qp is close to MAXQ (only for non-svc), to improve
quality/rate control at very low bitrates.

Change-Id: I447be43aef0fbb80f4a30d81e11658b58744eae5
2019-03-26 11:01:30 -07:00
Jingning Han cf728f8913 Merge "Remove deprecated code for vp9_fdct8x8_quant()" 2019-03-25 23:50:01 +00:00
Jingning Han d14306c16c Merge "Unify the transform and quantization process" 2019-03-25 23:49:52 +00:00
Jingning Han 873e38b72d Remove deprecated code for vp9_fdct8x8_quant()
Change-Id: If146bbf24f446f71be9147402e6d30533eee99d1
2019-03-25 15:06:57 -07:00
Jingning Han 2c9d78f0b0 Unify the transform and quantization process
Unify the transform and quantization process for 4x4 - 16x16
transform block sizes. This doesn't affect the encoding speed
visibly. Remove it to reduce the maintenance load.

Change-Id: Ifbf20bf8554ecf7970a6279a2b783b1c58fac6e4
2019-03-25 14:57:15 -07:00
Jerome Jiang 0d2299c1ee Change PSNR threshold for high bitdepth.
BUG=webm:1609

Change-Id: I1aa18d58c20532f657059a2df3646fad1625e3ae
2019-03-25 13:42:31 -07:00
Jerome Jiang d6e40b68bc Use high bd path to subtract blocks in hbd build.
BUG=webm:1609

Change-Id: Ifc15d616e7cfb247b399def64ef7691589d90075
2019-03-24 20:13:49 -07:00
Jingning Han d8dbd853e9 Merge "Use most common segment index for 64x64 level decision" 2019-03-22 05:01:57 +00:00
Jingning Han 46c96ee9c7 Merge "Add wiener variance log function" 2019-03-22 05:01:51 +00:00
Jingning Han e9ad53cd84 Merge "Refactor Wiener variance based segmentation" 2019-03-22 05:01:43 +00:00
Jerome Jiang 4bfa583c6e fix redundant cast in examples.
Change-Id: I84280de82053f9056bda9d813baa6165ca9bcd1e
2019-03-21 19:04:38 -07:00
Jingning Han 15e3a1f176 Use most common segment index for 64x64 level decision
Find the most common segment index among all 16x16 blocks in a
64x64 block and use that as the 64x64 block level decision.

Change-Id: I67e85869d9fee0fc05450928f1eeaebe511cab6a
2019-03-21 22:44:32 +00:00
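
The 64x64 decision described in the commit above amounts to taking the mode of the 16x16 segment indices. A minimal sketch follows, assuming the sixteen per-block indices are already available as a list; the function name and the tie-break toward the lowest index are assumptions, since the commit does not say how ties are resolved.

```python
from collections import Counter


def sb64_segment_index(block16_segments):
    """Return the most common segment index among the 16x16 blocks.

    block16_segments: non-empty list of segment indices, one per 16x16 block.
    Ties are broken toward the lowest index (an assumption, not from the commit).
    """
    counts = Counter(block16_segments)
    best_count = max(counts.values())
    return min(seg for seg, n in counts.items() if n == best_count)


# Sixteen 16x16 blocks inside one 64x64 block.
print(sb64_segment_index([0] * 5 + [3] * 8 + [1] * 3))  # 3
```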
Jingning Han 27bec1c761 Add wiener variance log function
Factor common mapping function from wiener variance to its log
form.

Change-Id: I25c955c8a3e25b9af1d65a0f0c3f695547c13453
2019-03-21 15:44:00 -07:00
Jerome Jiang dc3b508f57 Merge "vp9 postproc neon: Remove the condition on mb cols." 2019-03-21 22:24:20 +00:00
Jingning Han c5f4a315a0 Refactor Wiener variance based segmentation
Separate the k-means clustering stage and the segmentation parse
stage to save unnecessary steps in a common function.

Change-Id: I60083e1d970e744f9a64112f856892d450f86669
2019-03-21 15:07:16 -07:00
Jerome Jiang f4828c1512 vp9 postproc neon: Remove the condition on mb cols.
VP8 and VP9 have different padding on buffer stride.
VP8 macroblock is 16x16 so the buffer stride needs to be divisible by
16. Thus UV buffer stride is divisible by 8.
VP9 macroblock is 8x8 so the buffer stride is only extended to be
divisible by 8. Then UV buffer stride isn't divisible by 8.

Change-Id: I6fa953feb951f2fb2e48f72a623786b85e23822f
2019-03-21 14:17:15 -07:00
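
The stride argument in the commit above can be checked with a small calculation. This is a simplified sketch that assumes 4:2:0 subsampling (UV stride is half the luma stride) and ignores any border padding the real frame buffers add; the helper names are hypothetical.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def uv_stride(luma_width, alignment):
    """UV stride for a luma row padded to `alignment`, assuming 4:2:0."""
    return align_up(luma_width, alignment) // 2


width = 40
print(uv_stride(width, 16) % 8)  # 0: 16-aligned luma gives a UV stride divisible by 8
print(uv_stride(width, 8) % 8)   # 4: 8-aligned luma does not guarantee that
```

With 8-pixel alignment the UV stride is only guaranteed to be a multiple of 4, which is why code that assumes a multiple of 8 cannot be shared unconditionally.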
Jerome Jiang 674b2cb40c Merge "vp9: remove condition on high bitdepth using simple block yrd." 2019-03-21 19:49:51 +00:00
Marco Paniconi 80ea2788b1 vp9-screen: Adjust speed features at speed 8
Keep loopfilter on, and use half-pel instead of full.
This reduces big quality gap between the speed 8 and 7,
but still keeps speed 8 about 30-40% faster than speed 7.
Tested on screenshare clips with scroll and slide changes.

Change-Id: Id63b44f59655f3e3dc1b49d89291d97e7323081a
2019-03-21 10:39:06 -07:00
Marco Paniconi 5cde05fda8 Merge "vp9-rtc: Fixes for low-resoln" 2019-03-20 23:06:05 +00:00
Jingning Han 0670c7ad8f Merge "Properly reset memory in hbd setting" 2019-03-20 21:31:21 +00:00
Jingning Han 4161a55f7f Merge "Drop Wiener variance diff scale from 8 to 4" 2019-03-20 21:31:12 +00:00
Jingning Han 3b6b98e201 Merge "Enable all the 8 segments by default for perceptual AQ mode" 2019-03-20 21:31:03 +00:00
Marco Paniconi 840dc6966d vp9-rtc: Fixes for low-resoln
The force smooth_filter should only be used
for noisy content, so for now keep it off and
add TODO. Also fix/adjust low-resoln condition
and threshold in cyclic refresh.

Change-Id: I6c456dc9f23daabba20badd65a2f7ee6c5e259c4
2019-03-20 13:42:20 -07:00
Jingning Han 3e2c6e2a63 Properly reset memory in hbd setting
This avoids a segmentation failure issue in high bit-depth case.

Change-Id: I9fbb3ec24b1735678f110cb084a29b15e3ec1a12
2019-03-20 13:19:30 -07:00
Jingning Han 585b5dd817 Drop Wiener variance diff scale from 8 to 4
Change-Id: If6df62069c1b30de4d52aba33c20c6e854524905
2019-03-20 11:14:57 -07:00
Jingning Han c6eb185331 Enable all the 8 segments by default for perceptual AQ mode
Change-Id: I8999ee74216785c22568a09bce7590c9fc6905c1
2019-03-20 11:13:39 -07:00
Jerome Jiang 1492ccbe6e Merge "Wrap macro definition in do-while(0)" 2019-03-20 17:27:36 +00:00
Paul Wilkins 67556170bf Merge "Add block size scaling in rd_variance_adjustment()" 2019-03-20 11:41:39 +00:00
Paul Wilkins 5a2b1962a4 Merge "Changes to rd_variance_adjustment()" 2019-03-20 11:41:17 +00:00
Paul Wilkins 06033be020 Merge "Change to mode early breakout rules for FILM mode." 2019-03-20 11:41:01 +00:00
Paul Wilkins 7a298d0b03 Merge "Store a group level noise metric from first pass." 2019-03-20 11:40:46 +00:00
Marco Paniconi 3e2b4b3d48 Merge "vp9-rtc: Adjustments for base layer in screen." 2019-03-20 03:23:51 +00:00
Jerome Jiang aa04b6f9a7 Wrap macro definition in do-while(0)
Change-Id: Id654a48d2fa40355552d7267e58461e6cc1c6998
2019-03-19 17:49:41 -07:00
Marco Paniconi 28881dcfda vp9-rtc: Adjustments for base layer in screen.
On scene/content changes for base layer of screen:
reduce 32x32 split threshold, bias rdcost for flat
blocks if sse_y is non-zero, and avoid early exit on
intra-check.

Reduce artifacts in scroll content.

Change-Id: I144357a61462351173af900e0b8a47dac4aad6ca
2019-03-19 17:33:56 -07:00
Jerome Jiang 633cd97ee9 vp9: remove condition on high bitdepth using simple block yrd.
Speed is similar between non HBD vs 8bit with HBD build.

BUG=webm:1541

Change-Id: I8b5f7eff87ec7dc4710d31744155a60e50b0f0a9
2019-03-19 16:06:23 -07:00
Angie Chiang 60f76593e3 Merge "Compute count_ls in vp9_kmeans" 2019-03-19 18:41:46 +00:00
Jingning Han a7b6c4503a Normalize the Wiener variance for ranking
Normalize the Wiener variance calculation for stack ranking. Remove
potential dependency on blocks at frame boundary.

Change-Id: I37e8634d714a1c34e99f9f7c4f1bb6ea81d56112
2019-03-18 15:35:17 -07:00
Jingning Han 9e2c09cc2a Add perceptual AQ mode control to RD search
Make the rate-distortion optimization search support the perceptual
quality AQ mode.

Change-Id: Iee507ccfda90ac39b3623de705f187b1459e57e1
2019-03-18 15:32:43 -07:00
Jingning Han 8af27dcd90 Merge "Add rdmult adjustment for perceptual AQ mode" 2019-03-18 22:30:17 +00:00
Angie Chiang 3e52b96753 Compute count_ls in vp9_kmeans
Change-Id: Id5a83554f2037b03a3a7d86f83e47cac311fbe1d
2019-03-18 15:19:15 -07:00
Angie Chiang 34576dab60 Merge "Fix race condition in wiener_var_rd_mult" 2019-03-18 19:39:04 +00:00
Jingning Han 8c52776ec1 Add rdmult adjustment for perceptual AQ mode
Compute the Lagrangian multiplier for the adaptive quantization
settings.

Change-Id: Ieebe074d6f8163e7541264cb0ead22432273e338
2019-03-18 10:11:01 -07:00
Jingning Han d2c9d92ce5 Merge "Improve key frame detection" 2019-03-18 14:11:18 +00:00
Paul Wilkins e458ca3d52 Improve key frame detection
Improve detection of key frames especially in low contrast
and low motion regions.

This patch adds a function to the key frame detection to test
for specific patterns in the intra signal in the first pass stats
that tend to be indicative of a key frame.

This is intended to complement the existing code and finds some
scene cuts that were previously being missed.

Tested on two clips where the existing code was struggling to
identify the key frames; this patch improved detection as follows.

Film Clip 1 (detected / actual):
Old 2/5, New 5/5

Film Clip 2:
Old 4/11 and 1 false +, New 7/11 and 1 false +

Short 4K Film Scene:
Old 1/2, New 2/2

In testing so far I have not seen many extra false +'s though
it is likely that there will be some cases and this may need
further tweaking.

On one of our longer form film test reels (~20k frames)
the change picked up around 35 key frames that were
previously missed, mainly in darker scenes. There were a few
extra (or different) false positives caused by bright flashes or
explosions but these were cases where there was little
difference between inter and intra coding.

Awaiting testing on standard sets.

Change-Id: I1ff4a587e0a47667eb93b197f39b79a1130faeca
2019-03-17 22:40:15 -07:00
Jingning Han 5461ec5501 Merge "Unify the rd_mult use in rd_pick_partition" 2019-03-18 04:03:38 +00:00
Jingning Han c5b32838f4 Merge "Setup AQ mode for perceptual quality" 2019-03-18 04:03:28 +00:00
Angie Chiang d887413eca Fix race condition in wiener_var_rd_mult
Change-Id: Id5e9c2cbfe35809ac99a3bc9ba93cf462a6b1a34
2019-03-15 18:33:16 -07:00
Jingning Han 813cd26495 Unify the rd_mult use in rd_pick_partition
Abstract the control outside rd_pick_partition function. No need
to switch between x->cb_rdmult and the cpi->rd.RDMULT here.

Change-Id: Ia3104ebe15b5e59a4f29ffe6e8c7d718ecb998a8
2019-03-15 16:23:43 -07:00
Marco Paniconi 5df8d048c6 Merge "vp9-rtc: Some adjustments for low-resolns real-time" 2019-03-15 23:10:50 +00:00
Jingning Han aec12b0976 Setup AQ mode for perceptual quality
Adapt the quantization to provide higher quality at smooth regions
where the Wiener variance is smaller.

Change-Id: Ibfd594d1de2ba34d2440d0aa7991b0fdac057ea5
2019-03-15 15:47:59 -07:00
Jingning Han 15a849b5d0 Merge "Refactor tile boundary condition for intra prediction" 2019-03-15 22:43:20 +00:00
Marco Paniconi 3134b50211 vp9-rtc: Some adjustments for low-resolns real-time
Force smooth_interpol filter for low resolutions at high Q,
avoid the loopfilter strength reduction for similar condition,
and reduce thresh_motion for cyclic refresh turnoff.

Change-Id: I4e9121d1cdc7d1b04992c741dc4f0cec281592f7
2019-03-15 15:13:54 -07:00
Jingning Han 2337285a2e Merge "Setup frame segmentation for perceptual quality target" 2019-03-15 16:47:43 +00:00
James Zern d2828f773f set_analyzer_env.sh: improve cfi diagnostics
use -fno-sanitize-trap=cfi to allow a diagnostic to be printed rather
than aborting with a SIGILL.

https://clang.llvm.org/docs/ControlFlowIntegrity.html#trapping-and-diagnostics

Change-Id: I4517cafe3c7b7305ba4845dbadf9fb679c686843
2019-03-14 23:07:23 -07:00
Jingning Han bb01ac96ec Setup frame segmentation for perceptual quality target
Build the frame segmentation for perceptual quality target.

Change-Id: Icdea28aea02d84ce9c8011f63a3f97133f94a141
2019-03-14 20:29:24 -07:00
Jingning Han 3a0d78272f Merge "Let vp9_kmeans provide boundary for each group" 2019-03-15 03:24:23 +00:00
Angie Chiang c309cccb28 Let vp9_kmeans provide boundary for each group
boundary_ls[j] is the upper bound of data centered at ctr_ls[j]

Add vp9_get_group_idx() for computing group_idx

Change-Id: I3b1b488edf8acbfb63c469eeeba15f3e42b0a645
2019-03-14 16:50:51 -07:00
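
One way to read the commit above: once the cluster centers are sorted, each group can be described by an upper bound, and a value's group is the first bound that is not below it. The sketch below assumes each bound is the midpoint between adjacent centers and that a value equal to a bound belongs to the lower group; both choices and the function names are illustrative assumptions, not the actual vp9_kmeans() or vp9_get_group_idx() implementation.

```python
import bisect
import math


def group_boundaries(centers):
    """Upper bound for each group, given 1-D cluster centers.

    Assumption: the bound between two neighbouring groups is the midpoint
    of their centers; the last group is unbounded above.
    """
    ctrs = sorted(centers)
    bounds = [(a + b) / 2 for a, b in zip(ctrs, ctrs[1:])]
    bounds.append(math.inf)
    return bounds


def group_index(value, boundaries):
    """Index of the first group whose upper bound is >= value."""
    return bisect.bisect_left(boundaries, value)


bounds = group_boundaries([2.0, 10.0, 20.0])
print(bounds)                  # [6.0, 15.0, inf]
print(group_index(7, bounds))  # 1
```

Storing the bounds lets later lookups skip the distance comparison against every center.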
Jerome Jiang effbd82f6a Merge "Enclose macro arguments in parentheses" 2019-03-14 23:08:43 +00:00
Jingning Han 14208ab41e Refactor tile boundary condition for intra prediction
Explicitly compare the block location against tile coordinate to
decide if intra prediction boundary is available. No coding stats
will be changed by this refactoring.

Change-Id: I80b3a131366bb2c5f8ea53a139ed6e9b0b7ddb68
2019-03-14 15:12:42 -07:00
Jerome Jiang 1ece42aaf7 Enclose macro arguments in parentheses
BUG=webm:1606

Change-Id: I661485b860243c95b6450035dbac77b0dd4d9ff4
2019-03-14 14:57:33 -07:00
Angie Chiang 0fdc9af7b2 Merge changes Ifc006890,I753920a6
* changes:
  Apply kmeans on log of wiener_variance
  Add vp9_kmeans()
2019-03-14 17:22:09 +00:00
Marco Paniconi 181631ac16 Merge "vp9-svc: Reorganize the simulcast mode" 2019-03-14 04:51:46 +00:00
Marco Paniconi 1533bd84f1 Merge "vp9-rtc: Avoid TM intra on big blocks for screen" 2019-03-14 01:59:49 +00:00
Angie Chiang 316e7f62db Apply kmeans on log of wiener_variance
Change-Id: Ifc0068902e036cd5d94fc5d39f0349a5875469b7
2019-03-13 18:01:29 -07:00
Angie Chiang 523f1f470c Add vp9_kmeans()
Change-Id: I753920a65fb14a252617444f49feb40dc332d766
2019-03-13 17:59:53 -07:00
Angie Chiang 30fc1a814d Merge "Safe guard zero median filter outcome case" 2019-03-14 00:18:04 +00:00
Angie Chiang 7e96649346 Merge "Refactor speed feature settings" 2019-03-14 00:13:47 +00:00
Marco Paniconi 0ebc986a33 vp9-rtc: Avoid TM intra on big blocks for screen
For screen content real-time mode: don't check TM
intra for bsize >= BLOCK_32X32.

Small speedup and avoid some artifacts seen
in scrolling screen content.

Change-Id: I72d7731eeb6ac9ee96e65af522c1a9aabb6dc4ef
2019-03-13 15:59:20 -07:00
Marco Paniconi ae365775e7 vp9-svc: Reorganize the simulcast mode
Set the lst/gld/alt_fb_idx and refresh flags for
key frames at the start of encoding (in svc_set_params).
This then avoids new code/function in update_references()
and in copy_flags_ref_update().

Change-Id: Id3503c0c628540c20f11a540c118c4ee4cf04848
2019-03-13 15:37:55 -07:00
Marco Paniconi ddafa2a11e Merge "vp9 simulcast: update buffer slot flag used in API." 2019-03-13 20:16:17 +00:00
Jingning Han e2435ff1f8 Safe guard zero median filter outcome case
In case median filter returns 0 value, bypass the Wiener filter
stage. This avoids a potential divide-by-zero case.

In the same place use a temp variable to take the Wiener filter
output instead of returning to the coeff array.

Change-Id: I45f57c515b4062a0aa1f312eda852462cb655d8e
2019-03-13 12:11:04 -07:00
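
The guard described in the commit above can be illustrated with a small sketch. It assumes the noise level is the median coefficient magnitude and that the Wiener step scales each coefficient by c^2 / (c^2 + noise^2); the exact formula, the function name, and the use of floats are assumptions for illustration rather than the encoder's code.

```python
def wiener_variance(coeffs):
    """Sum of squared Wiener-filtered coefficients for one block.

    Assumptions: noise is the median of |coeff|, and the filter gain is
    c^2 / (c^2 + noise^2). When the median is 0 the filter is bypassed,
    which avoids a 0 / 0 division for zero-valued coefficients.
    """
    magnitudes = sorted(abs(c) for c in coeffs)
    noise = magnitudes[len(magnitudes) // 2]  # median as noise estimate
    total = 0.0
    for c in coeffs:
        filtered = float(c)  # temporary; the input array is left untouched
        if noise != 0:
            energy = c * c
            filtered = c * energy / (energy + noise * noise)
        total += filtered * filtered
    return total


print(wiener_variance([0, 0, 0, 5]))  # 25.0 (median is 0, filter bypassed)
print(wiener_variance([2, 2, 2]))     # 3.0  (each coefficient shrinks to 1.0)
```

Keeping the filtered value in a temporary mirrors the second point of the commit: the coefficient array is not overwritten by the filter output.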
Jerome Jiang 8a05413b0b vp9 simulcast: update buffer slot flag used in API.
Also add #spatial_layers > 1 to simulcast_mode

Change-Id: I6234da81801176ac8678f9f5e1323f8b289cb663
2019-03-13 11:47:53 -07:00
Jingning Han 1c07e79ef1 Refactor speed feature settings
Make the speed feature setup functions take speed argument as
their input.

Change-Id: I542e8f6e04658e5d99e972380a31baab99a4fc23
2019-03-13 11:21:14 -07:00
Jingning Han 776daa071e Adaptive multiplier based on the Wiener variance
Adapt the Lagrangian multiplier based on the Wiener variance at
64x64 block level.

Change-Id: Ica195ed6f706daf6eee156d4b1a55bda65a92f7b
2019-03-13 10:41:34 -07:00
Jingning Han b8794de05b Merge "Add normalization over frame level Wiener variance" 2019-03-13 16:18:09 +00:00
Jingning Han 1222293604 Merge "Set up Wiener variance for macroblocks in a frame" 2019-03-13 16:17:52 +00:00
Jingning Han 6db96a7f9c Add normalization over frame level Wiener variance
Normalize the block level Wiener variance based decision according
to the frame level Wiener variance.

Change-Id: Ic2bdf1b322a65661775541dd6c174ba71579461a
2019-03-12 22:25:08 -07:00
Jingning Han ff36b9c78b Set up Wiener variance for macroblocks in a frame
This commit introduces a Wiener variance term. For each block in
the source frame, we first estimate its film grain noise level
using median filter in the transform domain. Each transform
coefficient is then processed using Wiener filter to account for
the impact on the energy level due to film grain noise. The result
leads to a second moment of the denoised signal.

Change-Id: Ibce7cb1b0cb8fe1aba807d95289712271d576948
2019-03-12 22:23:15 -07:00
chiyotsai 7969c6e0b7 Remove highbd_temporal_filter_sse4.c when REAL_TIME_ONLY is on
Change-Id: I3ca7442b10cbea4dd5dbabe147687d1cb3cce4d8
2019-03-12 13:55:02 -07:00
Jerome Jiang e751879df2 Merge "vp9-decoder: use long int for buffer offset." 2019-03-12 00:55:34 +00:00
Jerome Jiang c72dc3963e Merge "vp9: map speed > 9 to speed 9." 2019-03-11 22:45:47 +00:00
Jerome Jiang 59578327a5 vp9-decoder: use long int for buffer offset.
Integer overflow occurs when the frame size is too big.

BUG=webm:1603

Change-Id: Ifbb81b5fb6a2043d09d403e7c50ab8d7bf125dca
2019-03-11 15:24:37 -07:00
Paul Wilkins 9b96d2846b Add block size scaling in rd_variance_adjustment()
Scale the block variance values used in this function
to a common block size.

Change-Id: I73ad7d48b2621f312d771ee0dd7b6fc59cfc1652
2019-03-11 13:52:34 +00:00
Paul Wilkins b5707b6f14 Changes to rd_variance_adjustment()
Always calculate per block variance values vs per pixel.

Change-Id: I760b3ba1a250d7544813a1b93923eedc207cbd60
2019-03-11 13:51:48 +00:00
Paul Wilkins c48b2c96ad Change to mode early breakout rules for FILM mode.
The biggest offender in terms of preventing retention of film
grain in high rate film content is the use of DC-PRED mode.
Some of the directional modes whilst not strictly preserving
grain do better at preserving at least some texture.

This change blocks the early breakout of the rd loop based
on the reference frame giving the best result so far. In practice,
unless DC-PRED was chosen as the best mode so far, the other
directional intra modes would not even be considered.

As the film grain mode also tends to bias against DC-PRED (or
intra in general) this was pretty much blocking all use of directional
intra modes.

The patch also allows for a broader spectrum of DC modes at the
16x16 transform level than previously.

Change-Id: I860b7726ea9f5fcbb3ec1a90edbdd8cade2e8b28
2019-03-11 13:50:33 +00:00
Paul Wilkins 9ce7bf835a Store a group level noise metric from first pass.
Adds a variable to GF group structure to store a noise
energy metric for the current group that can be used
in things like film grain retention code.

Change-Id: I81b07630d3242f7928110f19a6c1ed4c86125f05
2019-03-11 13:49:14 +00:00
Marco Paniconi 855a71dfda vp9-screen: Rework the mode skip logic for screen
Don't force skip of zero-golden reference when
zero_temp_sad_source = 0, as it may be the
inter-layer reference. And remove the flatness condition
when superblock is static.

Change-Id: I6b4b6eac0f6a2abc862c23d0e5467c7cf61995ef
2019-03-10 18:43:54 -07:00
Marco Paniconi a7b8461992 Merge "vp9-screen: Fix to screen with layered encoding" 2019-03-08 19:38:43 +00:00
Marco Paniconi 8d488e8c3b vp9-screen: Fix to screen with layered encoding
zero_temp_sad_source is only computed when
compute_source_sad_onepass and sf->use_source_sad are
on, which currently is only for the top layer of the
layered encoding. So qualify the usage of
zero_temp_sad_source on those flags.

This affects the quality/speed of the lower layers of
screen content mode when SVC (quality layers) are used.

Change-Id: I54167265a05a4b918ce015931375aa42d3e75cf5
2019-03-08 10:41:36 -08:00
Angie Chiang af5e630915 Merge changes Ib8e635fc,If868f67c,Ibfeae411
* changes:
  Include mv_mode_arr info in dump_tpl_stats
  Include gf_frame_offset in dump_tpl_stats
  Refactor dump_tpl_stats
2019-03-08 18:31:57 +00:00
Chi Yo Tsai 47f8ce0f7c Merge "Optimize SSE4_1 lowbd temporal filter implementation" 2019-03-08 17:46:08 +00:00
Jerome Jiang ae8763f153 Merge "vp9 svc: add simulcast mode when inter-layer pred is off." 2019-03-08 03:41:44 +00:00
chiyotsai 85fbf6569c Optimize SSE4_1 lowbd temporal filter implementation
- Change some unaligned loads to aligned loads
- Preload filter weights

BUG=webm:1591

Change-Id: I4e5e755e1fa5613d1c14191265bf80b0bfd0b75c
2019-03-07 15:55:06 -08:00
Jerome Jiang 3529526e11 vp9 svc: add simulcast mode when inter-layer pred is off.
Force all upper spatial layers to be key frame if the base layer is key.
Mode only works for inter-layer pred=off and non-flexible mode.

Add flag to write out bitstream for each spatial layer in example
encoder.

Change-Id: I5db4543cf8697544ae49464f2157e692640d5256
2019-03-07 15:20:13 -08:00
Chi Yo Tsai 75e175eec2 Merge "Add SSE4_1 highbd version of temporal filter" 2019-03-07 22:24:26 +00:00
Marco Paniconi f93d5dd0e6 vp9-svc: Fix to sample encoder for 1 layer
Fix to vp9_spatial_svc_encoder to run case of
1 spatial, 1 temporal layer.

Change-Id: I93675c3c4bd2c55cb1c971679588525a8e5b889d
2019-03-07 12:31:59 -08:00
Marco Paniconi 8256c8b297 vp9-rtc: Improve mode check on flat blocks in screen mode
For nonrd-pickmode in screen content mode: modify logic for
inter and intra mode check for spatially flat blocks.
Condition skip of non-zero/zero inter mode based on
zero_temp_sad_source, and force intra/DC check regardless.

Reduces artifacts in scrolling motion.

Change-Id: Iee75cd19d03296afeb649c5bce628806103769ae
2019-03-06 20:48:01 -08:00
Johann Koenig 71032fde79 Merge "add -Wmissing-prototypes" 2019-03-06 23:59:20 +00:00
Marco Paniconi df7039cf9a vp9-denoiser: Bias to last for golden long term
If golden reference is selected as long-term reference,
bias the denoiser filter to use last reference.
Fixes visual artifact.

And reduce the thresh_svc_golden, which was used
to reduce the artifact occurrence.

Change-Id: I08f24160ca11bd8f5f70acaefe989d5f92988132
2019-03-06 11:29:39 -08:00
Angie Chiang cbd966f1ab Include mv_mode_arr info in dump_tpl_stats
Change-Id: Ib8e635fc7522d27ff7fdb62661597115e5bbc9b8
2019-03-05 18:13:45 -08:00
Jerome Jiang 0b3d922688 vp9: map speed > 9 to speed 9.
Report warning in example encoder.

Change-Id: Iec4cdffce9faa65241756fbdac498214c8b93cc1
2019-03-05 15:22:16 -08:00
Angie Chiang e14958ea73 Include gf_frame_offset in dump_tpl_stats
Change-Id: If868f67ccc1c73189bc4c139a807d7341e59b668
2019-03-05 15:13:41 -08:00
Jerome Jiang d64e328624 Merge "vp9 svc example: use CONFIG_VP9_DECODER guarding decoding." 2019-03-05 22:50:39 +00:00
Angie Chiang 543aeef873 Refactor dump_tpl_stats
Only dump stats when ref frame exists.
Dump ref_frame_idx

Change-Id: Ibfeae4111697b8ce97d7fe9b56c2487623615748
2019-03-05 11:59:00 -08:00
Jerome Jiang 269b36ef43 vp9 svc example: use CONFIG_VP9_DECODER guarding decoding.
Change-Id: I91f2955f2936303c3e09e9b2dc60e32305ebae17
2019-03-05 11:11:16 -08:00
Angie Chiang a8a6f86e58 Add rd_diff_arr to store future blocks' rd diff
Change-Id: Id996c1a427fb22a32b7a521cadf9f1523e5cf068
2019-03-05 18:52:10 +00:00
Angie Chiang a0944ba19f Merge "Fix compile error when DUMP_TPL_STATS is 1" 2019-03-05 18:51:42 +00:00
Johann 4b357bd15b add -Wmissing-prototypes
clang treats -Wmissing-declarations differently than gcc. This
provides similar coverage for clang.

Fix vpx_clear_system_state() warning on 32bit builds:
  note: this declaration is not a prototype; add 'void' to make it a
  prototype for a zero-parameter function

Change-Id: I5a424bc38d47c0a3dc751d65c1efea5733907785
2019-03-04 18:02:09 -08:00
Angie Chiang 03a0ced87a Fix compile error when DUMP_TPL_STATS is 1
Change-Id: Ie859c2c8e377b6b0982293833ddc657855b18091
2019-03-04 17:20:20 -08:00
Marco Paniconi aba995832f vp9-rtc: Speed feature changes to speed 9
Compared to speed 8 for low resolutions:
quality loss is ~8-10%, and encoder fps is ~15%
higher on ARM for 1 thread.

Change-Id: I4f12390d2917a5c4045114ef81a05edb2a3b9c96
2019-03-04 11:45:20 -08:00
chiyotsai 50f0dd8ee9 Add SSE4_1 highbd version of temporal filter
The SSE4_1 version of temporal filter does not distinguish between bd 10
and bd 12.

Speed up:
 Function Level:
       | !SS_X |  SS_X
 !SS_Y | 6.44X | 6.37X
  SS_Y | 6.56X | 6.63X

 Video Level:
  2.5% speed up on basketballpass_240p over 150 frames on speed 1,
  bitdepth 10, auto-alt-ref=1

BUG=webm:1591

Change-Id: I49aa2ed4acfe80a8d627038322de66cbe691296e
2019-03-04 11:09:01 -08:00
Marco Paniconi c73a44c725 vp9-rtc: Adjust force split logic for screen mode
In variance partition for screen content mode:
force split to 32x32 if source pre-process detects
non-zero temporal sad.
Reduce artifacts in scroll motion content.

Change-Id: Ifbe2b500eb03ae853faa28a045ce4f1185443939
2019-03-01 22:49:07 -08:00
Marco Paniconi 43fc087908 vp9-rtc: Fix for scroll motion for rtc
Increase threshold to detect frames with high
num of motion blocks, and fix conditions to detect
horiz & vert scroll and avoid split below 16x16 blocks
in variance partition.

Reduces artifacts in horizontal scroll screenshare testing.

Change-Id: Icf5b87f69971d7331c660fc2727c9246c6cbf8b5
2019-03-01 11:20:38 -08:00
Paul Wilkins 8971779e60 Merge "Strengthened film grain setting." 2019-03-01 13:42:48 +00:00
Paul Wilkins 6b7873826a Merge "Fix RD multiplier bug impacting AQ1." 2019-03-01 13:42:11 +00:00
Marco Paniconi 503cb8e63a vp9-rtc: Set init noise level based on resoln
Avoid the kLow init level for lower resolns.

Change-Id: I1c9968a6891668b5887e35695f2a44158a4b3a18
2019-02-28 19:59:13 -08:00
Marco Paniconi b4fa2bdd92 vp9-rtc: Reduce thresholds for skip golden
For nonrd-pickmode CBR mode: reduce the skip
golden ref thresholds, to reduce some psnr
regression in some clips, while still effectively
reducing flashing block artifact occurrence.

Change-Id: I468dcf5354411aeb54ac3ef56c6fb73267d93fde
2019-02-28 17:57:08 -08:00
Jerome Jiang dc6306e587 Merge "Set segment ID from ROI map if enabled." 2019-02-28 19:00:25 +00:00
Jerome Jiang ade3bbf8be Set segment ID from ROI map if enabled.
Segment ID was overwritten.

Change-Id: I99603dce02a94f3a9076d1743b108a81289ad0e5
2019-02-28 09:32:49 -08:00
Marco Paniconi a5d72da1e7 Merge "vp9-rtc: Change init level of denoiser & noise level" 2019-02-28 17:21:24 +00:00
Paul Wilkins 59b17593cd Strengthened film grain setting.
This patch increases the preference for maintaining similar variance
between source and reconstruction and thus helps improve film grain
retention.

The changes are only active when film mode is selected

Change-Id: I3bc082dca678a0f32ec00f30f5d90d0f95ca2381
2019-02-28 13:13:26 +00:00
paulwilkins 135fe47602 Fix RD multiplier bug impacting AQ1.
Change to the default RD multiplier computation in set_segment_rdmult()

The default here is wrong as for modes like AQ 1 setting the rdmult based on the
segment ID for bsize will tend to result in the RD loop favoring partition sizes where
the resulting segment assignment has the lowest Q, as these partition sizes will be
then evaluated with a lower value of rdmult. For a valid rd comparison between
partition sizes within a single SB64 we need to use the same value of rdmult.

This change fixes an observed issue with AQ 1 where almost all the blocks were being
assigned to segment 0.

Change-Id: Ibf87e8ca60bca45b8fee866ac6fd53feae11dab4
2019-02-28 13:03:42 +00:00
Paul Wilkins 0907bfef4c Merge "Change to direction of scan for GF only group boost." 2019-02-28 12:58:20 +00:00
Marco Paniconi 75da091036 vp9-rtc: Change init level of denoiser & noise level
Change to init/reset level of the denoiser from
kDenLow to kDenMedium, and the init noise level to kLow.
This affects the denoiser level during the initialization
stage of the noise estimation.

Improves denoising for noisy content during init stage of
noise estimation, with little effect for low noise/clean content.

Change-Id: I247a17b0f01f646fc2e91a4a070ad69bdb788cae
2019-02-27 16:25:22 -08:00
Marco Paniconi dfcf95162b Merge "vp9-rtc: Add cyclic_refresh condition to segment reset" 2019-02-27 22:45:56 +00:00
Marco Paniconi 7ff1e40a66 Merge "vp9: Remove unused function in cyclic refresh" 2019-02-27 22:08:09 +00:00
Marco Paniconi 0c3e23f140 vp9: Remove unused function in cyclic refresh
Change-Id: I4b1d02c661ccbad2a1e346df623e78334a3a3a39
2019-02-27 12:07:16 -08:00
Marco Paniconi 539230a075 vp9-rtc: Add cyclic_refresh condition to segment reset
In non-rd pickmode for screen content:
this logic to reset segment should only be for cyclic_refresh
mode on, so add that condition explicitly.

There may be other uses of segments, like ROI, so we
should condition this reset logic on cyclic_refresh,
as it was intended for that mode only.

Change-Id: I954e6cee968fbca35b34286c4a7ca2531c8e9823
2019-02-27 12:03:22 -08:00
Marco Paniconi 5f84090ab3 vp9-rtc: Modify skip golden mode check for 1 layer
For real-time CBR mode: golden reference mode testing is
skipped under certain conditions based on sse of zero-last mode.
This was done for svc mode. Here we add similar condition
for non-svc/1 layer encoding.

Reduces flashing block artifacts that can occur in background
areas with noise.

Change-Id: I93f71ea9507af8c9153fc6c0ba7dcc7a0fa8810d
2019-02-27 11:49:46 -08:00
Marco Paniconi 2d403737b8 Merge "vp9-rtc: Fix to Q clamp in adjust_q_cbr" 2019-02-25 04:43:13 +00:00
Marco Paniconi 892c855313 vp9-rtc: Fix to Q clamp in adjust_q_cbr
For CBR mode: clamp the Q to worst/best quality in
adjust_q_cbr().

Under certain conditions, when the worst/best quality is
suddenly changed by a large amount mid-stream, the Q
adjustment from the final Q from adjust_q_cbr may not respect
the worst/best quality limits.

Change-Id: I3776129325d89882d422b22e6247d44660dd90ac
2019-02-24 19:09:38 -08:00
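As an illustration of the clamp described in the commit above, here is a minimal sketch in C. It is not the libvpx implementation of adjust_q_cbr(); `clamp_q` and its parameter names are hypothetical, and it assumes that a lower Q index means better quality, so that best_quality <= worst_quality.

```c
/* Hypothetical helper (not libvpx code): keep q within
 * [best_quality, worst_quality]. A lower Q index is assumed to mean
 * better quality, so best_quality <= worst_quality. */
static int clamp_q(int q, int best_quality, int worst_quality) {
  if (q < best_quality) return best_quality;
  if (q > worst_quality) return worst_quality;
  return q;
}
```

Applying such a clamp as the last step means that a Q adjusted from an earlier, wider range is pulled back inside the limits in force for the current frame.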
Marco Paniconi d2fb6a1785 vp9-rtc: Decrease noise estimation thresh for kLow
Increases denoising for noisy content.

Change-Id: Iff8573f8dca7b177ef53ee6682d691e6cd8e2740
2019-02-22 20:05:37 -08:00
kyslov 7e1c766158 Fix 32-bit unsigned integer overflow in block_variance
When compiled for High Bitdepth SSE can overflow 32-bit unsigned
integer, so change it to 64 bit. Also fixing uint/int mismatch of sum

BUG=webm:1601

Change-Id: Ib576ed1d5579b0c2b4661058aa64119560b652bf
2019-02-22 15:55:18 -08:00
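To illustrate the overflow described in the block_variance fix above: with 12-bit samples a single squared difference can reach 4095 * 4095, so a 64x64 block can accumulate a sum of squared errors far beyond what a 32-bit unsigned integer holds. The following is a minimal sketch in C, not the libvpx code; `block_sse` and `worst_case_sse` are hypothetical names chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the libvpx function): sum of squared errors
 * between two blocks of high-bitdepth samples, accumulated in 64 bits. */
static uint64_t block_sse(const uint16_t *a, const uint16_t *b, size_t n) {
  uint64_t sse = 0;
  size_t i;
  for (i = 0; i < n; ++i) {
    const int32_t diff = (int32_t)a[i] - (int32_t)b[i];
    sse += (uint64_t)((int64_t)diff * diff);
  }
  return sse;
}

/* Largest possible SSE for a block of `pixels` samples at `bit_depth`. */
static uint64_t worst_case_sse(int bit_depth, size_t pixels) {
  const uint64_t max_diff = ((uint64_t)1 << bit_depth) - 1;
  return max_diff * max_diff * (uint64_t)pixels;
}
```

For a 64x64 block at 12 bits, worst_case_sse(12, 64 * 64) is 68685926400, which exceeds UINT32_MAX (4294967295).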
Marco Paniconi d84bf53f60 vp9-rtc: Adjust thresholds for noise estimation.
Lower the frame_motion and consec_zeromv thresholds,
this makes the noise estimation and denoiser have more effect
on noisy clips.

Change-Id: I49cf5d78a04d00fcf8538bee6f3b2980efe6b3b5
2019-02-21 19:34:37 -08:00
Jerome Jiang 986b2bef7f Merge "vp9: Enable ml based partition for speed>=8 low res." 2019-02-20 23:03:13 +00:00
James Zern d6f7529d15 vpx_thread.h: remove unused sched_yield()
usage was removed with:
c1b024b48 Modify map read/write to sync logic in row_mt case

Change-Id: I515fe397083079a4f11702e67c322fd04bdcf410
2019-02-19 21:01:17 -08:00
Ranjit Kumar Tulabandu 097afb6f43 Merge "Fix integer overflow issue in bits allocated" 2019-02-20 04:35:35 +00:00
Yaowu Xu b0ddf48c24 Disable encoder builds when checking coeff ranges
BUG=webm:1031

Change-Id: I28f4e8cdec170393b2d22cd8cb0b73a32204e09c
2019-02-19 17:27:06 -08:00
Jerome Jiang eed8d47769 vp9: Enable ml based partition for speed>=8 low res.
~10% speed up with no quality change for speed 8.
7% quality gain for speed 9 with no speed change.

Change-Id: I7eaaa4b82f7b082c9b15aa1d7624765ecc5082e7
2019-02-19 16:10:35 -08:00
Matt Oliver c2c34ca8d1 project: Update for 1.8.0 merge.
Closes #4
2019-02-17 22:02:57 +11:00
Matt Oliver 2031ad483b Merge commit 'b85ac11737430a7f600ac4efb643d4833afd7428' 2019-02-17 18:47:07 +11:00
Aidan Welch a90944ce79 added error messages in vpx_video_reader_open().
Change-Id: I3e521b62a2f99902c4be80fe25d3869121673e43
2019-02-14 11:29:01 -08:00
Angie Chiang 2543f37a33 Fix compile warnings in non-greedy-mv
Change-Id: Ib2bd9a74473ccb00e9ad71e0b186c8ddc0ee7b3c
2019-02-13 11:10:48 -08:00
Angie Chiang d804d8b964 Only discount new mv mode when tpl model is ready
Change-Id: I3326f0912627981fd604b16ddbf668d2262d4287
2019-02-13 11:10:48 -08:00
Angie Chiang 1e12d19bbf Check the new mv diff in discount_newmv_test()
Change-Id: I38c5d4de93bebfd3f46bcc01716a0cc4a76af950
2019-02-13 11:10:48 -08:00
Angie Chiang e3fae04785 Use mv_mode_arr to decide the newmv discount place
Change-Id: I98c32aba4c9e81380b588dcdbfa991468487ce73
2019-02-13 11:10:48 -08:00
Angie Chiang 36f42a3769 Fix the bug for feature_score computation
The visited flag is not set to 1 after an item is pushed into the heap.
This may cause one item to be pushed into the heap multiple
times, which may cause buffer overflow and memory corruption.

Change-Id: I443f1e5693856bb4066542403f98492d4daec69d
2019-02-13 11:10:48 -08:00
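The class of bug described in the feature_score fix above can be sketched as follows. This is an illustrative example in C, not the libvpx code: it uses a plain fixed-size array instead of a heap, and `ItemQueue`, `push_once` and `queued_count_demo` are hypothetical names.

```c
#include <stddef.h>

#define MAX_ITEMS 4

typedef struct {
  int items[MAX_ITEMS];
  size_t size;
} ItemQueue;

/* Push `id` only if it has not been queued before, and mark it visited
 * at push time so that it can never be queued twice. Returns 1 if pushed. */
static int push_once(ItemQueue *q, unsigned char *visited, int id) {
  if (visited[id]) return 0;
  visited[id] = 1; /* the step whose omission allows duplicate pushes */
  q->items[q->size++] = id;
  return 1;
}

/* Try to queue the ids 1, 1, 2, 1 and report how many were stored. */
static size_t queued_count_demo(void) {
  ItemQueue q = { { 0 }, 0 };
  unsigned char visited[MAX_ITEMS] = { 0 };
  const int ids[] = { 1, 1, 2, 1 };
  size_t i;
  for (i = 0; i < sizeof(ids) / sizeof(ids[0]); ++i) {
    push_once(&q, visited, ids[i]);
  }
  return q.size;
}
```

Because each id is stored at most once, the container never needs more slots than there are distinct items. Without the visited update, repeated pushes of the same id could run past the end of the fixed-size array.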
Jerome Jiang e2381829e9 Merge "vp9: ML var partition as speed feature & cleanup." 2019-02-12 23:24:48 +00:00
Marco Paniconi 6581817991 vp8: Limit Q change for screen content CBR mode
Add last_q[] to layer context, and add limit on
Q change from previous layer/frame. For now put
hard limit of 12 for decrease.

For 1 pass CBR screen content mode.

Change-Id: Ifb972c9b6831440c80b1cb07a054c577ece930ec
2019-02-12 11:22:35 -08:00
Jerome Jiang 91a9935717 Merge "Test decode and find mismatch in vp9 svc example encoder." 2019-02-12 17:45:16 +00:00
Jerome Jiang c10833faa6 Test decode and find mismatch in vp9 svc example encoder.
Also write it to opsnr.stt when internal stats is enabled.

Removed some redundant code in vpxenc.c and vp9cx_set_ref.c

Change-Id: I3700137fff0be92a23e4ab75713db72da1dc4076
2019-02-11 20:18:07 -08:00
Jerome Jiang 793c45305e vp9: ML var partition as speed feature & cleanup.
Remove it from runtime flag.

Add new struct for rd ml partition.

BUG=webm:1599

Change-Id: I883edbba83c65b7e557b8832419e212cffc85997
2019-02-11 14:03:50 -08:00
Marco Paniconi a192428449 vp8: Fix condition for update of last_pred_err_mb
For 1 pass cbr screen-content mode: quantity should
only be updated on delta frames.

Change-Id: I16fc47b2805c7527ab4ff25bd8b5a5bd9c2b8976
2019-02-11 12:07:47 -08:00
Jerome Jiang db50d72945 Merge "refactor vp9 svc example encoder." 2019-02-11 17:42:53 +00:00
Jerome Jiang 37e1295c61 refactor vp9 svc example encoder.
Put rc stats related code into a separate function.

Change-Id: I11808bb947079b5fd9e53dfa5894bf227ed0c4c6
2019-02-11 09:37:37 -08:00
Angie Chiang b5be251510 Merge changes Ibbe12dad,I4bf9e2ad
* changes:
  Add tpl_bsize to VP9_COMP
  Compute future_rd_diff in predict_mv_mode
2019-02-09 01:20:19 +00:00
Jerome Jiang 2221ec88d5 vp9 svc example encoder accept -o (--output) for output.
Make it same as vpxenc so easier to run on borg.

Change-Id: Ie19db6e828ced773cba9aef715c8fbd0f4715b27
2019-02-08 15:47:17 -08:00
Fyodor Kyslov 51da38f8ab Merge "Fixing ClangTidy issues reported by downstream integration" 2019-02-07 23:37:33 +00:00
kyslov 4f50dc943d Fixing ClangTidy issues reported by downstream integration
ClangTidy reported 16 issues. All are around typecasting and
straightforward

Change-Id: Ie8f9fc2ba7992dd44fef65b121fe65966a1a1297
2019-02-07 13:35:34 -08:00
Hui Su f4822bb957 Merge "Refactor the model for rect partition pruning" 2019-02-07 21:30:47 +00:00
Jerome Jiang af9aa25eb3 Merge "vp9: Write height and width to ivf header in SVC example encoder." 2019-02-07 19:31:50 +00:00
Jerome Jiang b1ef3919b9 vp9: Write height and width to ivf header in SVC example encoder.
Write height and width of top layer to ivf header in SVC.

vpxdec can't decode it correctly when output is y4m.

Change-Id: I9b2f1d54696611a30e252bdfd182897d191d92b5
2019-02-07 10:38:14 -08:00
Hui Su 33218c2034 Refactor the model for rect partition pruning
Remove the block variance and skip flags from the input features. They
do not seem to reduce the average loss of the model. Also decrease the
number of hidden nodes. The model size is reduced significantly.

Compression quality and speed are both neutral.

Change-Id: Ic62f73c4f4c0a3148285f575747f0423ff568c64
2019-02-07 09:52:26 -08:00
Yaowu Xu c852765d7d Use wide integer to avoid overflow
BUG=webm:1270

Change-Id: I7d56667d946196bbbe355303de805422e40b0763
2019-02-06 16:02:39 -08:00
Angie Chiang 32b75587a7 Add tpl_bsize to VP9_COMP
Change-Id: Ibbe12dade04b218a41de9b65bbedba0054a69d83
2019-02-05 18:44:00 -08:00
Angie Chiang e43f0a4bef Compute future_rd_diff in predict_mv_mode
The future_rd_diff computes the future rd difference between
new mv mode and ref mv mode.

Change-Id: I4bf9e2ad34257ba9cfec95419c2c5eca469584e9
2019-02-05 18:10:06 -08:00
Jerome Jiang ce4336c2ab Merge "No vpx_img_alloc for y4m input in example encoders." 2019-02-06 00:57:28 +00:00
Jerome Jiang e05cea7878 Merge "Fix VPX_KF_DISABLED." 2019-02-06 00:33:30 +00:00
Johann Koenig f7779e7390 Merge "enforce some c89 restrictions" 2019-02-05 23:29:51 +00:00
Jerome Jiang 9b2378bdf7 Fix VPX_KF_DISABLED.
VP9 encoder still inserts key frame periodically when VPX_KF_DISABLED is
set in non SVC for 1-pass CBR.
BUG=webm:1592

Change-Id: Ie99d7c5b95230d739e263a2d87879693c53f620e
2019-02-05 14:58:23 -08:00
Hui Su b5c8117385 Merge "Improve the partition split prediction model" 2019-02-05 22:20:10 +00:00
Jerome Jiang a4525dccec No vpx_img_alloc for y4m input in example encoders.
Y4M reader has its own allocation.

Change-Id: Ie02440a183126072ea773860f4e9dc9b412772f5
2019-02-05 14:12:54 -08:00
Johann 060491ca21 enforce some c89 restrictions
Block "for (int i;;)" style declarations.

Use --std=gnu89 to avoid enforcing c89-style comments.

Change-Id: Ia7d1405eac647d04e92513c047773695e8d7dc6e
2019-02-05 14:03:21 -08:00
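For reference, the loop style that the restriction above requires looks like the sketch below. `sum_below` is a hypothetical function written for this example. The C99 form `for (int i = 0; ...)` declares the counter inside the for statement, which compilers in C89/gnu89 mode may reject or warn about.

```c
/* C89 style: the loop counter is declared at the top of the block,
 * not inside the for statement. */
static int sum_below(int n) {
  int i;
  int total = 0;
  for (i = 0; i < n; ++i) total += i;
  return total;
}
```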
Johann Koenig 088f2d7992 Merge "ppc: use c89 loop declaration" 2019-02-05 21:51:03 +00:00
Johann 794d994f05 ppc: use c89 loop declaration
Change-Id: Ib8ca37f1b58e9903e7efa29689a0a49f14b4d73a
2019-02-05 12:20:54 -08:00
Marco Paniconi ba38d25c3d Merge "vp8: Add extra conditon for overshoot-drop" 2019-02-05 18:30:57 +00:00
Angie Chiang 5316cfe95a Merge "Add dist scale in get_mv_dist" 2019-02-05 18:27:15 +00:00
Hui Su 64abf8daf0 Improve the partition split prediction model
Include the sizes of the above and left partition block as additional
features.

This affects speed 0 and 1.

Compression change is almost neutral (about 0.03% on average).

Average encoding speedup is 3~6% depending on QP and resolution.

Change-Id: I8bddfadf6072ae757c124da0819302850d8c6fe7
2019-02-05 09:39:24 -08:00
chiyotsai 171ae2cbae Fix an inline varible declaration in temporal filter
bug=webm:1595

Change-Id: I7fbb16444a8526eb9479007772fbf52b09ff8338
2019-02-04 18:31:12 -08:00
Chi Yo Tsai 55677372a0 Merge "Add operator<< to a struct in yuv_temporal_filter_test.cc" 2019-02-04 23:52:24 +00:00
Marco Paniconi 2cdc78fc61 vp8: Add extra conditon for overshoot-drop
For drop due to large overshoot feature (in 1 pass CBR):
add additional condition that current prediction error
is larger than that of last encoded frame. This makes the
drop due to sudden overshoot more robust, and improves
rate convergence for steady hard content.

Change-Id: If20027d26b4dcd290e4f788ae8e2760d95b536a5
2019-02-04 14:57:23 -08:00
chiyotsai 2020b170b8 Add operator<< to a struct in yuv_temporal_filter_test.cc
This should resolve valgrind's warning on accessing uninitialized
values.

BUG=webm:1591

Change-Id: I678cadf448c12b598c9ea09490a7eb4e13e4118c
2019-02-04 14:24:24 -08:00
chiyotsai 4cab8adc9e Some cosmetic fixes to temporal filter
BUG=webm:1591

Change-Id: I34fd7e6cbe6f3d5486a669d0895402fd21de7641
2019-02-04 14:12:11 -08:00
Johann 3248ea0e45 Merge remote-tracking branch 'origin/northernshoveler' into HEAD
BUG=webm:1573

Change-Id: Ie92df3adfac44d7e9c143994ef4f69cd1a04e4b8
2019-02-04 11:46:43 -08:00
Johann b85ac11737 Release v1.8.0 Northern Shoveler
BUG=webm:1573

Change-Id: I2884d0d8198f937a9d14428cc9f5f7e86f4ec450
2019-02-04 09:02:33 -08:00
Harish Mahendrakar 844ca16ab1 Merge "Fix segmentation fault when num tile cols change in row-mt." 2019-02-02 04:01:57 +00:00
Harish Mahendrakar 50a93e8539 Merge "vpx_dec_fuzzer: Remove dependency on tools_common.c" 2019-02-02 02:00:39 +00:00
Ritu Baldwa 9bbbda0387 Fix segmentation fault when num tile cols change in row-mt.
Change-Id: Ifc165d76a71fcdb7c19c158c940a8d273be0d95f
2019-02-01 22:34:11 +00:00
Harish Mahendrakar 42f8012025 vpx_dec_fuzzer: Remove dependency on tools_common.c
Instead of calling get_vpx_decoder_by_name(), derive
decoder interface directly.

This will avoid dependency on tools_common and hence any potential
updates needed to build fuzzer, when tools_common uses functions
defined in a different file

With this dependency removed, fuzzer no longer needs to enable examples
when building vpx_dec_fuzzer binaries

Change-Id: I05753edf041b4bc742a6dc06e809a8a2929d379f
2019-02-01 14:06:08 -08:00
chiyotsai c447a1d51f Remove old version of temporal_filter_apply
BUG=webm:1591

Change-Id: I926566ac1bf4bac8cb1ce1c6ded9ba940109283e
2019-02-01 09:20:19 -08:00
chiyotsai 16b0221943 Add highbd test cases for apply_temporal_filter
BUG=webm:1591

Change-Id: I61dfcecc2efccdfa15b739fd6d97a24ddff05757
2019-02-01 09:20:19 -08:00
James Zern 936ada3304 vp9_temporal_filter: convert blk_fw[0] || ... to |
this matches what is done to reduce the cost of the test of filter
values in convolve.

Change-Id: I692b58801a962b593b810c1d1dac42f72c78caf9
2019-01-31 17:52:48 -08:00
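The idea behind replacing `||` with `|` in the commit above can be sketched in C as follows. This is an assumption-labeled illustration, not the code from vp9_temporal_filter: `any_weight_set` is a hypothetical helper, and the four-element array is assumed from the `blk_fw[0] || ...` expression in the commit title.

```c
/* Hypothetical helper: true if any of the four block filter weights is
 * non-zero. Bitwise | evaluates all operands without the conditional
 * branches that short-circuit || may generate; the result is the same
 * because only zero versus non-zero matters here. */
static int any_weight_set(const int blk_fw[4]) {
  return (blk_fw[0] | blk_fw[1] | blk_fw[2] | blk_fw[3]) != 0;
}
```

The substitution is only safe when the operands have no side effects and only their zero/non-zero status is tested, as is the case for plain array reads.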
Marco Paniconi a7ccc91e3a Merge "vp9: Tune qp_thresh to disable cyclic refresh for screen" 2019-02-01 01:42:34 +00:00
Chi Yo Tsai bdb1ac0822 Merge "Add highbd yuv_temporal_filter" 2019-02-01 00:11:28 +00:00
Marco Paniconi 63ceb97c0e vp9: Tune qp_thresh to disable cyclic refresh for screen
For screen-content mode, with aq-mode=3: increase the
qp thresh for disabling the cyclic refresh.
Improves bitrate convergence for content that has been
static for long period.

Change-Id: Ica63a741402923a611ab1b86c0900f75d2d5f941
2019-01-31 15:26:53 -08:00
Angie Chiang 2b6f7cb3f2 Add dist scale in get_mv_dist
Add macro VP9_DIST_SCALE_LOG2, which represents the distortion's log scale

Change-Id: Ic496a31e6d3f04626510f8c4661af464a002e361
2019-01-31 15:01:06 -08:00
Angie Chiang 7593588eb1 Merge changes I62702bdb,Ice6d06c5,I60204c62,Ib9fdf65e
* changes:
  Implement get_mv_cost()
  Add assertion in get_block_src_pred_buf
  Fix bug in predict_mv_mode
  Allocate memory for mv_mode_arr[]
2019-01-31 21:21:16 +00:00
Marco Paniconi 02f35e7e57 vp9: Adjust intra check for short_circuit_flat_blocks
For non-rd pickmode: include H and V intra mode check for
spatially flat blocks when the sf->short_circuit_flat_blocks
speed feature is set.
Small improvement on screen content tests.

Change-Id: I3391d02cce6a46160be6ccc8a1e33fd8547eb467
2019-01-31 11:46:29 -08:00
Paul Wilkins 54c0f0de44 Change to direction of scan for GF only group boost.
When coding a GF only group it makes more sense to scan forward
from the GF to choose the boost level rather than backwards from
the end of the group towards the GF.

In practice we do not often code GF only groups in normal 2 pass
encodes and when we do the video is usually almost static which means
the direction does not matter much. However, a forward scan makes
more sense and is how things used to work before we started using
arfs most of the time.

Change-Id: I64a5a731ff579c8af86d8a6718830d426b16a755
2019-01-31 17:03:55 +00:00
Deepa K G 492cdb48be Fix integer overflow issue in bits allocated
When encoding at high bitrates, integer overflow
occurs in the calculation of bits allocated
for layered ARF frames.

Change-Id: I94ad9eea759367a222235a3b5d1c777578dc6ba9
2019-01-31 11:31:53 +05:30
Jerome Jiang cde3da57b9 Merge "Add y4m input to vp9 example encoder tests." 2019-01-31 05:35:58 +00:00
Angie Chiang 9d86430ed5 Implement get_mv_cost()
The mv_cost contains mv_mode cost and mv_diff cost.

The mv_mode cost is inferred from default_inter_mode_probs.
The mv_diff cost is estimated using the log2 function.

Change-Id: I62702bdb5c3fec018e3302765f5dd749fceebc12
2019-01-30 18:30:56 -08:00
chiyotsai 967ab8b5fc Add highbd yuv_temporal_filter
This changes the highbd version of temporal filter to use information from
both luma and chroma planes.

Performance:
 AVG_PSNR | OVR_PSNR |  SSIM
  -0.144% |  -0.165% | -0.150%

The performance is evaluated on lowres_bd10.

Change-Id: I89d1bd46cd60c26d658b6a53aa63835e90d8e291
2019-01-30 18:22:13 -08:00
Marco Paniconi 037c8e6b13 vp9-svc: Fix to non-rd pickmode for screen content
For screen content mode: always force intra check
for spatially flat blocks that have moved. Also
adjust/fix condition for forcing check of
zeromv-golden for quality layers.
Reduces artifacts in screensharing tests.

Change-Id: Iafd62fb24a4e05f5b12af663dde2805fdb4c7b36
2019-01-30 16:47:52 -08:00
Jerome Jiang 97a031de43 Add y4m input to vp9 example encoder tests.
Change-Id: Ie64a3ee22e6b21e5b3a0cef4734930db3144bea0
2019-01-30 15:25:03 -08:00
Jerome Jiang e7d45715dc Merge "add y4m support to vp9 example encoders." 2019-01-30 22:06:12 +00:00
Jerome Jiang 7199f78783 add y4m support to vp9 example encoders.
vp9_spatial_svc_encoder and vpx_temporal_svc_encoder.

Change-Id: I8dfa1dfad83c83a26ddac4e7c57b5f1ff161e588
2019-01-30 12:47:10 -08:00
Angie Chiang 5dcdeb4cf8 Add assertion in get_block_src_pred_buf
Print error message and assert when ref_frame_idx is invalid

Change-Id: Ice6d06c53ddae0a77d578671b896c4e4d04d5366
2019-01-30 11:21:24 -08:00
Harish Mahendrakar 4c9316f4fa Merge changes I49a760ea,I792df86e
* changes:
  Modify map read/write to sync logic in row_mt case
  Revert "Revert "Add Tile-SB-Row based Multi-threading in Decoder""
2019-01-30 18:21:42 +00:00
Hui Su 15ea853ccd Merge "Add some const qualifiers where applicable" 2019-01-30 18:17:11 +00:00
Hui Su 6796c0cfc3 Merge "Reuse simple motion search results" 2019-01-30 18:17:05 +00:00
Chi Yo Tsai 735ac5547f Merge "Reland "Enable SSE4 version of apply temporal filter"" 2019-01-30 17:52:41 +00:00
Marco Paniconi bf59fb119e vp9-svc: Modify early breakout for non-rd pickmode
Modify early breakout condition for non-rd pickmode
for quality layers: when lower layer has lower QP force
test of zeromv on golden (lower layer reference) before
breakout due to skip.

Reduce artifacts, observed in cases of scrolling content.

Change-Id: Id834b1eb024a4c97f0e74d8b7f7a0351459e088f
2019-01-29 20:17:11 -08:00
Ritu Baldwa c1b024b488 Modify map read/write to sync logic in row_mt case
Adds conditional wait/signal instead of sched_yield.

Change-Id: I49a760eacdd6b6ac690e797ea5f10febf6a1a084
2019-01-30 09:40:29 +05:30
Angie Chiang 00e49e4e68 Fix bug in predict_mv_mode
Use kMvPreCheckLines in the for loops.

Change-Id: I60204c6294560d47421a8621d907dfa95c9dde18
2019-01-29 16:00:13 -08:00
Angie Chiang 93d1822d5f Allocate memory for mv_mode_arr[]
Change-Id: Ib9fdf65e263dbaace8d4c86766eba2c6f35f652b
2019-01-29 15:36:38 -08:00
Jerome Jiang dd08a11093 Merge "Clean up TODOs for vpx_img_* functions." 2019-01-29 23:04:33 +00:00
Hui Su 866242bb09 Add some const qualifiers where applicable
Change-Id: Ib820f625e0b616fd57a2722ec3614b4fccf307f8
2019-01-29 14:22:47 -08:00
Hui Su 82648835eb Reuse simple motion search results
In the ML based partition search speed feature, use MV result of
previous simple motion search as the starting point for the next one.

Compression change is neutral; encoding speed becomes slightly faster.

Change-Id: Iea554f28f7966fc5b5857e12b06de58e3fa312a6
2019-01-29 14:22:41 -08:00
Angie Chiang 9bd666b42c Merge changes I7dcfcdb3,Ie0b2c67b
* changes:
  Add predict_mv_mode_arr()
  Add predict_mv_mode()
2019-01-29 21:25:24 +00:00
Jerome Jiang 5e5f6db657 Clean up TODOs for vpx_img_* functions.
They should stay in tools_common.{c,h}

Change-Id: I34bd05e8b000ce780bb1f77abcb8cbfd1e83158f
2019-01-29 12:27:40 -08:00
Chi Yo Tsai 9c3557e6d5 Reland "Enable SSE4 version of apply temporal filter"
This reverts commit a4d2f59b69.

Reason for revert: Re-enables SSE4_1 version of apply temporal filter now that the mismatch is fixed in fa540837aa.

Original change's description:
> Revert "Enable SSE4 version of apply temporal filter"
> 
> This reverts commit 4f3cd48bfe.
> 
> Reason for revert: Found a mismatch with c version
> 
> Original change's description:
> > Enable SSE4 version of apply temporal filter
> > 
> > Evaluating on 5 midres clips with 4 bitrates over 30 frames on speed 1
> > auto_alt_ref=1, there is a speed up of 1.660%.
> > 
> > BUG=webm:1591
> > 
> > Change-Id: Idbda58548679e6f7b8fc0d7f6144f7be057ef690
> 
> TBR=yunqingwang@google.com,builds@webmproject.org,chiyotsai@google.com
> 
> Change-Id: Ibca973576d72d6db4b647a08aef23389d5d6605a
> No-Presubmit: true
> No-Tree-Checks: true
> No-Try: true
> Bug: webm:1591

TBR=yunqingwang@google.com,builds@webmproject.org,chiyotsai@google.com

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug: webm:1591
Change-Id: I26effdbaf4d52e4650c263b6ed9d3d80e505f5cb
2019-01-29 20:24:29 +00:00
Chi Yo Tsai 8dc6fff605 Merge "Fix mismatch between SIMD/C version of vp9_apply_temporal_filter" 2019-01-29 20:21:41 +00:00
Hui Su f79cf2c81e Merge "Refactor ml_predict_var_rd_paritioning()" 2019-01-29 17:56:29 +00:00
chiyotsai fa540837aa Fix mismatch between SIMD/C version of vp9_apply_temporal_filter
Change-Id: I6503ebc79beaac2947992437ac133f3ac4379019
2019-01-28 17:53:31 -08:00
Chi Yo Tsai 47c018930d Merge "Fix test case name for yuv_temporal_filter" 2019-01-28 23:14:07 +00:00
chiyotsai 14052cd9d8 Fix test case name for yuv_temporal_filter
This should fix valgrind's failure.

BUG=webm:1591

Change-Id: Idab2d6281484c36e6de193d6f45d13f97762625e
2019-01-28 14:04:10 -08:00
Hui Su 77279692e6 Refactor ml_predict_var_rd_paritioning()
Refactor out code about simple motion search.

Change-Id: Ie6895db2aff3c13e7a45554d6bc1c7c0af8f2d51
2019-01-28 13:46:30 -08:00
Angie Chiang 33a39c5a1a Add predict_mv_mode_arr()
The function predicts the mv_mode for each prediction block in
diagonal order.

Change-Id: I7dcfcdb317ffa334cb40bb435baa71b5db62252b
2019-01-28 12:18:52 -08:00
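A diagonal visiting order such as the one mentioned in the predict_mv_mode_arr() commit above can be sketched in C as follows. This is a generic illustration, not the libvpx code; `diagonal_rank` is a hypothetical function, and the choice to walk each anti-diagonal from the top row downward is an assumption. One common motive for such an order is that the cells above and to the left of any cell lie on earlier diagonals and are therefore processed first.

```c
/* Hypothetical helper: visit a rows x cols grid one anti-diagonal at a
 * time (cells with equal r + c), top row first within each diagonal, and
 * return the 0-based position at which (target_r, target_c) is visited,
 * or -1 if the target lies outside the grid. */
static int diagonal_rank(int rows, int cols, int target_r, int target_c) {
  int rank = 0;
  int d, r;
  for (d = 0; d < rows + cols - 1; ++d) {
    for (r = 0; r < rows; ++r) {
      const int c = d - r;
      if (c < 0 || c >= cols) continue;
      if (r == target_r && c == target_c) return rank;
      ++rank;
    }
  }
  return -1;
}
```

For a 2x3 grid the visiting order is (0,0), (0,1), (1,0), (0,2), (1,1), (1,2).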
Angie Chiang 9259bb370a Add predict_mv_mode()
This function evaluates the impact of setting NEW_MV_MODE on a
block and its neighbor blocks.

Change-Id: Ie0b2c67bdc5cd14e0efd8ebc5dc3f3f873bcf3fe
2019-01-28 12:11:22 -08:00
Chi Yo Tsai d4534704e0 Merge "Revert "Enable SSE4 version of apply temporal filter"" 2019-01-26 03:36:25 +00:00
Chi Yo Tsai a4d2f59b69 Revert "Enable SSE4 version of apply temporal filter"
This reverts commit 4f3cd48bfe.

Reason for revert: Found a mismatch with c version

Original change's description:
> Enable SSE4 version of apply temporal filter
> 
> Evaluating on 5 midres clips with 4 bitrates over 30 frames on speed 1
> auto_alt_ref=1, there is a speed up of 1.660%.
> 
> BUG=webm:1591
> 
> Change-Id: Idbda58548679e6f7b8fc0d7f6144f7be057ef690

TBR=yunqingwang@google.com,builds@webmproject.org,chiyotsai@google.com

Change-Id: Ibca973576d72d6db4b647a08aef23389d5d6605a
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: webm:1591
2019-01-26 01:30:20 +00:00
Sai Deng 2f82761b13 Merge "Fix a bug in tune-content film mode" 2019-01-25 18:51:23 +00:00
Angie Chiang d3f3f42216 Merge changes Ia1b3ec7e,I58b8c713,Ibeb43400
* changes:
  Add find_best_ref_mv_mode()
  Add get_mv_dist
  Add get_mv_from_mv_mode()
2019-01-25 17:54:13 +00:00
Chi Yo Tsai 763e664ef2 Merge "Enable SSE4 version of apply temporal filter" 2019-01-25 17:35:48 +00:00
Chi Yo Tsai e20eeb5484 Merge "Add SSE4 version of new apply_temporal_filter" 2019-01-25 17:35:34 +00:00
Paul Wilkins 0c2aef0759 Merge "Adjustment to noise factor in first pass." 2019-01-25 17:09:39 +00:00
Yunqing Wang c4305ba57d Adjustment to noise factor in first pass.
Adjustments to the calculation and use of a noise estimate in
the first pass Q estimate and adaptation of temporal filtering.

This change was tested and gave gains for both auto-alt-ref=1
and auto-alt-ref=6 as follows:

Results are Av PSNR, Overall PSNR, SSIM  and PSNR-HVS

auto-alt-ref=1
low_res 0.007, -0.042, -0.018, 0.074
mid_res -0.142, -0.239, -0.173, -0.129
hd_res -0.322, -0.405, -0.397, -0.367
NF_2K -0.058, -0.099, -0.201, 0.028

auto-alt-ref=6
low_res  -0.058, -0.171, -0.188, -0.027
mid_res -0.149, -0.155, -0.171, -0.137
hd_res -0.252, -0.339, -0.259, -0.297
NF_2K -0.015, -0.068, -0.120, 0.092

In all sets there were some winners and losers but significantly
more winners. The biggest change was Stockholm in the
hd set with an improvement of 5-6%

Change-Id: Ieec71e1c4e3e09b76c288efa7b4d1b00015b3a11
2019-01-25 16:00:35 +00:00
chiyotsai 4f3cd48bfe Enable SSE4 version of apply temporal filter
Evaluating on 5 midres clips with 4 bitrates over 30 frames on speed 1
auto_alt_ref=1, there is a speed up of 1.660%.

BUG=webm:1591

Change-Id: Idbda58548679e6f7b8fc0d7f6144f7be057ef690
2019-01-24 18:42:32 -08:00
chiyotsai b580f76dbc Add SSE4 version of new apply_temporal_filter
This adds a preliminary version of vp9_apply_temporal_filter in SSE4.1.
This patch merely adds the function and does not enable it yet.

Speed Up:
         | ss_x=1 | ss_x=0 |
  ss_y=1 | 19.80X | 19.04X |
  ss_y=0 | 21.09X | 20.21X |

BUG=webm:1591

Change-Id: If590f1ccf1d0c6c3b47410541d54f2ce37d8305b
2019-01-24 18:42:08 -08:00
Angie Chiang 9a49848245 Add find_best_ref_mv_mode()
This function computes the rd cost for each mv_mode and returns the
one with minimum rd cost.

eval_mv_mode()
Evaluate the rd cost for a given mv_mode.

Change-Id: Ia1b3ec7e1dd538e443e1bc79f2cab352408cd0a0
2019-01-24 17:55:28 -08:00
Angie Chiang ef3cb03d8b Add get_mv_dist
Given an mv_mode, get_mv_dist() obtains the mv and uses it
to compute distortion.

Change-Id: I58b8c7137b99c2736d651e678f0cd013dbd94877
2019-01-24 17:55:19 -08:00
Angie Chiang 0a922ffaf6 Merge changes I0fb46784,I6c89fae4
* changes:
  Add set_block_src_pred_buf()
  [cleanup] Move get_feature_score to a proper place
2019-01-25 01:47:43 +00:00
Johann 219ea62d00 add -Wmissing-declarations
This is useful for catching functions which should be static and
instances where the relevant rtcd file was not included.

BUG=webm:1584

Change-Id: Ied395847a664eedce59e8ed5180bd16d059ab0ac
2019-01-24 22:22:12 +00:00
Johann Koenig c03152d449 Merge "mips: resolve missing declarations" 2019-01-24 22:21:37 +00:00
Angie Chiang 4f36bea7e9 Add get_mv_from_mv_mode()
Given an mv_mode, this function will return the corresponding mv.

find_ref_mv()
A helper function that finds the nearest and near mvs from the neighbor
blocks.

select_mv_arr[]
An array used for storing selected motion vectors.

Change-Id: Ibeb434007f65b2c6e461360f208d99455e76bcbf
2019-01-24 10:14:59 -08:00
sdeng c3ceb45aae Fix a bug in tune-content film mode
Avoid recursively decreasing 'strength'.

       avg_psnr ovr_psnr ssim
midres -0.224   -0.195   -0.115

Change-Id: Ie74c069cda76873ac38e9c1a9162b1ddfb9b103d
2019-01-24 04:43:38 +00:00
Angie Chiang a047b31e2f Add set_block_src_pred_buf()
This function sets src and pre buffer of MACROBLOCK
and MACROBLOCKD.
Will add static decorator once this function is called.

Change-Id: I0fb46784dd97839e4d87c9e027fe8c59683e70d8
2019-01-23 16:52:56 -08:00
Angie Chiang 292f6789cb [cleanup] Move get_feature_score to a proper place
Add static decorator to it as well.

Change-Id: I6c89fae456561b6975ab49af139a45a7483507c6
2019-01-23 15:48:30 -08:00
Johann 44ce4e91e4 mips: resolve missing declarations
Exclude low bit depth optimizations from high bit depth builds.

BUG=webm:1584

Change-Id: I86a7ebafa557d262257358e1e055a06d52659977
2019-01-23 15:35:57 -08:00
chiyotsai e3210930f5 Fix a typo in the test cases for convolve test
BUG=webm:1591

Change-Id: I34aedcb5336a96e33932ce34967c12f187ee52e2
2019-01-23 15:15:30 -08:00
Chi Yo Tsai c9e2befce5 Merge "Clean up code for yuv_temporal filter_test.cc" 2019-01-23 17:43:44 +00:00
Jon Kunkee cc6d6a3b3f Fix Windows SDK and VS version checks
If WindowsTargetPlatformVersion is not set, the Visual Studio 15 (2017)
toolchain assumes that Windows 8.1 is being targeted. Since ARM64
support is only present and unlocked in Windows SDKs >= Windows 10 1809,
set that SDK as required in the vcxproj files.

Note that this will not be an issue in Visual Studio 16 or greater,
hence the -eq major version check.

https://developercommunity.visualstudio.com/content/problem/128836/windowstargetplatformversion-to-use-the-latest-ava.html

Bug: chromium:893460

Change-Id: Ib069501ad384d91349b1f635722dedd31a4edd97
2019-01-23 00:06:50 +00:00
chiyotsai 2082077001 Clean up code for yuv_temporal filter_test.cc
Some cosmetic changes to make the code google c++-style compliant.

BUG=webm:1591

Change-Id: Icef3ccc8ebed7210b6b6f915885d5f648e62da72
2019-01-22 14:31:15 -08:00
Johann Koenig 89046321d3 Merge "ads2gas: remove DO1STROUNDING" 2019-01-22 22:31:13 +00:00
Ritu Baldwa 5818014b69 Revert "Revert "Add Tile-SB-Row based Multi-threading in Decoder""
This reverts commit 06983668cf.
Fixes Visual Studio build errors introduced by earlier row mt commit

BUG=webm:1587

Change-Id: I792df86e8254cd6b2a511955b691af619a569cd0
2019-01-19 10:20:32 +05:30
chiyotsai f4b7004967 Change temporal filter's search_method on speed 1
This commit introduces a new speed feature that determines the
SEARCH_METHOD used by temporal filter when doing 16x16 block on
full_pixel_motion_search. On speed 0, the most exhaustive method MESH is
used. On speed 1 and above, a faster method NSTEP is used.

Performance:
        | AVG_PSNR | AVG_SPDUP | AVG_SPDUP:AVG_PSNR
 MIDRES |  0.007%  |   2.818%  |        402:1
  HDRES |  0.004%  |   4.897%  |       1224:1

In the case of midres, there is a small quality gain of -0.021% on
OVR_PSNR.

Performance measurement is done on speed 1 with auto_alt_ref=1.
Quality is measured on full midres set over 60 frames. Speed is measured
on 5 midres clips over 4 bitrates over 30 frames.

STATS_CHANGED

Change-Id: Ic1879d2237f8734529e194767a6cf5e43e20b47b
2019-01-18 18:05:59 -08:00
Chi Yo Tsai 5cbd333f3b Merge "Add unit speed test for vp9_apply_temporal_filter" 2019-01-18 23:13:01 +00:00
Angie Chiang 73cb14553d Merge changes Id99ca6fc,I34cdbc6e,Iac7fee46
* changes:
  Correct pyramid_mv_arr's memory size
  Adjust lambda with bsize in build_motion_field()
  Free pyramid_mv_arr properly
2019-01-18 23:05:51 +00:00
Yunqing Wang 77c1b9d259 Merge "Use longer test clips in y4m_test" 2019-01-18 01:20:11 +00:00
Yunqing Wang 51a0c40b2c Merge "Use longer videos in end-to-end tests" 2019-01-18 01:19:56 +00:00
Chi Yo Tsai 51de46479e Merge "Add unit test for temporal filter on VP9" 2019-01-18 00:34:37 +00:00
Johann dd8ccda37a ads2gas: remove DO1STROUNDING
Change-Id: Iacd1ad5673c71d350cad235e504da0e066dfc4a0
2019-01-17 10:06:21 -08:00
chiyotsai 826d6f3c3d Add unit speed test for vp9_apply_temporal_filter
This patch adds a unit speed test for vp9_apply_temporal_filter.

BUG=webm:1591

Change-Id: I4792dfc6ecd4a82775b9a895a90aafdc2a199f86
2019-01-17 01:05:35 +00:00
chiyotsai da5e2463fa Add unit test for temporal filter on VP9
The current unit tests for temporal filtering only test the
single-channel version of the temporal filter. Since VP9 currently uses
both luma and chroma channel information for temporal filtering at low
bitdepth, there is no unit test covering this scenario.

This commit adds some basic unit tests to facilitate further development
on temporal filtering.

BUG=webm:1591

Change-Id: Id38ceba5305865d7148e9b2bc636acddae54d6c2
2019-01-16 15:35:50 -08:00
Jerome Jiang 858fe955ae vp9: fix definition for VP9E_SET_POSTENCODE_DROP
(cherry picked from commit 24614dd9cb)

Change-Id: Ie763cf801107639ad11ad625408670d8d70b7628
2019-01-16 05:20:56 +00:00
Johann Koenig 9ecc0e779a Merge "mips highbd: resolve missing declarations" 2019-01-16 05:06:59 +00:00
Johann Koenig 6e318fa569 Merge "mips: add rtcd.h to resolve missing declarations" 2019-01-16 05:06:53 +00:00
Jerome Jiang aa465f2486 Merge "vp9: fix definition for VP9E_SET_POSTENCODE_DROP" 2019-01-16 00:16:15 +00:00
Jerome Jiang 24614dd9cb vp9: fix definition for VP9E_SET_POSTENCODE_DROP
Change-Id: I667be78eb7c41154bf44c242992f622f12c31b80
2019-01-15 15:02:39 -08:00
Marco Paniconi d8c3021d0b Merge "vp9-svc: Fix to buffer update under frame_drops" 2019-01-15 23:02:18 +00:00
Johann afa9ba79cb mips: add rtcd.h to resolve missing declarations
BUG=webm:1584

Change-Id: Ifdebf33356abcc6869f695d129165ba17e042dcd
2019-01-15 14:24:28 -08:00
Johann 9ca2ec2277 mips highbd: resolve missing declarations
BUG=webm:1584

Change-Id: I4cbfafe8ea72b3d4523aabcaed4848fa29bb19fe
2019-01-15 14:24:15 -08:00
Marco Paniconi 7e2d732b8b vp9-svc: Fix to buffer update under frame_drops
For svc with frame dropping in full_superframe_drop or
constrained drop mode: the buffer level for a given layer
may be capped from increasing too much. This is because that layer
may be dropped even though its buffer is stable (the drop is forced
due to underflow in other layers in full/constrained svc-drop mode).
This capping is needed to prevent a decrease in qp over consecutive
frame drops.

The capping already exists and has been used, but this change
introduced an error that prevented its usage:
https://chromium-review.googlesource.com/c/webm/libvpx/+/1330875

The fix here is to also cap the bits_off_target, since after
the change mentioned above, it's the bits_off_target that is used to
update the buffer on the next frame (which in turn affects qp for the
next frame/layer).

Change-Id: Ifdab5d478e91cce20ecec51faa574eed375ee36b
2019-01-15 14:09:15 -08:00
chiyotsai c182725cbc Remove unnecessary calculation in 4-tap interpolation filter
Reduces the number of rows calculated for 2D 4-tap interpolation filter
from h+7 rows to h+3 rows.
Also fixes a bug in the avx2 function for 4-tap filters where the last
row is computed incorrectly.

Performance:
           | Baseline |  Result  | Pct Gain |
bitdepth lo| 4.00 fps | 4.02 fps |   0.5%   |
bitdepth 10| 1.90 fps | 1.91 fps |   0.5%   |

The performance is evaluated on speed 1 on jets.y4m br 500 over 100
frames.

No BDBR loss is observed.

Change-Id: I90b0d4d697319b7bba599f03c5dc01abd85d13b1
2019-01-15 20:02:19 +00:00
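The row counts in the 4-tap commit above (h+7 versus h+3) follow from how a separable 2D filter works: the horizontal pass has to produce every row that the vertical pass will read. A minimal sketch of that arithmetic, assuming a vertical filter with `taps` coefficients needs `taps - 1` extra rows; the helper name is hypothetical and not taken from libvpx:

```c
#include <assert.h>

/* Hypothetical helper: number of intermediate rows the horizontal pass
 * must produce so that a vertical filter with `taps` coefficients can
 * output `h` rows. Each output row reads `taps` consecutive input rows,
 * so h outputs span h + taps - 1 inputs. */
static int intermediate_rows(int h, int taps) {
  assert(h > 0 && taps > 0);
  return h + taps - 1;
}
```

With an 8-tap vertical filter this gives h + 7 rows, with a 4-tap filter h + 3, so the 4-tap path can skip four rows of horizontal filtering per block.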
Johann Koenig 19882cdbf9 Merge "highbd_iadst16_neon: resolve missing declaration" 2019-01-15 18:19:31 +00:00
Marco Paniconi 3915f0616a Merge "vp9-svc: Rate control fix for key base layer" 2019-01-15 05:05:18 +00:00
Marco Paniconi 48d045057d vp9-svc: Rate control fix for key base layer
After encoding key frame on base spatial layer,
if the overshoot is significant, reset the
avg_frame_qindex[INTER] on base spatial layer for
all temporal layers.

This forces the active_worst_quality to increase
on subsequent frames/layers and reduces frame dropping.

Change-Id: I53a3cd14131d69120e59a649b7ed1bfde3e940ee
2019-01-14 19:49:42 -08:00
Jerome Jiang 4caea79ffa Merge "Fix typo: exhuastive" 2019-01-15 01:42:07 +00:00
Jerome Jiang ccef52944d Merge "clean up debug print." 2019-01-15 01:41:55 +00:00
Yunqing Wang 7cee31fd27 Use longer test clips in y4m_test
Used 20-frame clips to replace 10-frame clips in y4m_test. Also, removed
unused 10-frame clips.

Change-Id: Ib82ad2c3718f1f5f31478957b9ee970593536940
2019-01-14 17:12:54 -08:00
Yunqing Wang 5640bfbb5b Use longer videos in end-to-end tests
Used 20-frame clips obtained from Deb in end-to-end unit tests to improve
the test coverage.

TODO: remove 10-frame clips.

Change-Id: I06ec2d35f5c5c47263d3be61623c80f52fd18ffe
2019-01-14 16:59:48 -08:00
Jerome Jiang 100ee65613 clean up debug print.
printf -> assert(0 & ...)

Change-Id: I7bd6c0127ad816e8a5b555e86d54961b33da2bc4
2019-01-14 15:46:04 -08:00
Jerome Jiang 0f6dbd721f Fix typo: exhuastive
Change-Id: Ia00570a00b871eb1f929bd7e0af221d2c0b5ed21
2019-01-14 15:42:20 -08:00
Wan-Teh Chang ad479d9641 Merge "Change "ximage" to "vpx_image_t" in comments." 2019-01-14 22:53:08 +00:00
Deepa K G fa5083e8e2 Fix segmentation fault in hbd path
When CONFIG_VP9_HIGHBITDEPTH is enabled,
lowbd modules were called in the hbd path.
This patch fixes the issue.

(cherry picked from commit 797ec1cd66)

BUG=webm:1589

Change-Id: I1caf701514dbf80eb75b953f40b1e7238f265a2c
2019-01-14 22:28:50 +00:00
Wan-Teh Chang 902773f49c Merge "Reset buffer_alloc_sz after freeing buffer_alloc." 2019-01-14 22:03:50 +00:00
Wan-Teh Chang 50b38cf1aa Reset buffer_alloc_sz after freeing buffer_alloc.
ybf->buffer_alloc and ybf->buffer_alloc_sz should ideally be kept in
sync. If ybf->buffer_alloc is reset to NULL after being freed, then
ybf->buffer_alloc_sz should be reset to 0.

Change-Id: I7e7566b563ddf145d0e46050c5b6bd141084f8b3
2019-01-14 11:54:59 -08:00
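The invariant described in the buffer_alloc_sz commit above (pointer and size change together) can be sketched as follows. The struct and function names are hypothetical stand-ins for illustration, not the actual libvpx frame buffer type or API:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a frame buffer that owns one allocation. */
typedef struct {
  unsigned char *buffer_alloc;
  size_t buffer_alloc_sz;
} frame_buffer;

/* Allocate `size` bytes; the size field is only set on success. */
static int frame_buffer_alloc(frame_buffer *fb, size_t size) {
  fb->buffer_alloc = (unsigned char *)malloc(size);
  fb->buffer_alloc_sz = fb->buffer_alloc ? size : 0;
  return fb->buffer_alloc ? 0 : -1;
}

/* Free the allocation and reset both fields so they stay in sync. */
static void frame_buffer_free(frame_buffer *fb) {
  free(fb->buffer_alloc);
  fb->buffer_alloc = NULL;
  fb->buffer_alloc_sz = 0;
}
```

Resetting the size alongside the pointer means later code that compares a required size against buffer_alloc_sz cannot mistake a freed buffer for a usable one.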
Jerome Jiang 32bcc4ae16 Merge "Fix typo." 2019-01-14 19:19:08 +00:00
Wan-Teh Chang 6d3729ca1a Change "ximage" to "vpx_image_t" in comments.
In test/external_frame_buffer_test.cc, rename CheckXImageFrameBuffer()
to CheckImageFrameBuffer().

Change-Id: Ifea3910445673be465d7536a69f85f1a2e2bce6e
2019-01-14 11:13:43 -08:00
Jerome Jiang 26aecea758 Fix typo.
Blocking libvpx update into google3.

Change-Id: I18c29f0a68568e65ae5e0c7fcdb5097b08b586a6
2019-01-14 10:32:58 -08:00
James Zern 495282774e convolve_test: Add missing init of HBD buffers
this resolves some msan errors.
the same change was done in libaom:
5ab58722c Add missing initializations of HBD buffers

Change-Id: I8882af45b95c90ba43bf138c7d305a6c3b99e61c
2019-01-11 16:36:45 -08:00
Yunqing Wang 6b02a123bc Merge "Fix segmentation fault in hbd path" 2019-01-11 22:53:31 +00:00
Johann Koenig 5bd279db7c Merge "highbd idct: resolve missing declarations" 2019-01-11 19:39:34 +00:00
Deepa K G 797ec1cd66 Fix segmentation fault in hbd path
When CONFIG_VP9_HIGHBITDEPTH is enabled,
lowbd modules were called in the hbd path.
This patch fixes the issue.

Change-Id: I59820180fbed120697b6ef1fc1a02be0d35ac1d5
2019-01-12 00:08:41 +05:30
Angie Chiang 0437a885a8 Correct pyramid_mv_arr's memory size
Change-Id: Id99ca6fc846ebe11a9f5363da4e6449e976303a1
2019-01-10 22:44:01 -08:00
Angie Chiang 9525a59f21 Adjust lambda with bsize in build_motion_field()
Change-Id: I34cdbc6e8625c0de8595860af02ca277c3448a19
2019-01-10 19:35:10 -08:00
Jerome Jiang 759d1de9d0 vp8 dec: Add flag to bring up threads.
Instead of creating a new decoder instance when restarting all threads
after they were shut down, re-create threads on the new flag.

BUG=webm:1577

(cherry picked from commit 7be8d2df6c)

Change-Id: I80211d47e8d4beaa361416b58e99dd65d8da39c4
2019-01-10 22:01:43 +00:00
Jerome Jiang 7205a52c28 Merge "vp8 dec: Add flag to bring up threads." 2019-01-10 21:57:07 +00:00
Jerome Jiang 7be8d2df6c vp8 dec: Add flag to bring up threads.
Instead of creating a new decoder instance when restarting all threads
after they were shut down, re-create threads on the new flag.

BUG=webm:1577

Change-Id: I6272ecaa1b586afdaa5ed8d6eab80aff8f5eb673
2019-01-10 12:55:06 -08:00
Angie Chiang f33045f6b1 Free pyramid_mv_arr properly
Change-Id: Iac7fee461759599a7e167f8e6716ae3c6414a7d1
2019-01-10 05:55:08 -08:00
Johann 03005821bf highbd_iadst16_neon: resolve missing declaration
Only used in a local array. Similar to lowbd iadst16 naming.

BUG=webm:1584

Change-Id: Ie07c2fb9599fb54fab221e5c0ccec0e95d69b893
2019-01-09 15:35:48 -08:00
Johann 06d70c119a highbd idct: resolve missing declarations
BUG=webm:1584

Change-Id: I596f5f0e1a1c152493cd8177b32d416cc79937e0
2019-01-09 11:19:21 -08:00
Angie Chiang 4bb1cc06fd Merge changes Icec98e6f,I63614e65,I25ea05f4
* changes:
  Add full_pixel_exhaustive_new
  Add sse cost in vp9_full_pixel_diamond_new
  Use motion field for mv inconsistency in mv search
2019-01-09 16:11:04 +00:00
Johann Koenig 0087f5c74f Merge "ppc: resolve missing declarations" 2019-01-09 15:38:38 +00:00
Johann 744a1d8f4c ppc: resolve missing declarations
Add rtcd headers and make local functions static.

BUG=webm:1584

Change-Id: Ic19aec1dc90703b0b89d1092baee487d0fd0cb4e
2019-01-08 10:51:34 -08:00
Johann 52f318d4a4 vp8 arm loopfilter: resolve missing declarations
BUG=webm:1584

Change-Id: I3270e6efe79fe9728e8d11f4c352deefc3cea00b
2019-01-08 10:24:21 -08:00
Johann Koenig c1d523d860 Merge "vp8 idct: remove return" 2019-01-08 15:39:10 +00:00
Johann Koenig e7aa3bcd95 Merge "vp8_copy32xn: resolve missing declaration" 2019-01-08 05:23:24 +00:00
Johann df755d55b0 vp8 idct: remove return
Change-Id: Ib1648e1f6559e65ddf11cb54266c7eeff37a6ea6
2019-01-07 20:38:13 -08:00
Johann Koenig 16664b539e Merge "vp8 idct dequant: resolve missing declarations" 2019-01-08 04:36:45 +00:00
Johann Koenig ab32d419ea Merge "vp8 blend: resolve missing declarations" 2019-01-08 04:29:03 +00:00
Johann Koenig 8e397425c3 Merge "vp8 overlaps: resolve missing declaration" 2019-01-08 04:28:38 +00:00
Johann 38d98d870f vp8 idct dequant: resolve missing declarations
BUG=webm:1584

Change-Id: Iecd2a0154c523fa61349c456befdf6c37d980efc
2019-01-07 18:12:18 -08:00
Johann e93046a49c vp8_copy32xn: resolve missing declaration
BUG=webm:1584

Change-Id: I9898a6e2f977acd4e26b09222a1eb2ab4f37f0af
2019-01-07 16:34:04 -08:00
Johann 0aeaf29b18 vp8 overlaps: resolve missing declaration
BUG=webm:1584

Change-Id: I67fa7460cb90b9bbe8583b60340d7bbf615a11f2
2019-01-07 16:24:20 -08:00
Johann f485c67d0f vp9_get_blockiness: resolve missing declaration
BUG=webm:1584

Change-Id: I719c64734f4eae07def2d700006834a2420891a7
2019-01-07 16:21:06 -08:00
Johann 2e39962d71 vp8 blend: resolve missing declarations
Remove unused functions.

BUG=webm:1584

Change-Id: If7a49e920e12f7fca0541190b87e6dae510df05c
2019-01-07 16:10:01 -08:00
Johann Koenig 3476e40718 Merge "vp8 multi dimensional search: resolve missing declarations" 2019-01-07 23:56:07 +00:00
Johann Koenig bd8b783aa0 Merge "vp8_copy32xn: resolve missing declaration" 2019-01-07 23:54:39 +00:00
Johann Koenig 29c4fcfee0 Merge "vpx_filter: resolve missing declarations" 2019-01-07 23:26:54 +00:00
Johann Koenig 6fc3524682 Merge "vpx_clear_system_state: resolve missing declaration" 2019-01-07 23:26:43 +00:00
Johann Koenig 080bb149a8 Merge "vp9 intra pred test: resolve -Wuninitialized warning" 2019-01-07 23:26:32 +00:00
Johann Koenig 69872a76ba Merge "arm neon: resolve missing declarations" 2019-01-07 23:26:06 +00:00
Johann 8466728f81 vp8_copy32xn: resolve missing declaration
BUG=webm:1584

Change-Id: I8279e099fb9595edad858bf7332bf2b40fecae02
2019-01-07 14:58:27 -08:00
Johann 57d476bd8f arm neon: resolve missing declarations
BUG=webm:1584

Change-Id: I2dcf39f2327b72b58be72c27f952ea781a790dd3
2019-01-07 14:05:58 -08:00
Johann f479beeeba vpx_filter: resolve missing declarations
BUG=webm:1584

Change-Id: I1be768446b9304123da7b1ea0aed0db056db31c5
2019-01-07 12:34:53 -08:00
Johann 948e516969 vpx_clear_system_state: resolve missing declaration
BUG=webm:1584

Change-Id: I0770fc97055b98cdf9ff7bd7a93ae3a5e19b8180
2019-01-07 11:58:43 -08:00
Fyodor Kyslov 5a9f6420cd Merge "Fix OOB memory access on fuzzed data" 2019-01-07 19:30:21 +00:00
Johann 9cf2e851d9 vp9 intra pred test: resolve -Wuninitialized warning
BUG=webm:1584

Change-Id: I58505e04bd248697047d4957cebe495dada670a0
2019-01-07 11:27:53 -08:00
Johann 5aeaf43c42 vp8 multi dimensional search: resolve missing declarations
BUG=webm:1584

Change-Id: I5c3fb5ab00bff66a8e8f4b8d27cbcea4946eced0
2019-01-07 11:15:01 -08:00
kyslov 46e17f0cb4 Fix OOB memory access on fuzzed data
The vp8_norm table has 256 elements, while the index into it can be higher
on fuzzed data. Typecasting the index to unsigned char ensures a valid
range and triggers the proper error later. Also declare "shift" as
unsigned char to avoid a UB sanitizer warning.

BUG=b/122373286,b/122373822,b/122371119

Change-Id: I3cef1d07f107f061b1504976a405fa0865afe9f5
2019-01-07 10:35:34 -08:00
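The bounding technique from the OOB commit above can be sketched as follows. This assumes an 8-bit unsigned char (CHAR_BIT == 8), so the cast reduces any index modulo 256; the table here is a zero-filled stand-in and the function names are hypothetical, not the libvpx code:

```c
#include <assert.h>
#include <stddef.h>

#define NORM_TABLE_SIZE 256

/* Zero-filled stand-in for a 256-entry lookup table. */
static const unsigned char norm_table[NORM_TABLE_SIZE] = { 0 };

/* Converting to unsigned char is well defined for unsigned values: the
 * result is the value modulo 256 (with 8-bit chars), so it is always a
 * valid index into a 256-entry table. */
static size_t bounded_index(unsigned int range) {
  return (size_t)(unsigned char)range;
}

/* Table lookup that cannot read out of bounds, whatever `range` is. */
static unsigned char lookup_norm(unsigned int range) {
  return norm_table[bounded_index(range)];
}
```

The cast does not make corrupt input valid; it only keeps the read in bounds so that the stream can be rejected by a later check, as the commit message describes.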
Johann 6efdd9ad48 fix vp9 fdct_quant
Values in [q]coeff1 were not correctly stored. This caused a segfault
in the sse2 libvpx__nightly_optimization jobs.

Broken in:
commit 85032bac38
Author: Johann <johannkoenig@google.com>
Date:   Fri Dec 21 00:27:00 2018 +0000

    fdct_quant: resolve missing declarations

BUG=webm:1584

Change-Id: I5f5fad34ec5e32023f5b40ff3691125754c11ced
2019-01-07 09:53:18 -08:00
Urvang Joshi b625feb358 Merge "VP9 firstpass: Bugfix when mi_col_start/end is odd" 2019-01-04 23:03:57 +00:00
Urvang Joshi ad57c72b9f VP9 firstpass: Bugfix when mi_col_start/end is odd
Before this patch, if mi_col_end was odd, then the for loop for 'mb_col'
was looping once LESS than it should have been.

For example, if mi_col_end = 47, then the loop was terminating when
mb_col == 23. However, the correct behavior would be to terminate when
mb_col == 24.

The issue was introduced in:
https://chromium-review.googlesource.com/c/webm/libvpx/+/423279

This can lead to many of the stats being inaccurate, for such videos
(with mi_col_start/end having an odd value).

As an example:
Even for very static content, fp_acc_data->intercount can never reach the
same value as num_mbs. And in turn, pcnt_inter can never reach the value 1
(that is, 100%). This would lead to very static videos NOT being marked
static, and encoded like regular videos.

Note: this is just one possible effect based on observation. Other
issues are also possible based on other stats.

Improvement on some test clips:
-------------------------------
- One test clip saw a gain of -2.580% in VBR mode (and -3.153% in Q
mode). The reason for improvement: a wrongly detected scene cut was
avoided due to corrected value in 'this_frame->pcnt_inter'.
- Some very static clips correctly marked as having 100% zero motion.
This avoided addition of unnecessary alt-refs, thereby reducing the
bitrate.

BDRate (PSNR) on regular sets (VBR mode):
-----------------------------------------
lowres: 0.0
midres: -0.027 (some clips were better/worse, but I double checked that
changes were as expected, given correction in stats calculation).
hdres: 0.0

STATS_CHANGED for the types of videos described above.

Change-Id: Ifbc2c0c0815d23ec4015475680bdf8886f158dcc
2019-01-04 12:47:16 -08:00
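The off-by-one in the firstpass commit above can be reproduced with a small sketch. It assumes the loop bound was derived from the mode-info column by a truncating right shift and that the fix rounds up instead; the helper names are hypothetical and the real libvpx loop may be written differently:

```c
#include <assert.h>

/* Truncating conversion from 8x8 mode-info (mi) columns to 16x16
 * macroblock (mb) columns: drops the last, partially covered mb column
 * when mi_col_end is odd. */
static int mb_col_end_floor(int mi_col_end) { return mi_col_end >> 1; }

/* Rounding-up conversion: includes the partially covered mb column. */
static int mb_col_end_ceil(int mi_col_end) { return (mi_col_end + 1) >> 1; }

/* Number of iterations of a loop of the form
 * for (mb_col = start; mb_col < end; ++mb_col). */
static int count_mb_cols(int mb_col_start, int mb_col_end) {
  int n = 0;
  for (int mb_col = mb_col_start; mb_col < mb_col_end; ++mb_col) ++n;
  return n;
}
```

For mi_col_end = 47 the truncating bound is 23 and the rounded-up bound is 24, which matches the example in the commit message; for even values the two bounds agree, so only odd tile boundaries were affected.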
Angie Chiang fc165fbe00 Add full_pixel_exhaustive_new
Add full_pixel_exhaustive_new() and exhuastive_mesh_search_new().
The two functions are variants from full_pixel_exhaustive() and
exhuastive_mesh_search().

In the new versions, we use mv inconsistency in place of
mv entropy cost.

Change-Id: Icec98e6fae24f2771806a3e78276734624ec0303
2019-01-04 09:10:26 -08:00
Angie Chiang 66bbd53882 Add sse cost in vp9_full_pixel_diamond_new
Change-Id: I63614e652686557652985bde882889eea9ecbcad
2019-01-04 09:09:04 -08:00
Angie Chiang 14d91ac515 Use motion field for mv inconsistency in mv search
Change-Id: I25ea05f4bfe3c6f420e967c33763909c979a0d1b
2019-01-04 09:08:56 -08:00
Angie Chiang 3271a7ed6c Increase memory size in non-greedy-mv
The smallest block size of motion field is 4x4, but the mi_unit
size is 8x8, therefore the number of units of motion field is
"mi_rows * mi_cols * 4".

Change-Id: I95292904d757705d39b78af5d0cf2d25f376c642
2019-01-03 17:53:14 -08:00
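The factor of 4 in the non-greedy-mv commit above comes from each 8x8 mi unit containing (8/4) * (8/4) = 4 blocks of size 4x4. A minimal sketch of the sizing, with a hypothetical helper name that is not part of libvpx:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of 4x4 motion field units needed for a
 * frame measured in 8x8 mi units. */
static size_t motion_field_units(int mi_rows, int mi_cols) {
  const size_t units_per_mi = (8 / 4) * (8 / 4); /* 4 */
  assert(mi_rows >= 0 && mi_cols >= 0);
  return (size_t)mi_rows * (size_t)mi_cols * units_per_mi;
}
```

For example, a frame of 90 x 160 mi units (1280x720 pixels) needs 57600 motion field units rather than 14400.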
Angie Chiang 23f8b83177 Build pyramid motion field
Change-Id: I43fd61f7946a8a96d444dab5e94a9b01483ffab7
2019-01-03 17:53:14 -08:00
Jerome Jiang c4c5c1d7e4 vp9: psnr diff thres for single vs multi threading.
Change the threshold from 0.1 to 0.2.

BUG=webm:1588

Change-Id: I1ca20b360bcae66d09dc898c3266c9f5ac346561
2019-01-02 12:55:26 -08:00
Matt Oliver ed22f823e5 project: Fix linking errors due to DATA def. 2019-01-01 22:36:16 +11:00
Yunqing Wang 95ac0cc9f7 Adaptively choose block sizes in temporal filtering
Use variable block sizes in temporal filtering. Based on prediction
errors of 32x32 or 16x16 blocks, choose the block size adaptively.
This improves the coding performance, especially for HD resolutions.

Speed 1 borg test result:
        avg_psnr:  ovr_psnr:    ssim:
lowres:  -0.090     -0.075      -0.112
midres:  -0.120     -0.107      -0.168
hdres:   -0.506     -0.512      -0.547

Change-Id: I8f774e29ecb2e0dd372b32b60c32d8fa30c013a8
2018-12-27 11:02:17 -08:00
Johann Koenig 8010d20b01 Merge "fwd_dct32x32 avx2: resolve missing declarations" 2018-12-24 02:06:17 +00:00
James Zern e6d61ce49d Merge "Revert "Add Tile-SB-Row based Multi-threading in Decoder"" 2018-12-22 17:46:54 +00:00
James Zern 7ae34aeb1e test-data: add missing test data entries
invalid-bug-1443-v2.ivf{,.res}
invalid-vp80-00-comprehensive-s17661_r01-05_b6-.v2.ivf{,.res}

missed in:
6dbf738a4 vp8: kill all threads on corrupted frame.

Change-Id: I6481f4ad7544ecc069d0e0442888e97e9638fdd3
2018-12-22 09:45:15 -05:00
James Zern 06983668cf Revert "Add Tile-SB-Row based Multi-threading in Decoder"
This reverts commit 02b3ef7fae.

Reason for revert: fails to build under visual studio

Original change's description:
> Add Tile-SB-Row based Multi-threading in Decoder
>
> Add the multi-thread function that decodes a video row by row instead
> of a tile at a time. Create a job queue for queueing all parse and recon jobs.
> Each SB row of a tile is a job.
>
> Performance Improvement:
>
> Platform        Resolution      3 Threads       4 Threads
> ARM             720p            36.81%          18.37%
>                 1080p           32.27%          14.76%
>
> ARM Improvement measured on Nexus 6 Snapdragon 805 Quad-core  @ 2.65 GHz
>
> Change-Id: I3d4dd7a932fc2904c90d9546b2de99c809afd29e

BUG=webm:1587

Change-Id: Ia4c8f5128922a205cd9fd83aaef8a2e73764d4a7
2018-12-22 09:26:29 -05:00
Fyodor Kyslov 6aa308abfb Merge "Bound the total allocated memory of frame buffer" 2018-12-21 23:24:29 +00:00
Johann Koenig 313ca263fa Merge "fwd_dct32x32 sse2: resolve missing declarations" 2018-12-21 22:02:49 +00:00
kyslov e6d2dc12ad Bound the total allocated memory of frame buffer
This CL makes it possible to limit the memory consumption of the frame
buffer pool. As a result, if compiled with VPX_MAX_ALLOCABLE_MEMORY set,
the codec will fail if the frame resolution requires more memory.
This is a backport of CL aae2183cb58b60d01b8e4e15269ee9f48dd72908 from
aomedia.

Tested:
configure --extra-cflags="-DVPX_MAX_ALLOCABLE_MEMORY=536870912"
make
./test_libvpx

BUG=webm:1579

Change-Id: Ic62213b600a7562917d5a339a344ad8db4b6f481
2018-12-21 14:00:02 -08:00
Johann Koenig 9b45ae55f0 Merge "vp9_decodeframe.c: resolve missing declarations" 2018-12-21 21:46:00 +00:00
Johann Koenig e59d0b43e5 Merge "vp9_highbd_block_error_sse2: resolve missing declarations" 2018-12-21 21:40:49 +00:00
Johann Koenig edef57efe9 Merge "convolve avx2: resolve missing declarations" 2018-12-21 21:37:53 +00:00
Elliott Karpilovsky ba4dc85441 Merge "Improve accuracy of benchmarking" 2018-12-21 21:31:49 +00:00
Johann 4d41adacaa fwd_dct32x32 avx2: resolve missing declarations
BUG=webm:1584

Change-Id: Iaba854952534a95e710a985acfcab46e093872c2
2018-12-21 12:44:49 -08:00
Johann 7f90729f87 fwd_dct32x32 sse2: resolve missing declarations
BUG=webm:1584

Change-Id: Ia2d9fcbccbad0c2142a3759e610670b86af0fef4
2018-12-21 12:24:09 -08:00
Johann bf86ee7317 vp9_highbd_block_error_sse2: resolve missing declarations
BUG=webm:1584

Change-Id: I43d051c538bf4a6f6210eefa398dc0901ab8d157
2018-12-21 11:59:07 -08:00
Johann 967a3a84e6 vp9_decodeframe.c: resolve missing declarations
BUG=webm:1584

Change-Id: Ie0d26b745ab1f5907a6a2dc10fbc5083f3fb0b8d
2018-12-21 11:54:21 -08:00
Johann 90f0954689 convolve avx2: resolve missing declarations
BUG=webm:1584

Change-Id: I5990c0100af83d13f7a4800147473bc997f5e5d1
2018-12-21 11:50:30 -08:00
Johann Koenig 1cb039529d Merge "subpixel_8t sse2: resolve missing declarations" 2018-12-21 19:30:04 +00:00
Johann Koenig 2f7c4d276a Merge "vpx{dec,enc}: resolve missing declarations" 2018-12-21 19:29:38 +00:00
Johann Koenig 170a4bb4b8 Merge changes I48b9a9cd,I92504ed4
* changes:
  subpixel_8t ssse3: resolve missing declarations
  subpixel_8t avx2: resolve missing declarations
2018-12-21 19:29:25 +00:00
elliottk a81768561c Improve accuracy of benchmarking
For small code regions, readtsc can give inaccurate results because it does
not account for out-of-order execution. Add x86_tsc_start and x86_tsc_end
that account for this, according to the white paper at

https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-32-ia-64-benchmark-code-execution-paper.pdf

Using x86_tsc_start/end will also add in several more instructions; I imagine
this is negligible.

Change-Id: I54a1c8fa7977c34bf91b422369c96f036c93a08a
2018-12-21 11:28:37 -08:00
Johann 7779e9c911 subpixel_8t ssse3: resolve missing declarations
BUG=webm:1584

Change-Id: I48b9a9cdcfe52536f685c41fb2d3c0f3e9192d34
2018-12-21 09:23:14 -08:00
Yunqing Wang fea1bd10a4 Merge "Refactor temporal filtering" 2018-12-21 17:15:23 +00:00
Johann c67a2e76a1 subpixel_8t sse2: resolve missing declarations
vpx_asm_stubs.c only references these sse2 functions. Combine the files
similar to the way the ssse3/avx2 files are set up.

Mark the intrinsics as static because they are only used within the
macros here. It is unfortunate that the assembly functions can not be
marked static as well.

BUG=webm:1584

Change-Id: I342687a1046ae6ca46ae58644a7c170440de1dfb
2018-12-21 08:35:10 -08:00
Jerome Jiang 0c806817c2 Merge "vp8: kill all threads on corrupted frame." 2018-12-21 16:24:09 +00:00
Johann 7192aec277 subpixel_8t avx2: resolve missing declarations
BUG=webm:1584

Change-Id: I92504ed4a2e54129c981b7380249962afb7966df
2018-12-21 07:51:47 -08:00
Johann 7badbcc421 vpx{dec,enc}: resolve missing declarations
BUG=webm:1584

Change-Id: I81e53e579e6fd22b7b21f432256abbe91bf77b15
2018-12-21 15:50:36 +00:00
Johann Koenig d12f5d3ef7 Merge "highbd quantize: resolve missing declarations" 2018-12-21 14:49:38 +00:00
Johann Koenig 0741ef2ec0 Merge "fdct_quant: resolve missing declarations" 2018-12-21 14:48:07 +00:00
Johann Koenig 1ecc66f26c Merge "highbd variance: resolve missing declarations" 2018-12-21 14:43:50 +00:00
Jerome Jiang 6dbf738a43 vp8: kill all threads on corrupted frame.
If decoder keeps going, threads will be brought up.

BUG=902650,webm:1577

Change-Id: I7765ba134aeed76ec0f58bd05e3a35383e6861c3
2018-12-21 07:20:47 +00:00
Harish Mahendrakar 4bab0eca07 Merge "Add Tile-SB-Row based Multi-threading in Decoder" 2018-12-21 05:48:12 +00:00
James Zern 3e852ff6ee Merge "vpx/{vp8,vpx_encoder}.h: fix some typos" 2018-12-21 04:00:21 +00:00
James Zern d33705d13a Merge "vp9: limit lpf workers to min(threads,tiles,sb_rows)" 2018-12-21 03:59:56 +00:00
James Zern 37e7c8a7a8 Merge "vpx_integer.h: remove VPX_EMULATE_INTTYPES" 2018-12-21 03:59:14 +00:00
Johann Koenig f375eefde1 Merge "svc examples: resolve missing declarations" 2018-12-21 01:14:17 +00:00
Johann 85032bac38 fdct_quant: resolve missing declarations
Store outputs using store_tran_low()

BUG=webm:1584

Change-Id: I213abe047e14625c5ef80df7fa6fdc2a31e40fb6
2018-12-21 00:27:03 +00:00
James Zern aecddea4f9 vpx_integer.h: remove VPX_EMULATE_INTTYPES
platforms supported by the library all offer stdint.h

BUG=webm:1573

Change-Id: I2ad95dfbcfc2d1890c1b7e503340fda8a9849635
2018-12-20 16:07:41 -08:00
Johann 17c6f62628 svc examples: resolve missing declarations
BUG=webm:1584

Change-Id: Icb7ba5bb5a6d460c4d0419b76ee54af461ca4a52
2018-12-21 00:03:15 +00:00
Johann c10f9adbc0 highbd quantize: resolve missing declarations
BUG=webm:1584

Change-Id: Ia3f152bf2a37f8a1ea4178eeb1a6a262ea034a8d
2018-12-20 23:57:52 +00:00
Johann Koenig 1895855641 Merge "tiny_ssim.c: resolve missing declarations" 2018-12-20 23:12:17 +00:00
Johann 22c2bd3e06 highbd variance: resolve missing declarations
The optimizations were accidentally disabled during the move from vp9

commit  c3bdffb0a5
author  Johann <johannkoenig@google.com>        Fri May 15 18:52:03 2015
Move variance functions to vpx_dsp

subpel functions will be moved in another patch.

BUG=webm:1584

Change-Id: Ia7899ee0cfad13a0e1516b89756552064846e81c
2018-12-20 20:06:43 +00:00
Yunqing Wang 3b013f7ff8 Refactor temporal filtering
Refactored temporal filtering, so that it was not hard-coded to
16x16 block size.

Change-Id: I06d0787660ff6eee6a8f02a846ad0e26c6825f54
2018-12-20 20:02:49 +00:00
Johann Koenig 61fcdc3821 Merge "vp9/encoder: resolve missing declarations" 2018-12-20 19:42:58 +00:00
Johann 850e014a8e tiny_ssim.c: resolve missing declarations
-Wmissing-declarations exposed several unused functions.

BUG=webm:1584

Change-Id: I88dfeb8ffa31253a0fb7674f6fe5fcd496179f96
2018-12-20 17:10:44 +00:00
Johann Koenig 096fc82b33 vp9/encoder: resolve missing declarations
Mark local functions as 'static.' Found with -Wmissing-declarations

BUG=webm:1584

Change-Id: Icbdb0ceca3dbf3005ca29bfda05d533d241577d0
2018-12-20 17:10:34 +00:00
Yaowu Xu 03a5457671 Merge "Remove a special case" 2018-12-20 16:54:08 +00:00
James Zern 3035d5c547 vp9: limit lpf workers to min(threads,tiles,sb_rows)
this implementation does not scale well beyond that. this restores the
performance in v1.7.0.

BUG=webm:1574

Change-Id: I8f3464cfe871988fa06ebefe9954811fd002584e
2018-12-20 00:22:56 -08:00
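The worker limit named in the lpf commit title above, min(threads, tiles, sb_rows), can be sketched as a three-way minimum. The function names are hypothetical and this is not the libvpx implementation:

```c
#include <assert.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* Hypothetical helper: cap the number of loop filter workers at the
 * smallest of the available threads, the tile count, and the number of
 * superblock rows. */
static int num_lpf_workers(int threads, int tiles, int sb_rows) {
  return min_int(threads, min_int(tiles, sb_rows));
}
```

With 8 threads, 4 tiles, and 17 superblock rows this yields 4 workers, so extra threads beyond the tile count are not started for loop filtering.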
Jingning Han 640b3195cb Merge "Unify AQ mode rdmult update interface" 2018-12-20 07:07:32 +00:00
Jingning Han f1f4f88305 Merge "Add control interface to PSNR_AQ mode" 2018-12-20 07:07:23 +00:00
James Bankoski 851554a055 Merge "vpxenc : fix misleading documentation about sharpness." 2018-12-20 01:28:41 +00:00
Yaowu Xu 7a19474d5d Remove a special case
The special case was put in to prevent a lossless test failure, the
issue has been dealt with by a recent fix of skip condition in
lossless mode.

Change-Id: Ia25d2bf6beead2208841b4f012171dffac15f411
2018-12-20 01:14:07 +00:00
Jingning Han 4a5c340a9c Merge "Refactor aq mode segment_id assignment" 2018-12-20 00:18:20 +00:00
Jim Bankoski 57be693521 vpxenc : fix misleading documentation about sharpness.
Change-Id: I792c178736a9fc02a84aa83f351e12b7227259b0
2018-12-19 15:52:23 -08:00
James Zern 1b8788e6af Merge "vpx/*.h: rm some deprecated defines/enum vals/typedefs" 2018-12-19 23:46:51 +00:00
James Zern 07ec384795 Merge "vpx/vp8cx.h: fix some typos" 2018-12-19 23:45:50 +00:00
James Zern c0662b4556 Merge "vpx_integer.h: drop VS2010 workaround" 2018-12-19 23:45:22 +00:00
James Zern 413763fe18 vpx_integer.h: drop VS2010 workaround
visual studio 2015 is the current minimum

BUG=webm:1573

Change-Id: I22139925c0a322b1da214c38d8f74fadbc34d2de
2018-12-19 13:01:28 -08:00
James Zern 34a8467e9c vpx/{vp8,vpx_encoder}.h: fix some typos
BUG=webm:1573

Change-Id: I5cbb29c89955aa1548ea2a2b3da5763bd38dd978
2018-12-19 13:00:59 -08:00
Yaowu Xu c17d6997cd Merge "Correct condition for skip" 2018-12-19 20:57:24 +00:00
Jingning Han 53ca139537 Unify AQ mode rdmult update interface
Handle the rdmult update for all AQ modes in a single function
call.

Change-Id: Ia0dfce637cf70d646bd3cd0abe3064e9491b81b8
2018-12-19 12:55:54 -08:00
Jingning Han ee1fbb91af Add control interface to PSNR_AQ mode
Change-Id: I760c69189fb8d8d85b5daffc86064c66913c0220
2018-12-19 12:53:11 -08:00
James Zern 99e65258cb vpx/vp8cx.h: fix some typos
BUG=webm:1573

Change-Id: I46faa216a4a8278a363a8111237342f73e8467eb
2018-12-19 12:27:53 -08:00
James Zern 9667a42475 vpx/*.h: rm some deprecated defines/enum vals/typedefs
most predate 1.4.0; the DBG enums were deprecated in 1.6.1. VPX_KF_FIXED
is left as it's still fairly widely used.

BUG=webm:1573

Change-Id: Iacaad28a6fe7251f042a2b45507b00fc5b7a0eac
2018-12-19 12:26:56 -08:00
Yaowu Xu 632ee6aa76 Correct condition for skip
Do not skip without check when lossless is requested.

Change-Id: Iceda428e7bf5ab19202b1dcb598e389fcaf6978d
2018-12-19 10:48:37 -08:00
Jingning Han 26b9d08591 Merge "Rework set_offsets() for rd search" 2018-12-19 18:46:53 +00:00
Paul Wilkins 5a7bb4e20c Merge "Improve rd_variance_adjustment() for low variance blocks." 2018-12-19 09:47:27 +00:00
Jingning Han 535e4039cd Refactor aq mode segment_id assignment
Factor out the segment_id assignment for various AQ modes.

Change-Id: I34a86524048621cd369baf4bafbdfac621994563
2018-12-18 22:43:34 -08:00
sdeng d931eb556b No need to shift in SSIM calculations
We only need to shift in the encoder when the input bit depth
does not equal the encoder's internal bit depth.

Change-Id: If9af62382ac6824f33dc7dcdd3d3ff7802b92e9a
2018-12-18 16:46:29 -08:00
Sai Deng ac858f0a78 Merge "Disallow the comparison between videos with different bit depth" 2018-12-19 00:40:49 +00:00
Jingning Han 181dd52a23 Rework set_offsets() for rd search
Factor out the segment_id setup from mi array alignment.

Change-Id: I345ad7ea7b6c9edb6f86224e1941f2c954d68ff3
2018-12-18 15:15:47 -08:00
Marco Paniconi 57f7c6f191 vp9-svc: Adjust step_param for screen-content
Use same step_param for all spatial layers for now.
Some improvement in quality on scrolling for spatial
enhancement layer.

Change-Id: Ic9eed8ba5dd44493e9f5e81f6115df2a25825d16
2018-12-18 11:01:33 -08:00
Jingning Han 3756a3b658 Merge "Localize x->encode_breakout setup to non-rd mode search" 2018-12-18 17:46:05 +00:00
Jingning Han 98fc585cba Merge "Add frame header control to turn on PSNR_AQ mode" 2018-12-18 15:49:48 +00:00
Jingning Han 2057c4e495 Merge "Add PSNR_AQ mode" 2018-12-18 15:49:36 +00:00
Ritu Baldwa 02b3ef7fae Add Tile-SB-Row based Multi-threading in Decoder
Add the multi-thread function that decodes a video row by row instead
of a tile at a time. Create a job queue for queueing all parse and recon jobs.
Each SB row of a tile is a job.

Performance Improvement:

Platform        Resolution      3 Threads       4 Threads
ARM             720p            36.81%          18.37%
                1080p           32.27%          14.76%

ARM Improvement measured on Nexus 6 Snapdragon 805 Quad-core  @ 2.65 GHz

Change-Id: I3d4dd7a932fc2904c90d9546b2de99c809afd29e
2018-12-18 17:39:38 +05:30
Jingning Han bd26b8aa71 Merge "Relocate tpl buffer allocation" 2018-12-18 06:34:49 +00:00
Jingning Han e0d406586a Relocate tpl buffer allocation
Move it to deeper stages where all the encoder configurations have
been set. This avoids the encoding failure when the buffer is
allocated before the encoder is fully configured.

Change-Id: I6723966fd2c7c36fbab9a92d1f3bd59c83ed95f0
2018-12-17 21:01:13 -08:00
Marco Paniconi 368200a807 vp9-svc: Fix condition in real-time speed setting
Remove the "spatial_layer_id == 0" condition in
the speed features for setting the motion search
for screen content.

Change-Id: Ib47aea3af5f3b2e04226694b4126b2ae2f458f13
2018-12-17 18:41:30 -08:00
Jingning Han 181aa3d7fd Localize x->encode_breakout setup to non-rd mode search
The breakout speed feature is currently only used by the non-rd
mode search path. Localize it to simplify set_offset() logic.

Change-Id: I27e7519c987a7caac2e4bd6be0ede1b9c8320e55
2018-12-17 16:50:15 -08:00
Jingning Han 7ec4a4b3eb Add frame header control to turn on PSNR_AQ mode
Change-Id: I46f695b15153c8c508f525a5673db24326371977
2018-12-17 15:34:59 -08:00
Jingning Han 1358fdf2cb Add PSNR_AQ mode
Placeholder to support adaptive quantizer for PSNR and SSIM coding
quality improvement.

Change-Id: Id967c9914bb1d72a6f480ef1ba9d6650914dd658
2018-12-17 15:33:51 -08:00
sdeng 9d8122dd72 Disallow the comparison between videos with different bit depth
Change-Id: I1fd8e991f2440925e989d8e7ab33fdf5f6b1d36b
2018-12-17 15:25:37 -08:00
Marco Paniconi db41138a09 vp9-svc: Adjust search step param for spatial layers
For non-base spatial layer in screen-content mode:
use nstep but with larger step_param value than sl0,
to avoid increase in encode_time.
Some improvement on scrolling slides content.

Change-Id: Ica918ac01664431d1fabb3c674d857cf6ad87414
2018-12-17 15:24:59 -08:00
Marco Paniconi 581eed2bc0 Merge "vp9-svc: Define rc scene change flag per superframe" 2018-12-17 23:09:22 +00:00
Johann Koenig b5b1a77530 Merge "doxygen: fix --disable-examples" 2018-12-17 23:07:38 +00:00
Jerome Jiang 2504c94848 Merge "Remove -Wextra suppression." 2018-12-17 23:03:04 +00:00
Marco Paniconi e3e770dbf7 vp9-svc: Define rc scene change flag per superframe
Define the rc->high_num_blocks_with_motion, set in the
scene change analysis, to be defined per superframe.
This is used for increasing motion search area on
some (super)frames, e.g., for scrolling.

Also some code cleanup in rt_speed_feature_.

No change in behavior.

Change-Id: I1a5c04b9cd4aef1723ce42f82e981a2ca15c8b9d
2018-12-17 13:42:05 -08:00
Jerome Jiang 967042c929 Remove -Wextra suppression.
BUG=webm:1246

Change-Id: Iae78e266faa9c4989500fc919b24f2f584ac0550
2018-12-17 13:40:06 -08:00
Angie Chiang c88a39bfee Merge "Add build_motion_field()" 2018-12-17 19:19:21 +00:00
Jerome Jiang d8f89c49e1 Merge "vp8: Fix potential use-after-free in mfqe." 2018-12-15 01:00:46 +00:00
Sai Deng eb930cf3f7 Merge "Remove unused code in tiny_ssim" 2018-12-15 00:50:20 +00:00
Jerome Jiang 0e408ea67c vp8: Fix potential use-after-free in mfqe.
Similar issue to 842265.

The pointer in vp8 postproc refers to show_frame_mi which is only
updated on show frame. However, when there is a no-show frame which also
changes the size (thus new frame buffers allocated), show_frame_mi is
not updated with new frame buffer memory.

Change the pointer in postproc to mi which is always updated.

BUG=913246

Change-Id: I5159ba7134a06db472c29a1d84b8d39bb60c7254
2018-12-14 15:00:29 -08:00
sdeng de38e2c36f Remove unused code in tiny_ssim
Change-Id: Ife6eb3f8651daa209eeeb8eff85158f00d418647
2018-12-14 08:47:46 -08:00
Marco Paniconi 18d260d13f vp8-mfqe: Increase initial frame# threshold
Increase the initial frame number threshold
for the mfqe, as using the running average of
last_base_qindex doesn't work well after very
first frame.

Only affects the very first few frames.
Fixes an issue with a test.

Change-Id: Ia249924257b44263e0b9f43cbff473902f08e28c
2018-12-13 19:38:29 -08:00
Marco Paniconi 15389ab11d vp9-svc: On scene change: only reset TL in flexible mode.
On scene/slide change detected on TL > 0 frame, only
reset the temporal layer pattern for flexible/bypass mode.

Change-Id: Ib848778addc10ef6981b92839af397833fd4a908
2018-12-13 15:43:33 -08:00
Johann 52e1465e99 doxygen: fix --disable-examples
Only include the sample code link when they are built.

BUG=webm:1565

Change-Id: If13126b59953b51a76c964da4a8c58eb367f2dd7
2018-12-13 13:04:09 -08:00
Jingning Han 890c8a15d1 Merge "Make the use of tpl model controlled by the encoder params" 2018-12-13 18:13:42 +00:00
Jingning Han 97875fa840 Make the use of tpl model controlled by the encoder params
The control has been exposed to the vpxenc input parameter. Remove
the internal hard coded control that disables it at speed 1 and
above settings.

Change-Id: Ib17772cb895f24da5a7d0487e748cc1a9c6740b3
2018-12-13 09:21:28 -08:00
James Zern c62d9d568f Merge "update libwebm to libwebm-1.0.0.27-352-g6ab9fcf" 2018-12-13 00:03:03 +00:00
James Zern f00890eecd update libwebm to libwebm-1.0.0.27-352-g6ab9fcf
https://chromium.googlesource.com/webm/libwebm/+log/af81f26..6ab9fcf

Change-Id: I9d56e1fbaba9b96404b4fbabefddc1a85b79c25d
2018-12-12 14:43:55 -08:00
Angie Chiang 4063d8053a Merge "Replace mv_arr by pyramid_mv_arr" 2018-12-12 22:43:44 +00:00
Angie Chiang 198dbeaf03 Add build_motion_field()
Move the related code into the function.
This is to facilitate building the pyramid motion field.

Change-Id: I879db2271e227af63c5eac76b0c70c985b86a2da
2018-12-12 12:37:49 -08:00
Angie Chiang 4f8eda3603 Replace mv_arr by pyramid_mv_arr
We plan to compute mv field in different scale.

Change-Id: I49a92d948f8b5dbab78e38c61f5f4f879bbe269f
2018-12-12 11:34:51 -08:00
Angie Chiang 4c2cfc19cf Merge changes I44da4884,I36e3bcae
* changes:
  Change interface of motion_compensated_prediction
  Move prepare_nb_full_mvs to vp9_mcomp.c
2018-12-12 19:25:53 +00:00
Marco Paniconi 763f8318de vp8: Fix to enabling MFQE
Remove the unused *_DEBUG_* enum values in vpx/vp8.h

This fixes issue with enabling MFQE, which was
caused in 4807f15, where the unused DEBUG flags
were removed from common/ppflags.h but not in vp8.h.

BUG=913246

Change-Id: I47f114ef20adc084cb4883add5ac3ebf58ae9f1d
2018-12-11 22:03:50 -08:00
Deepa K G e9d2f44d12 Merge "Rescale arf bit budget calculation" 2018-12-12 04:47:56 +00:00
Deepa K G 4c79c3b922 Merge "Use undamped adjustment for rate correction factors" 2018-12-12 04:47:30 +00:00
James Zern d84c2ddbb2 Merge "test/svc_end_to_end_test: fix SetConfig() signature" 2018-12-12 01:59:42 +00:00
Angie Chiang dd606c1595 Change interface of motion_compensated_prediction
Change-Id: I44da4884eea26f0feb7b17f4100db7e5bddd14b4
2018-12-11 16:19:50 -08:00
Angie Chiang f835e9f693 Move prepare_nb_full_mvs to vp9_mcomp.c
Change-Id: I36e3bcae60751a9caeac03a3c94cb752b73a010b
2018-12-11 14:05:12 -08:00
Jerome Jiang 69d75c55cf Merge "Refactor svc_*_test.cc" 2018-12-11 18:39:41 +00:00
Paul Wilkins cfb1f2e93d Merge "Fix intra_count_low calculation in first pass" 2018-12-11 13:33:56 +00:00
Deepa K G e0454d3689 Use undamped adjustment for rate correction factors
Undamped adjustment is used for the first frame
of each frame type while updating the rate
correction factors.

Change-Id: I42f80daa123c4cd4e45c18c6960cc7a67e7df7e6
2018-12-11 17:38:06 +05:30
James Zern 7fb77b33ef test/svc_end_to_end_test: fix SetConfig() signature
make the parameter constant to match the base class and mark the
function virtual. virtual is used to match the rest of the code base,
but now that c++11 is required all such functions could be changed to
override.

since:
bb3a82ec3 vp9 svc: add test for scaling partition on 1080p crash.

Change-Id: I4717f0116a231ea954b34da9cfec69c462c21699
2018-12-10 23:01:01 -08:00
Jingning Han 16282c36f3 Merge "Clean up condition logics in rc_pick_q_and_bounds_two_pass()" 2018-12-11 06:42:18 +00:00
Jingning Han 761609bff4 Clean up condition logics in rc_pick_q_and_bounds_two_pass()
Factor out common conditions for better readability.

Change-Id: I2a2b576e7d3e5cf036e9e355fc7ce0509ecb3d7e
2018-12-10 16:21:11 -08:00
Hui Su 842afb26f2 Merge "Remove redundant code about motion vector test" 2018-12-11 00:14:32 +00:00
Jerome Jiang 36f523e213 Refactor svc_*_test.cc
Put test classes into svc_test namespace.
Make num_nonref_frames_ and mismatched_nframes private, as they're
computed by encoder/decoder hooks which shouldn't be modified outside
the class.
Add accessor to num_nonref_frames_.

Change-Id: I3836a45426796ba6a8c98dd31e21b5aec4b8abf4
2018-12-10 12:53:20 -08:00
James Zern d9872c5a5a Merge "test/svc_*_test: fix SetConfig() signature" 2018-12-10 19:49:14 +00:00
Hui Su 68b3459819 Remove redundant code about motion vector test
Only need to set find_fractional_mv_step once.

Change-Id: Ib59dd1e3bb8bc973f2e0f3fc436738bfaf2fad81
2018-12-10 10:33:49 -08:00
Hui Su c172de2f28 Merge "Add enum definition for subpel search precision" 2018-12-10 18:17:57 +00:00
Johann Koenig 1db36f4d38 Merge "apply -Wextra to third_party/" 2018-12-10 16:47:03 +00:00
Jerome Jiang 132c0af352 Merge "vp9 screen: Update motion search offset when set to NSTEP." 2018-12-10 01:44:43 +00:00
Jerome Jiang f7ea14c80a Merge "vp9 svc: add test for scaling partition on 1080p crash." 2018-12-10 01:44:20 +00:00
James Zern 70391e8fd6 test/svc_*_test: fix SetConfig() signature
make the parameter constant to match the base class and mark the
function virtual. virtual is used to match the rest of the code base,
but now that c++11 is required all such functions could be changed to
override.

Change-Id: I551a05bbd9d05a9eddb653f42eaad68880c88141
2018-12-09 07:17:31 +00:00
Sai Deng 0b7e4af744 Merge "Add satd avx2 implementation" 2018-12-08 18:43:55 +00:00
Jerome Jiang bb3a82ec36 vp9 svc: add test for scaling partition on 1080p crash.
BUG=webm:1578
Change-Id: Ie03ed454394933fa89f751edc6928651393f3f12
2018-12-07 21:38:30 -08:00
Jerome Jiang 715c30f034 vp9 screen: Update motion search offset when set to NSTEP.
Search method and step parameter might be changed in speed settings.
In this case, we should update the search area offset due to the change
of search method.

Change-Id: I51dc584bbf35e998757da326355dd4b8a4d0093f
2018-12-07 21:33:49 -08:00
James Zern 673ebe8d2b Merge "test/*: use std::*tuple" 2018-12-08 04:10:38 +00:00
James Zern 8f03f719af test/*: use std::*tuple
since:
77fa51003 Replace deprecated scoped_ptr with unique_ptr

c++11 has been required so <tuple> is safe to use

Change-Id: I873cb953104b361a8503b5839a3372ce2b99e73c
2018-12-07 17:55:21 -08:00
Hui Su b23a05422e Add enum definition for subpel search precision
To improve readability.

Change-Id: Idc08b2068c7d8ba9dadc0d559a3b4d61c2a88c94
2018-12-07 15:48:11 -08:00
Yaowu Xu 3338c878a7 Merge "Optimize RDMult" 2018-12-07 22:53:43 +00:00
Angie Chiang a46bb83e0c Merge changes Id10f72b3,Icde1b01e,I391aa322
* changes:
  Implement find_prev_nb_full_mvs
  Implement get_full_mv()
  Pass mv_num into vp9_nb_mvs_inconsistency()
2018-12-07 22:27:11 +00:00
Johann 381eb3c799 apply -Wextra to third_party/
googletest builds cleanly with -Wextra

Remove comments about webm:1069. The vp8 issue is tracked in webm:1246.

Change-Id: I8bbb01d34503cc9c342f5c3aa78e9476f72b94c2
2018-12-07 09:28:02 -08:00
sdeng 64c4cedd3a Add high bit Hadamard 32x32 avx2 implementation
Speed test:
[ RUN      ] C/HadamardHighbdTest.DISABLED_Speed/2
Hadamard32x32[          10 runs]: 9 us
Hadamard32x32[       10000 runs]: 8914 us
Hadamard32x32[    10000000 runs]: 8991776 us

[ RUN      ] AVX2/HadamardHighbdTest.DISABLED_Speed/2
Hadamard32x32[          10 runs]: 5 us
Hadamard32x32[       10000 runs]: 4582 us
Hadamard32x32[    10000000 runs]: 4548203 us

Change-Id: Ied1b38b510bd033299f05869216d394e3b7f70f1
2018-12-07 09:05:06 -08:00
Sai Deng b02ac73d8c Merge "Add high bit Hadamard 16x16 avx2 implementation" 2018-12-07 17:00:03 +00:00
sdeng f6a002f2a6 Add satd avx2 implementation
Speed Test:
C/SatdHighbdTest
blocksize:   16 time:  138 us
blocksize:   64 time:  315 us
blocksize:  256 time: 1120 us
blocksize: 1024 time: 3955 us

AVX2/SatdHighbdTest
blocksize:   16 time:   89 us
blocksize:   64 time:  189 us
blocksize:  256 time:  590 us
blocksize: 1024 time: 1912 us

Change-Id: I6357174462fccd589a475b13d8114b853cab5383
2018-12-06 21:21:39 -08:00
Jerome Jiang 418acaa0bd Merge "vp9 decoder: cleanup on exit if no available frame buffer." 2018-12-07 02:51:24 +00:00
Jerome Jiang b87dab4538 vp9 decoder: cleanup on exit if no available frame buffer.
There was no setjmp on vpx_internal_error when there is no available
frame buffer, so ready_for_new_data is not reset to 1.

BUG=webm:1571

Change-Id: I4f8efffb7d6fed3085b1f0229d0d1071a056b6c6
2018-12-06 16:58:27 -08:00
Angie Chiang dcea8785ad Implement find_prev_nb_full_mvs
In single_motion_search, we use prev coded nb full mvs to compute
mv inconsistency.

lambda is set to block_area / 4.

This is a draft. Will do experiments to figure out the impact on
coding efficiency and visual quality.

Change-Id: Id10f72b3c7e6085bfbe1a6156b9fd6917843d001
2018-12-06 16:02:50 -08:00
Angie Chiang 2100ad3557 Implement get_full_mv()
Change-Id: Icde1b01ea7f64e2c43dcd039cc37fd306e43030f
2018-12-06 16:00:44 -08:00
Angie Chiang f497f8f719 Pass mv_num into vp9_nb_mvs_inconsistency()
This allows av1_nb_mvs_inconsistency to cope with a variable number of
motion vectors.

Change-Id: I391aa322d458cfefaf640e7b07d5ad5ce2d3375c
2018-12-06 15:54:35 -08:00
James Zern 4fa9f733f5 Merge "configure: test -std=c++11 before enabling unit tests" 2018-12-06 22:34:14 +00:00
Paul Wilkins 2e5bbdc8fc Improve rd_variance_adjustment() for low variance blocks.
Change the cross over point for switching between per pixel
and per block variance numbers when comparing reconstruction
and source complexity.

This improves the accuracy of the comparison for low variance
regions. For example, recon and source may both have an integer per
pixel variance of 1, but one of these may actually be 1.01 and the
other 1.99.

The reason for using per pixel at all was because this number is already
available for the source block so does not need to be recomputed
here. Changing the threshold from >0 to >100 for using per pixel values
will thus cause a little extra work for some blocks.

With my default runs on derf and nf sets there is a net gain as follows:
(-ve = better, Overall PSNR, SSIM, PSNR-HVS)

derf low res  -0.106, -0.107, -0.093
midres -0.000, -0.021, 0.001
hd res -0.198, -0.190, -0.282
nf2k -0.090, -0.088, -0.077

Change-Id: I53ef514fe1c35ee3f08c64e9b22fc05fc7fe5887
2018-12-06 13:34:41 +00:00
Harish Mahendrakar c40224631a Merge "Fix DoS in Error Streams" 2018-12-06 05:53:43 +00:00
Yaowu Xu 93488acda6 Optimize RDMult
This commit introduces the optimized RDMult values for both key
and non-key frames. For key frames, the commit gets values back
from commit#b13f6154df9c0834d74f7e3d41e41c4208f56d18. For impact
on key frame only encodings, see commit message for that commit.

For inter frames, the values get optimized by running encoding tests
in Q mode with the following range using 150 frames:
2 6 10 14 18 22 26 30 34 38 42 46 50 54 58 62

The impact of current set of RDMULT values:
           PSNR     SSIM     PSNR-HVS
lowres:  -0.325%   0.422%    -0.228%
midres:  -0.377%   0.158%    -0.376%
hdres:   -0.309%   0.522%    -0.322%

Test baseline is on commit#35617458

Overall, the values help PSNR based metrics, but hurt SSIM metric
slightly.

Change-Id: I7eba37a6524cb36b8498a1d104d2667781bc2089
2018-12-05 18:27:17 -08:00
sdeng b28b0709b9 Add high bit Hadamard 16x16 avx2 implementation
Speed test:
[ RUN      ] C/HadamardHighbdTest.DISABLED_Speed/1
Hadamard16x16[          10 runs]: 2 us
Hadamard16x16[       10000 runs]: 1836 us
Hadamard16x16[    10000000 runs]: 1829451 us

[ RUN      ] AVX2/HadamardHighbdTest.DISABLED_Speed/1
Hadamard16x16[          10 runs]: 1 us
Hadamard16x16[       10000 runs]: 1009 us
Hadamard16x16[    10000000 runs]: 984856 us

Change-Id: I89b9cdbe19350815576d66e627df87e5025ed0a4
2018-12-05 16:40:24 -08:00
Johann Koenig 5fddd63973 Merge "remove old visual studio support" 2018-12-05 23:40:57 +00:00
Jerome Jiang 67c999876e Merge "Refactor datarate svc test." 2018-12-05 18:48:15 +00:00
Johann Koenig 08f281ef0e Merge "quantize neon: fix hbd builds" 2018-12-05 18:20:29 +00:00
Johann 539dc7649f remove old visual studio support
Change-Id: I86682ef1aac1991e1ef6965e7aa298f6619bee13
2018-12-05 10:18:06 -08:00
Sai Deng fc5d166782 Merge "Fix overflow in calculating highbd SSIM" 2018-12-05 17:55:58 +00:00
Harish Mahendrakar 509ec5945b Merge "Add Parse and Recon Split functions" 2018-12-05 17:31:00 +00:00
Deepa K G 7a19f6afea Rescale arf bit budget calculation
To compute the total budget for a depth layer, exclude the count of
frames in the current layer.

Change-Id: I9ffd1f63ea597de3ea95e0832b13f5b1f35cb086
2018-12-05 16:52:49 +05:30
Jerome Jiang 9fcda19178 Refactor datarate svc test.
Bring some repeated test set up into a function.

Change-Id: I6acc545a349dc16581a23baf848c91ec36a2e83f
2018-12-04 17:33:24 -08:00
sdeng 6dc612758c Fix overflow in calculating highbd SSIM
Example internal stats
Before the fix:
Bitrate	AVGPsnr	GLBPsnr	AVPsnrP	GLPsnrP	VPXSSIM	VPSSIMP	FASTSIM	PSNRHVS	WstPsnr	WstSsim	WstFast	WstHVS	AVPsnrY	APsnrCb	APsnrCr	  Block	WstBlck	Consist	WstCons	    Time	RcErr	AbsErr
 153.39	 37.131	 36.420	 37.151	 36.437	716.077	817.445	 10.422	 34.347	 32.980	  0.916	  9.281	 30.208	 36.024	 41.830	 40.581	  0.000	  0.000	100.000	100.000	   55006	   2.26	   2.26
No mismatch detected in recon buffers

After the fix:
Bitrate	AVGPsnr	GLBPsnr	AVPsnrP	GLPsnrP	VPXSSIM	VPSSIMP	FASTSIM	PSNRHVS	WstPsnr	WstSsim	WstFast	WstHVS	AVPsnrY	APsnrCb	APsnrCr	  Block	WstBlck	Consist	WstCons	    Time	RcErr	AbsErr
 153.39	 37.131	 36.420	 37.151	 36.437	 69.808	 70.023	 10.422	 34.347	 32.980	  0.910	  9.281	 30.208	 36.024	 41.830	 40.581	  0.000	  0.000	100.000	100.000	   55067	   2.26	   2.26
No mismatch detected in recon buffers

Change-Id: I820abc498c1543548f193874046582b50afd0238
2018-12-05 00:55:41 +00:00
James Zern 9c8ae2de7b configure: test -std=c++11 before enabling unit tests
since:
77fa51003 Replace deprecated scoped_ptr with unique_ptr

the unit tests require a c++11 capable compiler; future versions of
googletest (1.9.x) will as well, so this change was inevitable if we
wanted to keep the snapshot up to date.

Change-Id: Id5c646bd10fae09e7b705b7d5fad1344f2216282
2018-12-04 16:38:08 -08:00
Angie Chiang 3c6fefebee Fix the parameter of vp9_full_pixel_diamond_new
Change-Id: I36b970953c960fde65d7b7705ccfa575c8741c43
2018-12-04 15:46:51 -08:00
Johann Koenig 5039d2d82b Merge "Reland "third_party/googletest: update to v1.8.1"" 2018-12-04 23:40:41 +00:00
Jerome Jiang 51fe234ad1 Merge "vp9: force refresh of long term ref when denoiser reset." 2018-12-04 23:34:20 +00:00
Jerome Jiang 28345f9730 vp9: force refresh of long term ref when denoiser reset.
This will allocate extra frame buffer if long term temporal reference is
used and denoiser is enabled on non-key frame.

Add test.

Change-Id: I0e8d1fdb9a2d697a8eed7fe6206bcb362e69f1c8
2018-12-04 12:16:40 -08:00
Jingning Han c6a8921172 Merge "Clean up rc_pick_q_and_bounds_two_pass()" 2018-12-04 17:33:42 +00:00
Deepa K G 69a5a1d19c Fix intra_count_low calculation in first pass
In first pass, scaled_low_intra_thresh should not be
compared with motion_error, as scaled_low_intra_thresh
accounts for bit-depth, whereas motion_error does not.
In addition, mv_cost is excluded for comparison.

Change-Id: Id2fa02d364c086876c71ffebb2dd763eaa647e4a
2018-12-04 18:45:08 +05:30
Sai Deng e438411581 Merge "Add high bit Hadamard 8x8 avx2 implementation" 2018-12-04 00:53:42 +00:00
Marco Paniconi 1081ca3774 vp9-svc: Fix to postencode drop for layers.
Postencode drop is only checked on base spatial
layers, and if set, whole superframe is dropped and
next superframe is encoded at max-q.

Fix here is to make sure all layers are encoded at
max-q on a postencode dropped frame.

Change-Id: I2313d83ee29a382465bcec1085d8c73c37ce26d6
2018-12-03 14:25:06 -08:00
Marco Paniconi c88ec17eb2 Merge "vp9: Rename post_encode drop function." 2018-12-03 22:09:09 +00:00
Marco Paniconi b2f37bac65 vp9: Overshoot detection for skipped base layer.
If scene/slide change is detected on current
superframe and max-q set because of high overshoot:
then if the lower/base spatial layers are skipped on
the current superframe, max-q is forced on the
next encoded base/lower spatial layers.

Change-Id: Id61efda86ee545395012e19476d19845e3932678
2018-12-03 12:49:29 -08:00
Marco Paniconi c34e5cafca vp9: Rename post_encode drop function.
Feature works also for non-screen content mode,
so rename it.

Change-Id: I665362d50cf9a4017f114973586ad0eead066ddd
2018-12-03 12:18:35 -08:00
Johann 26dbf9eba8 quantize neon: fix hbd builds
BUG=webm:1448

Change-Id: I2140fb9b6ce92716d2d9509f3031244088a62127
2018-12-03 10:55:00 -08:00
sdeng f5306a5091 Add high bit Hadamard 8x8 avx2 implementation
Speed tests:
[ RUN      ] C/HadamardHighbdTest.DISABLED_Speed/0
Hadamard8x8[          10 runs]: 0 us
Hadamard8x8[       10000 runs]: 316 us
Hadamard8x8[    10000000 runs]: 311749 us
[       OK ] C/HadamardHighbdTest.DISABLED_Speed/0 (371 ms)

[ RUN      ] AVX2/HadamardHighbdTest.DISABLED_Speed/0
Hadamard8x8[          10 runs]: 0 us
Hadamard8x8[       10000 runs]: 161 us
Hadamard8x8[    10000000 runs]: 156910 us
[       OK ] AVX2/HadamardHighbdTest.DISABLED_Speed/0 (160 ms)

Change-Id: I94f7324be20405ff55f8a02ad4651c4ab4c10202
2018-12-03 10:10:41 -08:00
Venkatarama NG. Avadhani 3fb6f75feb Fix DoS in Error Streams
This fixes an issue where, in very rare error cases, one row of LPF
could be waiting infinitely for its previous row's LPF to complete.

With LPF optimization, the second row's LPF could be triggered before
the first row's LPF. In this case, the second row's LPF will wait for
LPF of n-sync number of SBs of the first row to finish. In error
streams, depending on when the error was detected, the LPF job of the
first row may then never be triggered. This puts the thread doing the
second row's LPF in an infinite wait.

The issue is reproducible once in approximately 500 runs of the clip in
bug 1562.

BUG=webm:1562

Change-Id: I265d7df5ceeff0410334f5b9a4181f895bb54cab
2018-12-03 17:56:55 +00:00
Shubham Tandle ad5ec9b2f3 Add Parse and Recon Split functions
Add functions that will do only parse or only recon. These are
duplicated and modified from decode_partition and decode_block.

Change-Id: I2201e235bf491e823ae63d27b2586bbb43b48929
2018-12-03 17:55:52 +00:00
Johann 5fbc7a286b quantize 32x32: saturate dqcoeff on x86
This slows down low bitdepth builds but is necessary to obtain correct
values.

BUG=webm:1448

Change-Id: I4ca9145f576089bb8496fcfeedeb556dc8fe6574
2018-11-30 16:27:14 -08:00
Jingning Han 3561745835 Merge "Simplify constant q mode qp selection" 2018-11-30 21:12:51 +00:00
Jingning Han 8ef9af0a6c Clean up rc_pick_q_and_bounds_two_pass()
Remove unneeded VPX_Q condition check.

Change-Id: I46b09ae522caa47fa7ea4441b6a6ac2840315d1c
2018-11-30 11:46:27 -08:00
Sai Deng eac33bb791 Merge "Use 16 bit ints in Hadamard highbd col8 first pass" 2018-11-30 19:09:28 +00:00
Johann Koenig c5edbfa92c Merge changes Ic80def57,I61a2f8bf
* changes:
  quantize 32x32: fix dqcoeff
  quantize: fix x86 hbd builds
2018-11-30 18:58:25 +00:00
Angie Chiang 08b19de315 Merge changes I18680413,Iebe38092
* changes:
  Consider mv inconsistency in single_motion_search
  Change the interface of vp9_full_pixel_diamond_new
2018-11-30 18:25:50 +00:00
Jingning Han 7ca417f665 Simplify constant q mode qp selection
Decouple the constant q mode qp selection from vbr/cbr/cq modes.
Skip vp9_frame_type_qdelta() adjustment for non-ARF inter frames,
instead keep using the cq-level. It improves the compression
performance:

         avg PSNR       overall PSNR     SSIM
lowres   -0.17%         -0.20%           -0.1%
midres   -0.21%         -0.24%           -0.08%
hdres    -0.15%         -0.19%           -0.04%

Change-Id: I52fd5f8edbd3fdcbeda31ee3a6d6eb016091a7e3
2018-11-30 10:23:33 -08:00
Jingning Han 806e1c9843 Factor key frame qp selection from two-pass qp and bound decision
Factor out this common code needed by all rate control modes.

Change-Id: If17850fbebcdce7ff24afb211aa2e6054486b814
2018-11-29 22:27:08 -08:00
Angie Chiang 9a2ae31769 Consider mv inconsistency in single_motion_search
This is still a work in progress.
nb_full_mvs and lambda are set to zero for now, which means
mv inconsistency penalty is zero while doing the mv search.

Change-Id: I18680413d748fbdb9a33621f92f83e021036a3ab
2018-11-29 17:36:57 -08:00
Angie Chiang 5a0d53a031 Change the interface of vp9_full_pixel_diamond_new
Avoid passing in tpl_stats because this function will be called in
motion search, where tpl_stats should be fixed at that point.

Make further_steps an internal variable in the function.

Change-Id: Iebe380925eb1891c19e0b78163dab8e6bfafccdb
2018-11-29 17:06:30 -08:00
sdeng 07ab3c642c Use 16 bit ints in Hadamard highbd col8 first pass
Change-Id: I2f04937d8a4e171d42b25ee6c6555ccad29eb192
2018-11-29 15:22:46 -08:00
Jingning Han c41c4752a1 Revert "Optimize RDMULT values for key frames"
This reverts commit b13f6154df.

Temporarily revert this change due to interactions with rate control at very low target bit-rate.

Original change's description:
> Optimize RDMULT values for key frames
> 
> Encoding of 5 frames of each sequence using key frames only, the new
> values help the metrics by:
> 
>          PSNR     SSIM       PSNR-HVS
> lowres:  -0.870%  -0.140%    -0.892%
> midres:  -0.899%  -0.146%    -1.052%
> hdres:   -0.944%  -0.115%    -1.028%
> 
> Tested q range:
> 2 6 10 14 18 22 26 30 34 38 42 46 50 54 58 62
> 
> Change-Id: I5b0dda366d589f52987c5bad11a1f95c4e6dc1a5

TBR=yaowu@google.com,paulwilkins@google.com,jingning@google.com,builds@webmproject.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Change-Id: Ie1e4cf21308d69699a65a393a83884882682ea8e
2018-11-29 18:23:40 +00:00
Marco Paniconi 932f8fa04d vp9-svc: Add num_encoded_top layer counter
Useful for noise estimation when encoding of the
top layer is skipped.

Change-Id: I18cbe6119bac6c21514941b1e3b530a05a42df14
2018-11-29 09:34:53 -08:00
Marco Paniconi 2aacb19ca1 Merge "vp9: Fix condition for disabling noise estimation" 2018-11-29 07:57:44 +00:00
Marco Paniconi 80519937e3 vp9: Fix condition for disabling noise estimation
Fix condition for turning off denoiser due to high
motion: use proper superframe counter and
frames_since_key counter so this condition won't
take effect on key (super)frame.

Change-Id: Ic502bf5ebfa32a921f611a78e8e963eb62b5bc79
2018-11-28 22:12:28 -08:00
Jerome Jiang c076467e8d vp9 denoiser: force copy block when last not a reference.
Last reference doesn't always exist when SVC layers changed dynamically.

When last is not a reference for current layer, copy block directly on
denoiser.

Change-Id: I9d98c4d6fdcfa25ba707db3333712761b5cf9ab8
2018-11-28 20:53:55 -08:00
Jingning Han d40a5bc8a3 Merge "Remove ineffective condition from rc_pick_q_and_bounds" 2018-11-28 18:35:35 +00:00
Johann d566160f32 quantize 32x32: fix dqcoeff
Calculate the high bits of dqcoeff and store them appropriately in high
bit depth builds.

Low bit depth builds still do not pass. C truncates the results after
division. X86 only supports packing with saturation at this step.

BUG=webm:1448

Change-Id: Ic80def575136c7ca37edf18d21e26925b475da98
2018-11-28 11:30:37 -05:00
Johann 0eeb797512 quantize: fix x86 hbd builds
Calculate the high bits of dqcoeff in high bit depth builds and store
them appropriately.

BUG=webm:1448

Change-Id: I61a2f8bfcf2e30765f10a94073c4d58321d2fa24
2018-11-28 11:30:02 -05:00
Marco Paniconi 615922dfb5 Merge "vp9-svc: Add check/reset for long term reference." 2018-11-28 02:58:59 +00:00
Marco Paniconi 2fc5eabcde vp9-svc: Add check/reset for long term reference.
Add check and reset (turn off) usage of long term
reference if some conditions (layer id of reference vs
current frame) are not met.

Change-Id: Ie3a84e3618f4fc4d5f8da4e67316cfbefb8bae78
2018-11-27 16:38:14 -08:00
Jerome Jiang 3861eb6d02 vp9 svc: copy block if ref buffer in denoiser is NULL.
BUG=b/119097707

Change-Id: I6569306e897da46a44f9d8f2fb28a2a355dd4c2c
2018-11-27 15:56:57 -08:00
Johann 69d8a9aa5a Reland "third_party/googletest: update to v1.8.1"
This is a reland of 7d777ce613

Previous attempt was reverted due to build issues with older
versions of Visual Studio. We no longer support VS <= 13.

Original change's description:
> third_party/googletest: update to v1.8.1
>
> BUG=webm:1559
>
> Change-Id: I7a0b16c7bf3f97db2d8650a190b93aae7e12a948

Bug: webm:1559
Change-Id: I9cb39988286cc56125879222ef0bd952d61b7c1d
2018-11-27 22:39:57 +00:00
Jingning Han b1022a018b Merge "Replace deprecated scoped_ptr with unique_ptr" 2018-11-27 19:01:26 +00:00
Johann Koenig fec20805f7 Merge "rename quantize_x86.h" 2018-11-27 18:47:33 +00:00
Jingning Han b3c87f7346 Remove ineffective condition from rc_pick_q_and_bounds
Change-Id: I67b92182ee80ec5548c5a97345b6252e49033c4a
2018-11-27 10:33:43 -08:00
Jingning Han 77fa510030 Replace deprecated scoped_ptr with unique_ptr
Change-Id: I2793a1b65164946eb7d67d80ccba9e798db3d9af
2018-11-27 09:49:12 -08:00
Johann 5757b919c8 rename quantize_x86.h
Pave the way for new quantize_OPT.h helper files.

Change-Id: Ice7225612983f5587a9660af3320c7d0c8bb1c2f
2018-11-27 12:35:54 -05:00
Jingning Han d3d22aa7c9 Merge "Fix ARF rate allocation for cq mode" 2018-11-27 15:54:45 +00:00
Marco Paniconi 45d303632b Merge "VP9 SVC: fix crash on scaling partition." 2018-11-27 04:48:05 +00:00
Marco Paniconi 0aaf170c13 vp9-svc: Put check on usage of long term temporal ref.
If the scale factor of the golden long term reference
is different from the last reference then disable usage
of long term reference.

This should not happen, but add this as a check against
some possibly incorrect update of the svc configuration.

Change-Id: Ic1062d4384e005007d8c922813fa8ad188d8fa98
2018-11-26 19:18:22 -08:00
Jerome Jiang 545f096ef2 VP9 SVC: fix crash on scaling partition.
When scaling up partition from lower resolution layer L, mi_row and
mi_col from L must be smaller than mi_rows and mi_cols from L.

Before this change, the condition was based on mi_rows from top layer
divided by 2, which is not necessarily equal to the mi_rows from lower
resolution layer.

Added variable in SVC structure to keep track of mi_rows and mi_cols
from each spatial layer.

Re-enable partition scaling for 1080p.

BUG=webm:1578

Change-Id: Icc1c701b095cfe0a92bfecca1ed39dbe21da12b6
2018-11-26 18:34:12 -08:00
Marco Paniconi 09ed15d255 vp9-svc: Fix to skip enhancement layer setting
If in constrained layer drop mode, avoid setting
skip flag if base layer is dropped, as whole superframe
will be dropped in this case. This avoids an assert trigger
in the svc superframe packing.

Change-Id: I51c953c7fee979790c65c798bac9bd3d805dc66f
2018-11-26 14:36:34 -08:00
Jingning Han 8d9b1a7d82 Fix ARF rate allocation for cq mode
In the limited test set, it improves the cq mode compression
performance by 1.9% in PSNR and 6% in SSIM compared to using the
same quantization parameter for all ARFs.

Change-Id: I35c4d7097b5838ab0b92d7f9937520721e3bb84b
2018-11-26 14:12:55 -08:00
Marco Paniconi 314641e298 Merge "vp9 screen-content: Keep lower step_param for quality layers." 2018-11-22 01:27:59 +00:00
Angie Chiang 532a42aa52 Merge "Fix scan_build warnings in user_priv_test.cc" 2018-11-22 01:20:18 +00:00
Marco Paniconi bff94472a6 vp9 screen-content: Keep lower step_param for quality layers.
Issue with step_param = 2 seems specific to lower layers
with different resolution.

Change-Id: I26405488ac7691b3e471e98e794d4b1d8098a91d
2018-11-21 16:21:58 -08:00
Angie Chiang b2f59340e8 Fix scan_build warnings in user_priv_test.cc
BUG=webm:1575

Change-Id: I4e38f11162e0de82a730f16b387aeafd2d00e777
2018-11-21 14:59:51 -08:00
Angie Chiang 41bb11cf7e Merge changes I02279405,I87e1c3f0,Id70235c8,I62602aa4,I5722d262
* changes:
  Fix scan_build warnings in tiny_ssim.c
  Fix scan_build warnings in convolve_test.cc
  Fix scan_build warnings in vp9_loopfilter.c
  Fix scan_build warnings in variance_test.cc
  Fix scan_build warnings in vp9_resize.c
2018-11-21 22:55:51 +00:00
Marco Paniconi 9b5ab487b5 vp9 screen-content: Adjust search step param
Increase to 4 (from 2) on slide/scroll changes,
as there is an issue/failure with the current setting
with offline encode for high resolutions.

Change-Id: I8f06c6bdcd59013ab000d75bd75770c667bf70d2
2018-11-21 11:46:45 -08:00
Angie Chiang 6234256646 Fix scan_build warnings in tiny_ssim.c
BUG=webm:1575

Change-Id: I022794054b494512903d912bdbf3e85461f31665
2018-11-21 11:24:09 -08:00
Angie Chiang 49b6b99f5c Fix scan_build warnings in convolve_test.cc
Change-Id: I87e1c3f0492cde805b54b048385ea200652dfccc
2018-11-21 10:40:52 -08:00
Angie Chiang 913f428015 Fix scan_build warnings in vp9_loopfilter.c
BUG=webm:1575

Change-Id: Id70235c801d253d47267c6d34760484f82d5c881
2018-11-21 10:40:52 -08:00
Angie Chiang a2a0cce56c Fix scan_build warnings in variance_test.cc
BUG=webm:1575

Change-Id: I62602aa47f07d525ba95fe7b2618bf62ae23fe6f
2018-11-21 10:40:40 -08:00
Angie Chiang 59e4e67303 Fix scan_build warnings in vp9_resize.c
BUG=webm:1575

Change-Id: I5722d2626b9043b83581a700e58c2b7204113a16
2018-11-21 10:23:33 -08:00
Angie Chiang ba003455ab Merge changes Id47930b4,I4f423630,I277c159b
* changes:
  Replace assert by ASSERT_TRUE
  Fix scan_build warnings in temporal_filter_test.cc
  Fix scan_build warnings in dct_test.cc
2018-11-21 18:06:17 +00:00
Wan-Teh Chang aae6c93173 Merge "Declare buffer_alloc_sz and frame_size as size_t." 2018-11-21 17:15:59 +00:00
Yaowu Xu 06fab8cf67 Merge "Replace int64_t with int for rdmult" 2018-11-21 01:33:50 +00:00
Angie Chiang ab5c8ff8a0 Replace assert by ASSERT_TRUE
BUG=webm:1575

Change-Id: Id47930b48733159f5e967dc5fd1205e501b635b9
2018-11-20 15:08:44 -08:00
Angie Chiang 3c51b91f37 Fix scan_build warnings in temporal_filter_test.cc
BUG=webm:1575

Change-Id: I4f4236305ebd932515451b1306211154b34678de
2018-11-20 14:55:52 -08:00
Angie Chiang e0f2460125 Fix scan_build warnings in dct_test.cc
BUG=webm:1575

Change-Id: I277c159bafa2ef7c3cfa27c86f60e3df0c3b79b3
2018-11-20 14:55:43 -08:00
Marco Paniconi db55f19bca Merge "Disable partition scaling on 1080p and above." 2018-11-20 22:06:33 +00:00
Marco Paniconi 541ab3db72 Merge "vp9-svc: Reset temporal layers on scene change" 2018-11-20 22:02:59 +00:00
Jerome Jiang e6332e04f2 Disable partition scaling on 1080p and above.
BUG=webm:1578
Change-Id: I7c8014b7ab96d372d486433bce24d058a60fdc85
2018-11-20 12:29:51 -08:00
Wan-Teh Chang 73cbff0e53 Declare buffer_alloc_sz and frame_size as size_t.
Change-Id: Id632ddcbfb0bd3a4258aebdfb98f9dc2e4d04438
2018-11-20 12:22:35 -08:00
Marco Paniconi 7195ded2c5 vp9-svc: Reset temporal layers on scene change
Reuse existing function for resetting temporal
layer pattern.

And fix to use first spatial layer to encode, and
some refactoring in encode_without_recode_loop().

Change-Id: Ifb22bb9de793ecb8e73f410e125c7c12383da1d2
2018-11-20 11:19:30 -08:00
Wan-Teh Chang 6b191c16c0 Merge "Validate the |border| parameter earlier." 2018-11-20 18:47:19 +00:00
Angie Chiang 41071cadec Merge changes I9b5f8b08,Ic90b09e5,Ib2380aaf,I3ad3af49,Ib5d1a411
* changes:
  Fix scan_build_warnings in comp_avg_pred_test.cc
  Fix scan_build warnings in convolve_test.cc
  Fix scan_build warnings in idct_test.cc
  Fix scan_build warnings in tiny_ssim.c
  Fix scan_build warning in dct_partial_test.cc
2018-11-20 18:02:46 +00:00
Yaowu Xu 7bc2edfade Replace int64_t with int for rdmult
No need for the upgrade to int64_t

Change-Id: I8331839c00718a0a987257772357be72b40e19be
2018-11-20 09:40:50 -08:00
Wan-Teh Chang 2361c05e89 Validate the |border| parameter earlier.
vpx_realloc_frame_buffer() should validate the |border| parameter
earlier, before it allocates the buffer and preferably before it uses
|border|.

This backports libaom commit 2860b3ae8b764bdfa2b8c7a06df2673e907b993f:
https://aomedia-review.googlesource.com/c/aom/+/74324

Change-Id: Ib9d59d74e27430ccb1e83c6ad5424aff9672c989
2018-11-20 09:35:00 -08:00
Angie Chiang cf6a576920 Fix scan_build_warnings in comp_avg_pred_test.cc
BUG=webm:1575

Change-Id: I9b5f8b08d23fd62ff6400605023f33e3890b0f2d
2018-11-19 18:48:34 -08:00
Angie Chiang 9a848af54d Fix scan_build warnings in convolve_test.cc
BUG=webm:1575

Change-Id: Ic90b09e596fa68bc516237d31b7f4540831becfd
2018-11-19 18:46:22 -08:00
Angie Chiang 2d672cb97d Fix scan_build warnings in idct_test.cc
BUG=webm:1575

Change-Id: Ib2380aaf8c9f9bc7db87f36701a2792781beb44b
2018-11-19 18:46:01 -08:00
Angie Chiang 0310ebd8d1 Fix scan_build warnings in tiny_ssim.c
BUG=webm:1575

Change-Id: I3ad3af49d778f102e9152dcb1eb9d5c048756cdf
2018-11-19 18:45:23 -08:00
Angie Chiang 50bbc0984c Fix scan_build warning in dct_partial_test.cc
BUG=webm:1575

Change-Id: Ib5d1a411a223a93d1795ebe1af12e67d64fadabe
2018-11-19 18:44:45 -08:00
Yaowu Xu 1e58bdb41d Minor simplifications
Change-Id: I231e863f838f449335236c174b74bd33dfdd8b19
2018-11-19 17:41:03 -08:00
Jerome Jiang cb240c1c41 Merge "Fix oob in vpx_setup_noise" 2018-11-20 00:11:21 +00:00
Marco Paniconi ac3eccdc24 vp9: Fix to the svc buffer update
Condition the pre-encode buffer update based on
TS diff on temporal layers = 1 for now, as some
fix is needed for the case where #temporal_layers > 1.

Change-Id: I58163b956db415217e4687a31f8ba110545b09f5
2018-11-19 13:32:20 -08:00
Yaowu Xu e946eaf3af Merge "Optimize RDMULT values for key frames" 2018-11-19 18:47:21 +00:00
Jerome Jiang a421e21e03 Fix oob in vpx_setup_noise
Array index wasn't checked on boundary.

BUG=webm:1572

Change-Id: I55a93c024af77a4fd904b0e992d5587a142d66a4
2018-11-16 15:40:25 -08:00
Jon Kunkee 3a56c238ee Work around ARM64 Windows SDK arm_neon.h quirk
Since the Windows SDK has an ARM32-only arm_neon.h, files including it
during ARM64 Windows builds need to be redirected to arm64_neon.h.

Instead of editing many files to include ARM64-Windows-specific ifdef
logic, this commit introduces an ARM64-Windows-specific version of
arm_neon.h that performs the needed redirection and lands earlier in
the header search path than the ARM32-only arm_neon.h.

Change-Id: Idc63947a238ca1bd0c479d8f4ad68950487947c6
2018-11-16 22:27:18 +00:00
Jon Kunkee fa1e85b095 Merge "Add ARM64 support to VS project generation" 2018-11-16 22:26:15 +00:00
Yaowu Xu b13f6154df Optimize RDMULT values for key frames
When encoding 5 frames of each sequence using key frames only, the new
values improve the metrics by:

         PSNR     SSIM       PSNR-HVS
lowres:  -0.870%  -0.140%    -0.892%
midres:  -0.899%  -0.146%    -1.052%
hdres:   -0.944%  -0.115%    -1.028%

Tested q range:
2 6 10 14 18 22 26 30 34 38 42 46 50 54 58 62

Change-Id: I5b0dda366d589f52987c5bad11a1f95c4e6dc1a5
2018-11-16 09:55:51 -08:00
Marco Paniconi 84b19ab5ac vp9: Reorganize the buffer level for cbr mode
Refactor the code with some changes.

Split update into two parts: move the fillup
(with per-frame-bandwidth) before the encoding, and
keep the leaking part (with encoded_frame_size) after
the encoding (postencode).

For SVC with ref_frame_config usage: allow usage of timestamp
delta for the fillup part of buffer, instead of the (average)
framerate passed in via the duration.

Moving the buffer fillup (+per-frame-bandwidth) part to the
pre-encode causes some difference in performance
(since buffer level affects active_worst/QP and frame-dropping),
but the change is observed to be small.
Made small adjustment to active_worst_quality to compensate.

Adjust some thresholds in datarate tests.

Change-Id: I81a5562367034f318cffd451304bc4a34bf02a1d
2018-11-16 09:15:11 -08:00
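The split described in the commit above (84b19ab5ac) can be pictured as a leaky bucket whose fill and leak steps run at different times. The following is only a minimal sketch of that idea, not the libvpx code: the struct, the function names, and the clamp to a maximum size are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer model; libvpx keeps this state elsewhere. */
typedef struct {
  int64_t level;    /* current buffer level, in bits */
  int64_t max_size; /* assumed capacity used for clamping, in bits */
} CbrBuffer;

/* Pre-encode step: add the per-frame bandwidth before encoding.
 * Clamping to max_size is an assumption of this sketch. */
static void buffer_fill_preencode(CbrBuffer *b, int64_t per_frame_bandwidth) {
  assert(b != NULL && per_frame_bandwidth >= 0);
  b->level += per_frame_bandwidth;
  if (b->level > b->max_size) b->level = b->max_size;
}

/* Post-encode step: remove the bits actually spent on the frame. */
static void buffer_leak_postencode(CbrBuffer *b, int64_t encoded_frame_size) {
  assert(b != NULL && encoded_frame_size >= 0);
  b->level -= encoded_frame_size;
}
```

Because the fill now happens first, any decision made during encoding that reads the level (quantizer limits, frame dropping) sees a value one per-frame bandwidth higher than before, which is consistent with the small behavior change the message reports.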
Johann Koenig 460f66aa99 Merge "quantize: use aarch64 vmaxv" 2018-11-16 14:57:30 +00:00
Jingning Han 2dea20be26 Merge "Fix arf boost factor calculation for intermediate ARFs" 2018-11-16 00:07:22 +00:00
Angie Chiang 22ca684d88 Merge changes Ib9d16a4b,I6061f38c
* changes:
  Refactor av1_nb_mvs_inconsistency()
  Recompute mv inconsistency after mv search is done
2018-11-15 23:40:59 +00:00
Jon Kunkee aa827867aa Add ARM64 support to VS project generation
Windows builds can use msbuild.exe to build libvpx through a set of
generated Visual Studio project files. This commit adds awareness of
ARM64 Windows to this process by adding ARM64 configurations and
setting msbuild properties to consume the right SDK version.

Change-Id: I1bbc01cbe7be3d53c4e1af6cd96c6e4170aa4915
2018-11-15 23:04:31 +00:00
Jon Kunkee e6f889c120 Add ARM64 Windows to configure scripts
In order to correctly configure for Windows 10 on ARM, this change adds
a --target value arm64-win64-vs15 to ./configure and adds feature
enable/disable logic for the new platform.

This is merely sufficient for Chromium targeting ARM64 Windows.

Bug: 893460
Change-Id: I46194286f63104bdf6ac57d719fdf1e5d5fa72c8
2018-11-15 22:14:46 +00:00
Jingning Han dcc7c5aeb1 Fix arf boost factor calculation for intermediate ARFs
Need to offset the forward range by 1 to include the stats for
the current frame itself.

Change-Id: I3b5171b7edef51ec4e97e4e0542ca58af5ce1416
2018-11-14 21:03:29 -08:00
Jingning Han ea57f9acda Merge "Disable tpl model in GF-only GOP structure" 2018-11-15 05:01:04 +00:00
Jingning Han bca3baae7f Merge "Fix GF-only frame type allocation" 2018-11-15 05:00:55 +00:00
Angie Chiang ba4fbc4b8d Refactor av1_nb_mvs_inconsistency()
Change-Id: Ib9d16a4bc3ce1d28493e34f24dc18a6b511738f0
2018-11-14 17:01:51 -08:00
Angie Chiang 069943a5f1 Recompute mv inconsistency after mv search is done
Change-Id: I6061f38cb42eea2b4c8996ad372c829dc1051c8d
2018-11-14 17:01:51 -08:00
Jingning Han 66a6cfa5eb Disable tpl model in GF-only GOP structure
The tpl model assumes a relatively short stats buffer length. Hence
it is not ready to support GF-only GOP structure where the max
length can go up to 250. Disable tpl model in such setting to avoid
a rare encode failure in GF-only setting.

Change-Id: I3409dbb829a8105478876684ec21a2bd405c33c8
2018-11-14 14:58:56 -08:00
Harish Mahendrakar 9e69aa0b48 Merge "vpx_dec_fuzzer: Unify single and multi-thread tests" 2018-11-14 21:55:16 +00:00
Harish Mahendrakar 07de398172 vpx_dec_fuzzer: Unify single and multi-thread tests
As thread count is now randomized, serial and threaded modes can be
combined to a single binary.
With this change, the thread count takes values from 1 to 64, which tests
both single-thread and multi-thread variants of the decoders.

Change-Id: I6dd2a3aa03bff9c0e2c126843b543d46892be696
2018-11-14 21:06:14 +00:00
Harish Mahendrakar 396b7484fc Merge "Added libFuzzer plugin to test decoders" 2018-11-14 21:02:45 +00:00
Jingning Han 38eb8e7752 Fix GF-only frame type allocation
Rework the recursive ARF allocation so that no frame's type assignment
is missed in the GF-only GOP structure. This fixes a rare encoder
failure in the GF-only setting.

Change-Id: I3e41fe36d3cb954de25ffc058a42b2b8be0fcd7a
2018-11-14 11:35:21 -08:00
Harish Mahendrakar 6cb8d8a2b1 Added libFuzzer plugin to test decoders
vpx_dec_fuzzer.cc can be built with clang++ to generate fuzzer binary
Build instructions are part of the file

Change-Id: I19ba0bd49b236e27b27e81a83f6de59f15bdc994
2018-11-14 11:05:04 -08:00
Jingning Han 16f50c34b3 Merge "Rescale arf bit budget calculation" 2018-11-13 23:13:14 +00:00
Jingning Han a18946fbed Skip ACL recode loop for intermediate ARF layers
Speed up the encoding time by ~20% for multi-layer ARF system.

Change-Id: I16de1cfed7cd1815cf0269eb4f90ad74fdf087ee
2018-11-13 09:03:23 -08:00
Jingning Han 134072dbd3 Rescale arf bit budget calculation
To compute the total budget for a depth layer, exclude the count of
frames that have been allocated the bit budget. This improves the
avg PSNR by 0.15% and overall PSNR by 0.25% for lowres and midres
test sets.

Change-Id: I5115e33e1422dc930179142cd29aeebe97425283
2018-11-13 09:01:53 -08:00
Johann 43a30d3a1a quantize: use aarch64 vmaxv
Simplify max value calculation on aarch64 by using vmaxv. Much
faster for 4x4 but diminishing returns as the block size grows.

Only the vp9 quantize has a speed test hooked up. Anticipate
similar results for the other quantize versions.

Before:
[ RUN      ] NEON/VP9QuantizeTest.DISABLED_Speed/2
[    BENCH ]      Bypass calculations       4x4  31.6 ms ( ±0.0 ms )
[    BENCH ]        Full calculations       4x4  31.6 ms ( ±0.0 ms )
[    BENCH ]      Bypass calculations       8x8  17.7 ms ( ±0.0 ms )
[    BENCH ]        Full calculations       8x8  17.7 ms ( ±0.0 ms )
[    BENCH ]      Bypass calculations     16x16  14.2 ms ( ±0.0 ms )
[    BENCH ]        Full calculations     16x16  14.2 ms ( ±0.0 ms )
[       OK ] NEON/VP9QuantizeTest.DISABLED_Speed/2 (1906 ms)
[ RUN      ] NEON/VP9QuantizeTest.DISABLED_Speed/3
[    BENCH ]      Bypass calculations     32x32  18.6 ms ( ±0.0 ms )
[    BENCH ]        Full calculations     32x32  18.6 ms ( ±0.0 ms )

After:
[ RUN      ] NEON/VP9QuantizeTest.DISABLED_Speed/2
[    BENCH ]      Bypass calculations       4x4  29.1 ms ( ±0.0 ms )
[    BENCH ]        Full calculations       4x4  29.1 ms ( ±0.0 ms )
[    BENCH ]      Bypass calculations       8x8  16.9 ms ( ±0.0 ms )
[    BENCH ]        Full calculations       8x8  16.9 ms ( ±0.0 ms )
[    BENCH ]      Bypass calculations     16x16  14.1 ms ( ±0.0 ms )
[    BENCH ]        Full calculations     16x16  14.1 ms ( ±0.0 ms )
[       OK ] NEON/VP9QuantizeTest.DISABLED_Speed/2 (1803 ms)
[ RUN      ] NEON/VP9QuantizeTest.DISABLED_Speed/3
[    BENCH ]      Bypass calculations     32x32  18.6 ms ( ±0.0 ms )
[    BENCH ]        Full calculations     32x32  18.6 ms ( ±0.0 ms )

Change-Id: Ic95812b3fdbd4e47b4dcb8ed46c68a9617de38d2
2018-11-12 11:47:29 -08:00
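The commit above (43a30d3a1a) relies on the AArch64 across-vector maximum. As a rough illustration of what that buys, the sketch below reduces eight 16-bit lanes to one value with `vmaxvq_s16` when built for AArch64 and falls back to a scalar loop elsewhere. The function name `max_of_8_s16` is hypothetical, and this is not the quantize code from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Returns the largest of eight int16_t values starting at v. */
static int16_t max_of_8_s16(const int16_t *v) {
  assert(v != NULL);
#if defined(__aarch64__)
  /* One across-vector instruction replaces a chain of pairwise maxes. */
  return vmaxvq_s16(vld1q_s16(v));
#else
  /* Portable fallback so the sketch builds on other targets. */
  int16_t m = v[0];
  for (int i = 1; i < 8; ++i) {
    if (v[i] > m) m = v[i];
  }
  return m;
#endif
}
```

The reduction is a fixed cost per block, which fits the benchmark pattern in the message: the gain is visible at 4x4 and fades as the block size grows.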
Yaowu Xu 4a8c248744 Merge "Refactor common code in RDMULT computation" 2018-11-11 14:17:35 +00:00
Jerome Jiang 2ac954dfd2 vp8: Init buffers and pred arrays for mt after allocation.
Buffers and arrays used for prediction are not initialized after
allocation.

BUG=902691

Change-Id: Ic727e5dab7456e91ec9d6c80694f60a1a3600640
2018-11-09 22:45:24 -08:00
Yaowu Xu 42c81fcafc Refactor common code in RDMULT computation
Change-Id: I2b59ba26fdb1f75302c457c90817398acaa28975
2018-11-09 14:45:48 -08:00
Jerome Jiang c66fe1a893 Merge "Add operator<< to hadamard test." 2018-11-08 18:37:59 +00:00
Jerome Jiang 1ddadc95b0 Add operator<< to hadamard test.
This quiets valgrind warning.

Change-Id: I7c5e23ebb91cc67cf93678135b826b2bc8e9db2f
2018-11-07 21:35:35 -08:00
Yaowu Xu 82ab8e7466 Merge "Simplify rdmult computation" 2018-11-07 21:55:10 +00:00
Marco Paniconi 2c241e1192 Merge "vp9-screen-content: Adjust condition for large search area" 2018-11-07 21:14:49 +00:00
Yaowu Xu 5c96d3b82b Simplify rdmult computation
Recognizing that max dc_quant used in rdmult computation is 21387 and
21387 * 21387 * 88 / 24 is still within the range of int32_t, this
commit simplifies the computation with minor cleanups.

Change-Id: I2ac7e8315d103c0bb39b70c312c87c0fda47b4f9
2018-11-07 11:33:07 -08:00
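The range claim in the commit above (5c96d3b82b) can be checked directly. The sketch below evaluates 21387 * 21387 * 88 / 24 in 64-bit arithmetic and compares the result with INT32_MAX. The function names are hypothetical, and the code only verifies the arithmetic; it does not reproduce the rdmult computation itself.

```c
#include <assert.h>
#include <stdint.h>

/* Largest dc_quant value quoted in the commit message. */
#define MAX_DC_QUANT 21387

/* Evaluates q * q * 88 / 24 in 64-bit arithmetic so that the check
 * itself cannot overflow. */
static int64_t rdmult_upper_bound(int64_t q) {
  return q * q * 88 / 24;
}

/* Returns 1 when the bound for MAX_DC_QUANT fits in int32_t. */
static int rdmult_bound_fits_int32(void) {
  const int64_t bound = rdmult_upper_bound(MAX_DC_QUANT);
  assert(bound > 0);
  return bound <= INT32_MAX;
}
```

The bound works out to 1,677,147,153, below INT32_MAX (2,147,483,647). Note that the intermediate product q * q * 88 is about 4.0e10 and does not fit in 32 bits, so the result fitting says nothing by itself about how a 32-bit implementation must order the operations.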
Marco Paniconi 80835de6ca vp9-screen-content: Adjust condition for large search area
Account for dropped frame, and change resolution threshold
for limiting split below 16x16.

Change-Id: If94cfb2bc24d9103332d1c8d945daca8899db33d
2018-11-07 11:27:23 -08:00
Jingning Han 0623526872 Cosmetic clean up in find_arf_order
Remove duplicate variable definition.

Change-Id: I476bb319078f1043116163ac7aeff28a4a3ab5e6
2018-11-07 11:26:23 -08:00
Jingning Han ec12c265e9 Merge "Unify GOP structure layout setup" 2018-11-07 18:21:38 +00:00
Paul Wilkins 9a95f27563 Merge "Modified key frame detection." 2018-11-07 17:03:44 +00:00
Jingning Han 0d7a05ae0b Unify GOP structure layout setup
Refactor define_gf_group_structure() to unify the single-layer,
multi-layer ARF, and GF only GOP structure setup.

Change-Id: Iebbe9c3742fc58ae4e77b1072ebecb3ee7bd26b2
2018-11-07 08:45:06 -08:00
Johann Koenig e42909a5aa Merge "vp8: remove VP8_ENTROPY_STATS code" 2018-11-07 15:41:22 +00:00
Jerome Jiang 59540801e6 Merge "vp9: postencode drop frame for screen content in CBR." 2018-11-07 00:33:58 +00:00
Marco Paniconi 856e8f58e1 Merge "vp9 screen-content: Adjustments for screen content." 2018-11-06 23:45:30 +00:00
Jerome Jiang a03b04a55f vp9: postencode drop frame for screen content in CBR.
Encode the next frame at max q.

For layers: post_encode_drop is checked only on the base
spatial layer, and if the base is post-encode-dropped,
then the whole superframe is dropped.

Added API to guard postencode dropping. Turned off by default.

Added unittest.

BUG=b/112990050
Change-Id: I42fee279014aca616f7a4d9b582cb2bf5da2f2e7
2018-11-06 15:39:07 -08:00
Marco Paniconi e4f45b120a Merge "vp8: Increase rate correction threshold for drop-overshoot" 2018-11-06 22:43:16 +00:00
Sai Deng ea54bcc7f3 Merge "Refactor Hadamard tests and add highbd tests" 2018-11-06 19:03:17 +00:00
Marco Paniconi 7db0178457 vp9 screen-content: Adjustments for screen content.
Increase search area, use NSTEP, and in some cases avoid
bsize below 16x16. This applies to the base spatial layer when many blocks
in the frame have motion (from scene detection analysis).

Improves quality for scrolling motion.

Change-Id: If77b43e738a6c43610d4727a95712667088db564
2018-11-06 11:01:52 -08:00
Jerome Jiang f6176a73a2 Merge "vp8 dec: only compute ref frame buffer pointer for non intra" 2018-11-06 18:47:16 +00:00
sdeng 8f0f274ec0 Refactor Hadamard tests and add highbd tests
Change-Id: I306083f233e53884ac21fb4621066713edddc8f7
2018-11-06 17:39:01 +00:00
Jingning Han ee8920732d Merge "Track maximum layer depth in a GOP" 2018-11-06 17:10:24 +00:00
Jingning Han df159d67b2 Merge "Fix gf_group->frame_end assignment" 2018-11-06 17:10:18 +00:00
Jingning Han 9f1b4bb10d Merge "Refactor define_gf_group_structure()" 2018-11-06 17:10:10 +00:00
Jingning Han 5f906e92e1 Merge "Remove redundant assignments in define_gf_group_structure()" 2018-11-06 17:10:03 +00:00
Jingning Han 127e805064 Merge "Refactor find_arf_order()" 2018-11-06 17:09:54 +00:00
Paul Wilkins a76dcd98ec Modified key frame detection.
Address poor key frame detection in some content.

This patch improves on poor key frame / scene cut detection observed
with some test content. The content in question was letter boxed film
style material and also had quite low contrast. For both 1080P and 4K
multiple genuine scene cuts were being missed.

The changes alter the conditions for marking a transition as a "flash" rather
than a scene change. The new code still deals well with genuine flashes as
observed in the "crew" test clip, without falsely flagging some of
the scene cuts in the "film" test clip.

The new film test clip also had some "flash" frames caused by a lightning
effect and in one case a flash occurred right before a scene change. This
caused a misplacement of the key frame but has been addressed by a new
clause that requires the coded error for the next frame after a candidate
key frame to be lower than the current frame.

The patch also changes the way in which neutral blocks (similar intra and
inter error) are handled in the candidate key frame decision in a way which
hopefully handles the letter boxed format better.

During wider testing some film clips still had missed key frames but this
patch does improve things. In the case of the initial test clip the encoder
correctly marks all 3 scene cuts vs 0 before the patch.

Testing with our standard (mainly short single kf) derf and NF test clips
is neutral.

Change-Id: I3b7dcfe7b2fb13fd0816ea46acc3e69c8bc581b3
2018-11-06 15:10:10 +00:00
Johann Koenig 7808cc796e Merge "vpx_codec_enc_config_default: disable 'usage'" 2018-11-06 02:34:04 +00:00
Jerome Jiang e674cfcb05 vp8 dec: only compute ref frame buffer pointer for non intra
When ref frame is INTRA_FRAME, pre buffer shouldn't be used.

This CL copies behavior in single thread. That should apply to
multithreading case too.

BUG=webm:1496

Change-Id: Ibe9ab8ea9dc664151fa7ebac529d5fd1a481b4a3
2018-11-05 17:46:03 -08:00
Jingning Han 9718a01e42 Track maximum layer depth in a GOP
Track the effective maximum layer depth in a given group of
pictures. Keep it in the GF_GROUP data structure.

Change-Id: If777c4e0f4a871c7226a91e3871f445e92f18b24
2018-11-05 16:22:44 -08:00
Jingning Han 47b6d3c418 Fix gf_group->frame_end assignment
The previous value was set off by 1. Use the correct value.

Change-Id: I1ce53cc99063ce31e7ab1c43c6e444cb9a1972db
2018-11-05 15:42:20 -08:00
Jingning Han b5cd400e76 Refactor define_gf_group_structure()
Make it a standalone operation unit. Refactor to cut off unnecessary
dependency between define_gf_group_structure() and
allocate_gf_group_bits().

Change-Id: I954fd4e96152471a994f2ffd38a72061ab517ddd
2018-11-05 15:38:51 -08:00
Johann Koenig 4a6d4ec33e Merge changes I774a0711,I0b4fd670,Ia09935e5
* changes:
  Fix compilation on OS/2
  Use wcslen() instead of std::wcslen()
  Fix compilation on OS/2
2018-11-05 23:32:11 +00:00
Johann Koenig 921863210e Merge "clang-tidy: fix vpx_dsp parameters" 2018-11-05 23:29:06 +00:00
Johann e57f388bcf vpx_codec_enc_config_default: disable 'usage'
Found with clang-tidy. This value is unused in libvpx.

There is an existing test which ensures this is not used:
test/encode_api_test.cc:
    EXPECT_EQ(VPX_CODEC_INVALID_PARAM,
              vpx_codec_enc_config_default(kCodecs[i], &cfg, 1));

Change-Id: I94bd0663c6652b4267204c02c3921972c854d0b0
2018-11-05 15:14:46 -08:00
Jingning Han dfe4a7c88e Remove redundant assignments in define_gf_group_structure()
The functionality has been covered in the above
set_gf_overlay_frame_type() call.

Change-Id: Id4049cd9a1a5a9bad7ea62c412fcb557afa9a572
2018-11-05 11:16:43 -08:00
Johann b08582a02e clang-tidy: fix vpx_img_wrap align
This function specifically only aligns the stride and not the base buffer
like vpx_img_alloc does.

BUG=webm:1444

Change-Id: I3092827eeec3c9e16306a3973534d3a362a337e8
2018-11-05 19:15:04 +00:00
Jingning Han 6596fb201d Refactor find_arf_order()
Make the maximum layer depth allowed a control parameter in
GF_GROUP. No coding stats would change.

Change-Id: I9d17167da322831e7013d761980e1c16375a161b
2018-11-05 11:12:24 -08:00
Marco Paniconi e161a36a6f vp8: Increase rate correction threshold for drop-overshoot
For 1 pass cbr encoding mode, with frame-dropping on:
increase the rate correction threshold for drop-overshoot detection,
to better capture cases of large overshoot.

Change-Id: I1153b1b71cf106749dd985074d6bc8f37d163c7e
2018-11-05 09:37:53 -08:00
Johann Koenig a430020f73 Merge "vpx postproc: rewrite in intrinsics" 2018-11-02 18:13:14 +00:00
Sai Deng 06cea479d8 Merge "Add highbd Hadamard transform C implementations" 2018-11-02 16:46:46 +00:00
Johann Koenig 86b1179b3f Merge "fix snprintf error on windows" 2018-11-02 16:23:23 +00:00
Johann Koenig 2aff80d54a Merge "clang-tidy: normalize variance functions" 2018-11-02 14:41:24 +00:00
Johann 76d7b379a9 fix snprintf error on windows
Include vpx_ports/msvc.h to handle snprintf on older
versions of Visual Studio

Change-Id: I06cd99b32bbae82b3df079d41ff20a9a07f6fe1c
2018-11-02 07:34:12 -07:00
Johann 0239b200d3 vp8: remove VP8_ENTROPY_STATS code
Does not compile. Noticed while cleaning up un-namespaced functions

Change-Id: I4a9048e66d051397f652e7b5412606a5e234f61f
2018-11-01 15:06:23 -07:00
sdeng 50c89f84b1 Add highbd Hadamard transform C implementations
Change-Id: Ibec078c80ca1dfe6fbbc4288db89d719dac453a7
2018-11-01 19:33:05 +00:00
Johann 96082749aa clang-tidy: fix vpx_dsp parameters
BUG=webm:1444

Change-Id: Iee19be068afc6c81396c79218a89c469d2e66207
2018-11-01 12:14:14 -07:00
Johann Koenig 811759d868 Merge "vp8 boolcoder: normalize to "bc"" 2018-10-31 22:39:08 +00:00
Jerome Jiang 1658b2f47a Merge "vp8: fix to address overflow in decoder." 2018-10-31 22:27:44 +00:00
Johann Koenig d9381a1c6a Merge "vp8dx_get_quantizer: normalize VP8D_COMP" 2018-10-31 22:19:43 +00:00
Johann 4635b0fced clang-tidy: normalize variance functions
Always use src/ref and _ptr/_stride suffixes.

Normalize to [xy]_offset and second_pred.

Drop some stray source/recon_strides.

BUG=webm:1444

Change-Id: I32362a50988eb84464ab78686348610ea40e5c80
2018-10-31 15:05:37 -07:00
Johann Koenig 331d289c5c Merge "clang-tidy: fix vp9/encoder parameters" 2018-10-31 21:43:05 +00:00
Johann Koenig 5036f0fed8 Merge "clang-tidy: fix vp9/decoder parameters" 2018-10-31 21:42:16 +00:00
Johann Koenig 7feabc11ec Merge "clang-tidy: fix vp9/common parameters" 2018-10-31 21:42:07 +00:00
Johann Koenig 8be18fc8fa Merge "clang-tidy: fix vp8/encoder parameters" 2018-10-31 21:41:53 +00:00
Johann 867f25c830 vp8 boolcoder: normalize to "bc"
"bc" maps to BOOL_CODER better than "br"

Change-Id: Idefd03e79ccc1851a1b26f8206a159b0e5c5fb2d
2018-10-31 14:24:31 -07:00
Johann Koenig c307feb403 Merge "clang-tidy: fix vp8/decoder parameters" 2018-10-31 21:14:46 +00:00
Johann c85967fe71 vp8dx_get_quantizer: normalize VP8D_COMP
Use "pbi" like the rest of the functions

Change-Id: I5f3036b8f8361c30353be378d83455b83b82ac9f
2018-10-31 14:13:55 -07:00
Chi Yo Tsai e77f6620a5 Merge "Add SSE2 support for hbd 4-tap interpolation filter." 2018-10-31 20:38:52 +00:00
Jerome Jiang f3a027a46d vp8: fix to address overflow in decoder.
Can't call internal error from the decoder thread.

Add vpx_internal_error_info to MACROBLOCKD. When a corrupted frame is
detected, the decoder thread returns to its own context and signals
completion of decoding for the current frame.

The main decoding thread will detect error too and return error code to
decoding API call.

Each thread will signal end of decoding of the frame. Main thread waits
for the signal of all other threads to start decoding next frame.

BUG=875626,webm:1496
Change-Id: Icd05fbc558893a4e7d8532c1e7177e7550283a64
2018-10-31 11:42:28 -07:00
Johann Koenig 83f8fee04b Merge "clang-tidy: fix vp8/common parameters" 2018-10-30 22:11:59 +00:00
Johann 868484bc66 clang-tidy: fix vp9/encoder parameters
BUG=webm:1444

Change-Id: I6823635eb1a99c3fcca0a8f091878e3ab2fdd2ac
2018-10-30 12:46:39 -07:00
Johann cf23ace9b2 clang-tidy: fix vp9/decoder parameters
BUG=webm:1444

Change-Id: I9c7c0a4161aaf52436bd5c01d30b035b2ff5508c
2018-10-30 12:17:22 -07:00
chiyotsai 5a51d961f2 Add SSE2 support for hbd 4-tap interpolation filter.
Unit test performance on bitdepth 10:
    | 4X4 | 8X8 |16X16|64X64|
 2D |1.582|1.461|1.425|1.572|
HORZ|1.643|1.247|1.346|1.345|
VERT|1.378|1.695|2.020|1.763|

Unit test performance on bitdepth 12:

    | 4X4 | 8X8 |16X16|64X64|
 2D |1.578|1.409|1.426|1.497|
HORZ|1.625|1.153|1.323|1.259|
VERT|1.392|1.707|2.030|1.787|

Change-Id: I6df85330ac33fcb17d46e4302b41415dda1219f5
2018-10-30 12:12:28 -07:00
Johann dc89abccdb clang-tidy: fix vp9/common parameters
BUG=webm:1444

Change-Id: I1a14ad119b3bcbaddcf2291a7521513cf6425635
2018-10-30 12:04:40 -07:00
Johann 9a978eb1d9 clang-tidy: fix vp8/encoder parameters
BUG=webm:1444

Change-Id: I57a305cdab0d62b0745116272fbd5d9257c6e679
2018-10-30 12:04:25 -07:00
Johann e018967dc4 clang-tidy: fix vp8/decoder parameters
BUG=webm:1444

Change-Id: I3dfc56f7f6430d36a1c447d8999733015a001101
2018-10-30 10:55:21 -07:00
Johann aa660e0ea3 clang-tidy: fix vp8/common parameters
Match function definitions to declarations

BUG=webm:1444

Change-Id: Ib96d3b735eaf81cece5406c89cc5156bc2cde462
2018-10-30 10:36:45 -07:00
Chi Yo Tsai 8886fe7e31 Merge "Add AVX2 support for hbd 4-tap interpolation filter." 2018-10-30 16:50:00 +00:00
Jingning Han e20be1b234 Merge "Properly space qp in q mode for multi-layer ARF" 2018-10-30 04:44:09 +00:00
Johann c176e64904 vpx postproc: rewrite in intrinsics
~10% faster on 64-bit but ~10% slower on 32-bit.

Removes the assembly usage of vpx_rv.

Change-Id: I214698fb5677f615dee0a8f5f5bb8f64daf2565e
2018-10-29 18:53:32 -07:00
Jingning Han 50a9074440 Properly space qp in q mode for multi-layer ARF
Space the quantization parameter distribution according to the
layer depth for multi-layer ARF coding structure. This allows
lower layers to have relatively smaller quantization parameters
than higher layers. It improves the compression performance
in constant q mode for multi-layer ARF system:

        avg PSNR      overall PSNR      SSIM
lowres  -0.33%         -0.31%          -1.44%
midres  -0.29%         -0.38%          -1.14%
hdres   -0.27%         -0.49%          -1.02%

Change-Id: I9cfe2f27e6c0029c30614970a46de3045840264e
2018-10-29 16:50:32 -07:00
Johann Koenig fa0076282e Merge "vp8 bilinear: ensure non-16x16 arrays are aligned" 2018-10-29 22:26:42 +00:00
chiyotsai 505f2ed7fc Add AVX2 support for hbd 4-tap interpolation filter.
Speed gain:

BIT DEPTH | 8TAP FPS | 4TAP FPS | PCT INC |
    10    |   1.69   |   1.85   |  9.46%  |
    12    |   1.64   |   1.78   |  8.54%  |

Speed test is done on jet.y4m at speed 1, profile 2, over 100 frames with
br=500.

Change-Id: I411e122553e2c466be7a26e64b4dd144efb884a9
2018-10-29 22:18:17 +00:00
Johann Koenig 03ff6c837a vp8 bilinear: ensure non-16x16 arrays are aligned
The 16x16 array was changed to aligned. The 8xN and 4x4 functions
use aligned loads/stores on their internal arrays as well.

BUG=webm:1570

Change-Id: I9cfe53d7c8ed76e8854c2688eb9a509b876471d8
2018-10-29 19:01:50 +00:00
Johann Koenig 30ef91ff7d Merge "vp8 bilinear: ensure temp array is aligned" 2018-10-29 18:55:52 +00:00
Sai Deng 5f7d7554e0 Merge "Enable 10 bit tpl support" 2018-10-29 17:14:09 +00:00
Johann 4cba6ce198 vp8 bilinear: ensure temp array is aligned
Loads and stores to this array require 16 byte alignment.

BUG=webm:1570

Change-Id: I82c7d21c9539a108930fd030d79caaa0bcd1eeb3
2018-10-29 09:21:21 -07:00
Johann Koenig c0f71b4e9c Merge "remove "register" keyword" 2018-10-29 02:07:20 +00:00
Jingning Han d9de17ae02 Merge "Remove unused macros from vp9_firstpass.c" 2018-10-27 03:56:18 +00:00
sdeng c4978abc07 Enable 10 bit tpl support
            lowres_bd10   midres_bd10
avg_psnr      -0.897        -1.261
ovr_psnr      -0.975        -1.349

Change-Id: Id54f2c419f4edaa91e89ffea52b4038b1d94e563
2018-10-26 23:50:50 +00:00
Johann 17004c71bc remove "register" keyword
This has been deprecated for a long time. c++17 is trying to recover the name.

Change-Id: Iade6bebce03a50b76061695f9e634a107cd989cd
2018-10-26 14:55:26 -07:00
Harish Mahendrakar a4e70f1808 Merge "Add Memory to Enable Row Decode" 2018-10-26 18:31:41 +00:00
Jingning Han 1f75698044 Remove unused macros from vp9_firstpass.c
Change-Id: If5267a8c71113b171b7bddda5b49f0326c4266b8
2018-10-26 11:03:31 -07:00
Johann 5caec339be vp8 bilinear: rewrite 4x4
~20% faster than the MMX. Removes the last usage of
vp8_bilinear_filters_x86_[48].

Change-Id: Iee976fab9655d0020440f26c4403ce50103af913
2018-10-25 15:05:28 -07:00
Johann Koenig 13a946ec77 Merge "vp8 bilinear: rewrite 16x16" 2018-10-25 19:59:06 +00:00
Chi Yo Tsai f80d1b33c4 Merge "Add AVX2 support for 4-tap interpolation filter." 2018-10-25 18:25:09 +00:00
Johann ad0ed535a7 vp8 bilinear: rewrite 16x16
Marginally faster. Most importantly it drops a dependency on an
external symbol (vp8_bilinear_filters_x86_8).

Change-Id: Iff022e718720f1f0eeced6201a1ad69a9c9c4f45
2018-10-25 11:09:21 -07:00
Johann Koenig f2c039115d Merge "vp8 bilinear: rewrite in intrinsics" 2018-10-25 17:13:50 +00:00
Ritu Baldwa 351ec07a6d Add Memory to Enable Row Decode
Row based multi-thread needs extra memory to store the parsed
co-efficients, partitions and eob. This commit adds memory for the same.

Change-Id: I13fa4a6ada2ec3048bc973e465055b832429388f
2018-10-25 12:22:58 +05:30
Jingning Han d7e64ac749 Merge "Enable tpl model to support multi-layer ARF" 2018-10-25 00:01:52 +00:00
Jingning Han 248f816a34 Merge "Reset frame update flags after qp estimate in tpl" 2018-10-25 00:01:46 +00:00
Jingning Han d3085ae12b Merge "Bypass processing on use existing frame" 2018-10-25 00:01:41 +00:00
Jingning Han 47f558dd29 Merge "Fix frame offset computation for GOP extension" 2018-10-25 00:01:35 +00:00
Jingning Han 8075dea1e5 Merge "Refactor gop_length use case in tpl model" 2018-10-25 00:01:29 +00:00
Johann 6f35ac956b vp8 bilinear: rewrite in intrinsics
8x8 is 15% faster than the assembly. 8x4 is 200% faster than MMX.

Remove MMX version.

Change-Id: I55642ebd276db265911f2c79616177a3a9a7e04f
2018-10-24 16:01:39 -07:00
Chi Yo Tsai 979f0c0e5a Merge "Clean up vpx_dsp/x86/convolve_sse2.h" 2018-10-24 16:36:20 +00:00
Jingning Han 47922cc140 Enable tpl model to support multi-layer ARF
Enable temporal dependency model for the base layer ARF. It
improves the multi-layer ARF compression performance (results
are tested in speed 0 vbr mode):

         avg PSNR    overall PSNR     SSIM
lowres   -0.40%       -0.46%         -0.32%
midres   -0.59%       -0.68%         -0.45%
720p     -0.55%       -0.59%         -1.07%

Change-Id: I7790b89ccfb6e61f9b7965f34d348c7440220dd0
2018-10-23 20:30:35 -07:00
chiyotsai 6657ab8571 Add AVX2 support for 4-tap interpolation filter.
Performance:
     | 4X4 | 8X8 |16X16|64X64|
2 DIM|1.491|1.902|1.772|1.479|
 HORZ|1.145|1.521|1.757|1.497|
 VERT|1.176|1.614|1.707|1.467|

Each number in the chart above is 8-tap function time / 4-tap function time.

The framerate tested on jets.y4m for 100 frames on speed 1 increased from 3.72
fps to 3.91 fps (about 5% increase).

Change-Id: Ic0ad275cf32fafeefd0a89811badd8adff2134a0
2018-10-24 01:02:06 +00:00
chiyotsai 73930f9763 Clean up vpx_dsp/x86/convolve_sse2.h
Removes unnecessary includes and rewords some functions/comments.

Change-Id: Ied557d7faa9d845d38255e6e3e0e3fe1395276e1
2018-10-23 16:28:29 -07:00
Jingning Han fe471693ac Reset frame udpate flags after qp estimate in tpl
After the frame quantizer estimate run in tpl model, reset the
actual value assigned to the current coding frame. This would
avoid certain frame update flags being overwritten by different
frame types' update.

Change-Id: Idde2ba1108f1f68747b14149b211f882965c99f0
2018-10-23 16:24:28 -07:00
Yunqing Wang e9ba435435 Merge "Use 8-tap interp filter in temporal filtering" 2018-10-23 22:29:46 +00:00
Yunqing Wang bb071af390 Use 8-tap interp filter in temporal filtering
Used 8-tap interp filter in temporal filtering to achieve more accurate
motion search results. Using 8-tap sharp gave slightly better results than
using 8-tap regular.

Speed 0 borg test showed that
        avg_psnr:  ovr_psnr:    ssim:
hdres:  -0.160      -0.157     -0.173
midres: -0.083      -0.061     -0.183
lowres: -0.077      -0.099     -0.204

Speed test didn't see noticeable encoder time changes.

Change-Id: I97dc3c4864b5a5675a6c1e3952799b81eedd7d93
2018-10-23 12:34:56 -07:00
Jingning Han 0257168c77 Merge "Remove empty else branch in mode_estimation" 2018-10-23 19:16:41 +00:00
Jingning Han 93edc3db13 Bypass processing on use existing frame
The use of show existing frame requires no further operation on
that coding frame. Bypass the corresponding process.

Change-Id: Ia092027a8a543be0ca54c00b4d51e453039712b8
2018-10-23 10:34:58 -07:00
Jingning Han 9f02ba3684 Fix frame offset computation for GOP extension
Properly compute the extended GOP frames' buffer offsets.

Change-Id: I9aed14f4b8d623f1832e782828dce07aa546507d
2018-10-23 10:34:58 -07:00
Jingning Han b77291abc9 Refactor gop_length use case in tpl model
Make it support both single- and multi-layer ARF GOP structure.

Change-Id: I760a95804d1b583b057120f6d6be65195a0e6c19
2018-10-23 10:34:58 -07:00
Jingning Han 77e109340d Remove empty else branch in mode_estimation
Change-Id: Iefa184aae80b920b054e3e922a77244c2b0d4b61
2018-10-23 10:34:17 -07:00
Jingning Han 08655e8cd1 Merge "Use the proper gfu_boost factor to compute rd_mult" 2018-10-23 02:28:22 +00:00
Jingning Han 79ef532489 Use the proper gfu_boost factor to compute rd_mult
Update the Lagrangian multiplier according to the gfu_boost factor
assigned per frame. It improves the multi-layer ARF compression
performance (results below shown for speed 0):

         avg PSNR      overall PSNR      SSIM
lowres    -0.08%          0.02%         -0.28%
midres    -0.08%          0.03%         -0.22%
hdres     -0.19%         -0.10%         -0.39%
nflx2k    -0.29%         -0.18%         -0.85%

Change-Id: Ifeb4b14918f880ba011ea41c1454ab00504f8855
2018-10-22 15:06:28 -07:00
Hui Su 137d99c91f Merge "ML_VAR_PARTITION: enable at speed 5" 2018-10-19 16:48:40 +00:00
Hui Su 0bedb351ff ML_VAR_PARTITION: enable at speed 5
When the ML_VAR_PARTITION experiment is turned on, replace
REFERENCE_PARTITION with ML_BASED_PARTITION at speed 5.

Coding gains(avg_psnr) compared to baseline:
ytlivehr  1.63%
ytlivelr  0.07%

Tested encoding speed with several clips from ytlivehr and ytlivelr
on linux desktop(rt, vbr, 4 threads). Encoder speed is on average
faster than baseline:
360p:   14% faster
720p:    7% faster
1080p: 1.5% faster

Change-Id: I39b00078176ff516f7306818f33ba2b1ea53dfa1
2018-10-18 14:41:25 -07:00
chiyotsai af4cd92629 Changes 4-tap SSSE3 filter to 8-tap AVX2 filter.
AVX2's 8-tap filter is slightly faster than 4-tap SSSE3 filter.

Change-Id: I5fc37c431670780108706b206b32c791828555c9
2018-10-18 18:24:32 +00:00
Chi Yo Tsai 40a0590950 Merge "Add SSSE3 support for 4-tap interpolation filter" 2018-10-18 18:19:41 +00:00
Hui Su 9e29417962 Merge "Enable rect partition search for HBD at speed 1" 2018-10-18 16:46:15 +00:00
chiyotsai 01df00ec0f Add SSSE3 support for 4-tap interpolation filter
Performance:
     | 4X4 | 8X8 |16X16|64X64|
2 DIM|1.526|1.827|1.844|1.906|
 HORZ|1.336|1.795|1.886|1.654|
 VERT|1.443|1.539|2.139|2.190|

The ratio is SSSE3 8-tap time / SSSE3 4-tap time.

Change-Id: I01ed2ab494428256e918875774a459afecc5ec6a
2018-10-18 16:30:35 +00:00
Jingning Han b1f789cf18 Merge "Replace MAX_LAG_BUFFERS with MAX_ARF_GOP_SIZE for gop size" 2018-10-18 16:25:37 +00:00
Yunqing Wang 2b838d9081 Merge "Optimize vp9_highbd_temporal_filter_apply_c" 2018-10-17 23:11:46 +00:00
Jingning Han 7dfb8e18f6 Replace MAX_LAG_BUFFERS with MAX_ARF_GOP_SIZE for gop size
MAX_ARF_GOP_SIZE accurately reflects the maximum number of frames
operated on per group of pictures. Use it to replace MAX_LAG_BUFFERS in
such use cases.

Change-Id: Id26f9b1b2b0c38f255dee19795356c387d06d033
2018-10-17 16:11:37 -07:00
Angie Chiang 22d67c346f Merge changes I6d5c77af,I6bf504b4,Ie5dc5ea7,Ie6024b1a,If45fba8a, ...
* changes:
  Add do_motion_search
  Preserve code of doing mv search in raster order
  Variant implementation of changing mv search order
  Add feature_score_loc_sort
  Init mv_[dist/cost]_sum in init_tpl_stats
  Change mv search order according to feature_score
2018-10-17 23:10:50 +00:00
Angie Chiang 6127de7a7d Add do_motion_search
This will make the code cleaner.

Change-Id: I6d5c77af7261c39656b35ec40ac1451bbdbfb7a7
2018-10-17 14:46:28 -07:00
Chi Yo Tsai 88030f7d66 Merge "Adds SSE2 support for interpolation filter for width 4 and 8" 2018-10-17 21:35:14 +00:00
Angie Chiang 043d9cfe6c Preserve code of doing mv search in raster order
With this change, there will be three versions of the mv search scheme
in the codebase simultaneously.
We will run further experiments to evaluate which version is better
in terms of visual quality and coding performance.

Change-Id: I6bf504b4551316ef10b8a341ab3ba14d0ec977ce
2018-10-17 13:56:42 -07:00
Hui Su 8d5461eb0a Enable rect partition search for HBD at speed 1
This patch enables rectangular partition search on speed 1 for high
bit depth encoding. The encoding speed loss is reduced thanks to
recently added speed features.

This only affects speed 1 high bit-depth encoding.

Coding gains:
                      avg_psnr     ovr_psnr
lowres_bd10(480p)      1.34%        1.40%
midres_bd10(720p)      1.28%        1.33%

Average speed loss:
        QP=30    QP=40    QP=50    average
480p     2.5%     2.3%     2.6%     2.5%
720p     4.0%     3.9%     3.2%     3.7%

Change-Id: Id9cac4eea0769d94e093c9d170194659b3342d89
2018-10-17 13:33:55 -07:00
chiyotsai 71b4e0bded Adds SSE2 support for interpolation filter for width 4 and 8
Performance:
The chart below shows the speed relative to baseline
(baseline_time/new_time)
_____| 4X4 | 8X8 |16X16|64X64|
2 DIM|1.889|1.780|1.811|1.963|
 HORZ|2.266|1.834|1.617|1.595|
 VERT|2.043|2.190|2.373|2.485|

Change-Id: Ic4262222db78f013b94a8c61b46efb8520722927
2018-10-17 13:29:13 -07:00
Urvang Joshi e4eb48f1bc Merge "For keyframe-only coding do not boost in q mode" 2018-10-17 20:25:02 +00:00
Urvang Joshi 8bb92e382d For keyframe-only coding do not boost in q mode
If we are using keyframe only coding - either coding a
single frame, or a sequence of keyframes - in the end-usage=q
mode, use the cq_level directly as the quality of each
coded frame, rather than boost them.

Ported from AV1: 563a0d1eb92bdc1e987df071a568d8406c4ffa92

Change-Id: I6dc929b8b4f0aa18e279139077f3a87958c92245
2018-10-17 11:48:10 -07:00
chiyotsai 62830c53a6 Refactor SSE2 Code for 4-tap interpolation filter on width 16.
Some repeated code is refactored into inline functions. No performance
degradation is observed. These inline functions can be used for width 8
and width 4.

Change-Id: Ibf08cc9ebd2dd47bd2a6c2bcc1616f9d4c252d4d
2018-10-17 11:33:43 -07:00
Yunqing Wang 4f3398a26a Optimize vp9_highbd_temporal_filter_apply_c
Following the previous patch:
(https://chromium-review.googlesource.com/c/webm/libvpx/+/1277913),
this patch modified the highbd version of applying temporal filter
in a similar way.

Change-Id: I2bb6f1fff6e32bca86f7139a497181d34aa9f3ec
2018-10-17 11:25:12 -07:00
chiyotsai 272f46212e Add SSE2 support for 4-tap interpolation filter for width 16.
Horizontal filter on 64x64 block: 1.59 times as fast as baseline.
Vertical filter on 64x64 block: 2.5 times as fast as baseline.
2D filter on 64x64 block: 1.96 times as fast as baseline.

Change-Id: I12e46679f3108616d5b3475319dd38b514c6cb3c
2018-10-17 09:58:30 -07:00
Angie Chiang d3c78fd71f Variant implementation of changing mv search order
We start mv search from the block with the highest feature score, then
move on to the block's neighbors, searching them in order of their
feature scores.

We use a max heap to achieve this.

This feature is under the flag USE_PQSORT.

Change-Id: Ie5dc5ea715b0f9a7a594e5080a7cb4f5309f5597
2018-10-17 09:03:33 -07:00
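The max heap mentioned in this commit can be illustrated with a minimal sketch. This is not the libvpx code: HeapItem, MaxHeap, heap_push, heap_pop, and the fixed capacity are hypothetical names and choices made for illustration. The heap is keyed on a feature score, so popping always yields the highest-scoring candidate pushed so far.

```c
#include <assert.h>

#define HEAP_CAP 64 /* fixed capacity; an assumption for this sketch */

typedef struct {
  int score; /* feature score used as the heap key */
  int index; /* hypothetical block index */
} HeapItem;

typedef struct {
  HeapItem items[HEAP_CAP];
  int size;
} MaxHeap;

static void heap_swap(HeapItem *a, HeapItem *b) {
  HeapItem t = *a;
  *a = *b;
  *b = t;
}

/* Insert an item and sift it up. Returns 0 on success, -1 if full. */
static int heap_push(MaxHeap *h, HeapItem it) {
  int i;
  if (h->size >= HEAP_CAP) return -1;
  i = h->size++;
  h->items[i] = it;
  while (i > 0) {
    const int parent = (i - 1) / 2;
    if (h->items[parent].score >= h->items[i].score) break;
    heap_swap(&h->items[parent], &h->items[i]);
    i = parent;
  }
  return 0;
}

/* Remove the highest-score item. Returns 0 on success, -1 if empty. */
static int heap_pop(MaxHeap *h, HeapItem *out) {
  int i = 0;
  if (h->size == 0) return -1;
  *out = h->items[0];
  h->items[0] = h->items[--h->size];
  for (;;) {
    const int l = 2 * i + 1, r = 2 * i + 2;
    int largest = i;
    if (l < h->size && h->items[l].score > h->items[largest].score) largest = l;
    if (r < h->size && h->items[r].score > h->items[largest].score) largest = r;
    if (largest == i) break;
    heap_swap(&h->items[i], &h->items[largest]);
    i = largest;
  }
  return 0;
}
```

In a search loop of the kind described, the neighbors of each processed block would be pushed with their scores and the next block to search would be taken with heap_pop().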
Angie Chiang 534e61d5a3 Add feature_score_loc_sort
This CL facilitates the upcoming change,
a variant implementation of changing the mv search order according to
feature score.

Change-Id: Ie6024b1a5ec02343aea6aa81fc14f94e2e515d06
2018-10-17 09:03:33 -07:00
Angie Chiang 45542ce51b Init mv_[dist/cost]_sum in init_tpl_stats
Change-Id: If45fba8a74186803eec09da7dbaf2e1fe4e9e156
2018-10-17 09:03:33 -07:00
Angie Chiang 2d4eeae1c8 Change mv search order according to feature_score
Sort the feature_score in descending order.
Do mv search from the block with higher score to the block with
lower score

Change-Id: I47a87cd66ea3e40d8c8fc55a7517ab8aa10fdb94
2018-10-17 09:03:33 -07:00
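Sorting blocks by feature score in descending order, as this commit describes, can be sketched with the standard qsort(). The BlockScore struct and the function names below are hypothetical and are not taken from libvpx; only the ordering rule comes from the commit message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record pairing a block position with its feature score. */
typedef struct {
  int mi_row;
  int mi_col;
  int feature_score;
} BlockScore;

/* qsort comparator: higher feature_score sorts first. */
static int compare_score_desc(const void *a, const void *b) {
  const BlockScore *x = (const BlockScore *)a;
  const BlockScore *y = (const BlockScore *)b;
  if (x->feature_score > y->feature_score) return -1;
  if (x->feature_score < y->feature_score) return 1;
  return 0;
}

/* Sort in place so that iteration visits high-score blocks first. */
static void sort_blocks_by_score(BlockScore *blocks, size_t n) {
  qsort(blocks, n, sizeof(*blocks), compare_score_desc);
}
```

After sorting, a plain loop over the array visits blocks from the highest score to the lowest. Note that qsort() is not guaranteed to be stable, so blocks with equal scores may appear in any relative order.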
Wan-Teh Chang 86db847ab8 Merge "Reduce the cpi->scaled_ref_idx array size by 1." 2018-10-17 14:43:22 +00:00
Jingning Han 0c5fd59efd Merge "Refactor tpl dependency model to support multi-layer ARF updates" 2018-10-16 21:24:17 +00:00
Jingning Han 4f5444949f Merge "Refactor GOP reference frame ordering for tpl model" 2018-10-16 21:23:52 +00:00
Jingning Han da6682aac9 Merge "Record gop size" 2018-10-16 21:07:55 +00:00
Hui Su 21bb9e3a09 Merge "Fix a bug in ml_prune_rect_partition()" 2018-10-16 20:58:45 +00:00
Yunqing Wang e56b1db67a Merge "Fix the filter tap calculation in mips optimizations" 2018-10-16 17:55:37 +00:00
Hui Su 2bc3d47dd1 Fix a bug in ml_prune_rect_partition()
The quantization step size should be scaled properly for high bit depth
settings.

This only affects speed 0.
Encoder speed change is almost neutral.
There is a small coding gain of 0.09%.

Change-Id: I96b2bae03a53ce8ccd6428e3a050cfe18e06a024
2018-10-16 10:52:08 -07:00
Jingning Han d8825f16b2 Refactor tpl dependency model to support multi-layer ARF updates
Refactor to form a systematic reference frame update system for
the temporal dependency model. This prepares to support the multi-
layer ARF system.

Change-Id: Idb90fbe3966695b487c1a0a52f4626b0b6807434
2018-10-16 10:29:07 -07:00
Hui Su ff99ab36de Merge "Enable ML based partition search breakout for HBD" 2018-10-16 17:14:27 +00:00
Yunqing Wang bcd17e32c9 Fix the filter tap calculation in mips optimizations
The interp filter tap calculation was not accurate enough to tell the
difference between 2 taps and 4 taps. This patch fixed the bug, and
resolved Jenkins test failures in mips sub-pel filter optimizations.

BUG=webm:1568

Change-Id: I51eb8adb7ed194ef2ea7dd4aa57aa9870ee38cfc
2018-10-16 09:35:23 -07:00
KO Myung-Hun 191d3ada08 Fix compilation on OS/2
_beginthread() is not declared in __STRICT_ANSI__ mode.

-----
    [CXX] test/quantize_test.cc.o
In file included from ./vp8/common/threading.h:194:0,
                 from ./vp8/encoder/onyx_int.h:24,
                 from test/quantize_test.cc:24:
./vpx_util/vpx_thread.h: In function 'int pthread_create(TID*, const void*, void* (*)(void*), void*)':
./vpx_util/vpx_thread.h:259:20: error: '_beginthread' was not declared in this scope
   tid = (pthread_t)_beginthread(thread_start, NULL, 1024 * 1024, targ);
                    ^~~~~~~~~~~~
./vpx_util/vpx_thread.h:259:20: note: suggested alternative: 'thread'
   tid = (pthread_t)_beginthread(thread_start, NULL, 1024 * 1024, targ);
                    ^~~~~~~~~~~~
                    thread
-----

Change-Id: I774a071162b3876a7f3253ce7c5749f1b0b45818
2018-10-16 17:00:15 +09:00
Jerome Jiang 97e7da4862 Merge "fix output file check in vpxenc tests script." 2018-10-16 06:09:36 +00:00
Jingning Han 69a4ed0dd4 Merge "Add frame_gop_index to GF_GROUP" 2018-10-16 03:49:52 +00:00
Jingning Han ca97594a8a Merge "Add encoder side frame buffer for tpl model" 2018-10-16 03:49:25 +00:00
Jerome Jiang cee0391b57 fix output file check in vpxenc tests script.
BUG=webm:1556

Change-Id: I4be40e9bf667cd9896017f38d866a47d3e19dcaf
2018-10-15 20:35:17 -07:00
James Zern c875803def Merge "CHANGELOG: fix v1.7.0 release date" 2018-10-16 03:04:37 +00:00
Angie Chiang 8caabd077a Merge "Add indep loop for motion_compensated_prediction" 2018-10-16 00:43:25 +00:00
Hui Su 540d373d59 Enable ML based partition search breakout for HBD
For speed 0:
coding loss 0.045%; encoder speedup 6%.

For speed 1(only affects videos smaller than 720p):
coding loss 0.11%; encoder speedup 6.5%.

Change-Id: Ie441c9bad2021503e86fefd2f1fa3e1a42070bec
2018-10-15 23:49:18 +00:00
Yunqing Wang be51a7731d A temporary fix to mips sub-pel filters
There are Jenkins test failures in mips sub-pel filter optimizations.
[ RUN      ] MSA/ConvolveTest.MatchesReferenceSubpixelFilter/5
../libvpx/test/convolve_test.cc:889: Failure
Expected equality of these values:
  lookup(ref, y * kOutputStride + x)
    Which is: 255
  lookup(out, y * kOutputStride + x)
    Which is: 11
mismatch at (1,0), filters (4,0,1)

This relates to the 4-tap kernel added recently. This CL is a temporary
fix, while we investigate the issue.

BUG=webm:1568

Change-Id: If64c552b794425687cca4fbed893d8ccb73c89a5
2018-10-15 16:48:02 -07:00
Jingning Han 4dae31877d Refactor GOP reference frame ordering for tpl model
Process the frames in the order of GOP structure definition.
Decouple the dependency on rc->baseline_gf_interval.

Change-Id: I0d42c542aca552975cc8f08b0eb8b22ccf6a9537
2018-10-15 15:30:06 -07:00
Jingning Han f1e3f340ef Record gop size
Keep the frame operations needed within a group of pictures.

Change-Id: Iece2e855f21860c930b34a3c586f084f7c61db00
2018-10-15 15:21:23 -07:00
Jingning Han 4d3275f865 Add frame_gop_index to GF_GROUP
Add frame_gop_index to track the frame offset within a group of
pictures. This reworks the GOP frame offset calculation and its use
cases. The coding stats remain identical.

Change-Id: I94d0957bcc327f6bbeac6e84157635663c36b953
2018-10-15 14:58:08 -07:00
Jingning Han 227deb503e Add encoder side frame buffer for tpl model
Add an encoder side reference frame buffer pool to store the
reference frames for tpl model. This serves as an intermediate
step to support multi-layer ARF system. The buffer memory size will
be optimized afterwards.

Change-Id: If2d2f095d4911a4996f6c2a0b0a8e3d235ceadb2
2018-10-15 14:57:05 -07:00
James Zern 5ce1a99717 CHANGELOG: fix v1.7.0 release date
BUG=webm:1567

Change-Id: Ia6091445504c8c94334bc062c945238782553d44
2018-10-15 14:12:38 -07:00
Jingning Han f6a52bee7d Refactor tpl model setup to support multi-layer ARF setup
Generalize the tpl model framework to support the newly designed
GOP structure system. The existing tpl model assumes single layer
ARF.

This design will separate the tpl model operation for GOPs with
and without ARF. When a GOP has an ARF, the maximum lookahead
offset sets an upper limit on the frame buffer needed to build the
tpl model for the entire GOP. When a GOP does not have an ARF, we
would use the temporal model in a different way.

The first step will focus on GOPs with ARF. All tpl model related
operations will only be triggered by ARF frame generation.

Change-Id: I13ab03a7bc68f5a4f6b03f2cb01c10befe955e73
2018-10-15 09:50:59 -07:00
KO Myung-Hun 0080704038 Use wcslen() instead of std::wcslen()
OS/2 kLIBC has wcslen(), but it is not in std namespace.

Change-Id: I0b4fd6705e6ae938b2188abdc688eea3bba27430
2018-10-16 00:22:14 +09:00
KO Myung-Hun c49ca22a88 Fix compilation on OS/2
off_t requires sys/types.h on OS/2

-----
    [CC] test/../ivfenc.c.o
In file included from test/.././ivfenc.h:13:0,
                 from test/../ivfenc.c:11:
test/../././tools_common.h:36:9: error: unknown type name 'off_t'
 typedef off_t FileOffset;
         ^~~~~
make.exe[1]: *** [test/../ivfenc.c.o] Error 1
-----

Change-Id: Ia09935e5de8573e63185369fc139e3355664afd1
2018-10-14 22:50:34 +09:00
Hui Su b7e9a463f8 Merge "Turn on ml_var_partition_pruning for HBD" 2018-10-13 15:04:24 +00:00
Yunqing Wang 7bc91452b8 Merge "Optimize apply_temporal_filter function" 2018-10-13 03:12:43 +00:00
Jingning Han dfa8cc9281 Merge "Remove unused variable from VP9_COMP" 2018-10-13 02:10:26 +00:00
Yunqing Wang fc1679863b Optimize apply_temporal_filter function
This patch optimized apply_temporal_filter function. The diff^2 for each
pixel in the 16x16 block is calculated once beforehand, so that we don't
calculate it multiple times while evaluating a pixel's neighbors. This
would speed up the function.

Change-Id: Ibdb8b041f317fd6df198950e2acf9cfcde26860d
2018-10-12 16:39:20 -07:00
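The precomputation described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the libvpx function: the names, the 3x3 window, and the plain sum returned are all hypothetical. The point is only that each pixel's squared difference is computed once into a table, and every neighborhood evaluation then reads from that table instead of recomputing it.

```c
#include <assert.h>
#include <stdint.h>

#define BLK 16 /* the block is BLK x BLK pixels */

/* Step 1: compute (a - b)^2 once for every pixel in the block. */
static void precompute_sq_diff(const uint8_t *a, const uint8_t *b, int stride,
                               uint32_t sq[BLK * BLK]) {
  int r, c;
  for (r = 0; r < BLK; ++r) {
    for (c = 0; c < BLK; ++c) {
      const int d = a[r * stride + c] - b[r * stride + c];
      sq[r * BLK + c] = (uint32_t)(d * d);
    }
  }
}

/* Step 2: sum the precomputed values over the 3x3 window around (r, c),
 * clipped to the block. *count receives the number of pixels summed. */
static uint32_t neighborhood_sum(const uint32_t sq[BLK * BLK], int r, int c,
                                 int *count) {
  uint32_t sum = 0;
  int n = 0;
  int dr, dc;
  for (dr = -1; dr <= 1; ++dr) {
    for (dc = -1; dc <= 1; ++dc) {
      const int rr = r + dr, cc = c + dc;
      if (rr < 0 || rr >= BLK || cc < 0 || cc >= BLK) continue;
      sum += sq[rr * BLK + cc];
      ++n;
    }
  }
  *count = n;
  return sum;
}
```

Without the table, an interior pixel's squared difference would be recomputed for each of the up to nine windows that contain it; with the table it is computed once.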
Hui Su d4666ac7d2 Turn on ml_var_partition_pruning for HBD
This affects speed 0 and 1 only.

Tested on lowres_bd10(480p) and midres_bd10(720p),
                   speed 0       speed 1
coding loss:        0.07%         0.10%
encoder speedup:     14%          6.5%

Change-Id: I5812400d8c7393321b7284d3fca06026842390b5
2018-10-12 15:19:59 -07:00
Hui Su f4300285a3 Merge "Enable ML based rect partition pruning for HBD" 2018-10-12 19:59:53 +00:00
Hui Su bb82c4997f Enable ML based rect partition pruning for HBD
Tested on lowres_bd10(480p) and midres_bd10(720p), average coding
loss is 0.09%; average encoding speedup is 9%.

Only speed 0 is affected.

Change-Id: Ia8d48c1c6d1669745f0e956b172572a37e42f0c7
2018-10-12 09:31:22 -07:00
Yunqing Wang 1db646f0de Merge "Make 4-tap interp filter coefficients even numbers" 2018-10-12 16:14:00 +00:00
Jingning Han d409ffd458 Remove unused variable from VP9_COMP
Change-Id: I61447b7a21ac5b03f2a6accd6e433d8f9369e508
2018-10-11 21:35:17 -07:00
Yunqing Wang e4f030be87 Make 4-tap interp filter coefficients even numbers
This CL modified 4-tap interp filter coefficients to be even numbers,
which would help in writing 4-tap filter SIMD optimizations. The coding
performance change was negligible. Speed 1 borg test showed:
        avg_psnr:  ovr_psnr:    ssim:
lowres:  -0.003    -0.012      -0.017
midres:  0.029     0.018        0.043
hdres:   0.024     0.044        0.033

Change-Id: Id7c54bb9a9c1aee19c41bc6f1dc3b9682d158bba
2018-10-12 00:02:34 +00:00
Hui Su 9c1299e178 Merge "ML_VAR_PARTITION: adjust model threshold" 2018-10-11 23:41:57 +00:00
Jingning Han fbac7790a6 Merge "Call tpl model build at the beginning of a GOP" 2018-10-11 16:22:01 +00:00
Marco Paniconi f8ce04c93f Merge "Revert "vp8: Increase rate threshold for overshoot-drop"" 2018-10-11 10:25:12 +00:00
Marco Paniconi e188b5435d Revert "vp8: Increase rate threshold for overshoot-drop"
This reverts commit bc066684ca.

Reason for revert: Regression in webrtc perf test

Original change's description:
> vp8: Increase rate threshold for overshoot-drop
> 
> Increase the rate threshold for the dropping when
> overshoot is detected during encoding. This helps
> to prevent some unnecessary drops for hard content.
> 
> Change-Id: I258bf33883d46347efd44e1e192cb25c444d05fe

TBR=sprang@chromium.org,marpan@google.com,builds@webmproject.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Change-Id: Ib0e84747430ba6d04e479f9efd86d628b80a1e67
2018-10-11 10:24:16 +00:00
Angie Chiang 6f56d8b869 Merge changes Ia5978d91,I3e3754f3
* changes:
  Simplify mode_estimation / tpl_model_store
  Move [inter/intra]_cost change to mode_estimation
2018-10-11 01:18:45 +00:00
Angie Chiang 81f94b893b Add indep loop for motion_compensated_prediction
This is for the non_greedy_mv experiment only.
It is part of the change that reorders mv search according to
feature_score.

Change-Id: I432efccd83d448a4a275dffd37921c76c3d84588
2018-10-10 17:49:08 -07:00
Harish Mahendrakar 4a2ea54a67 Merge "Loopfilter Multi-Thread Optimization" 2018-10-11 00:03:26 +00:00
James Zern d8c59e0206 Merge "subpel asm: fix whitespace" 2018-10-10 22:11:15 +00:00
Jingning Han d0f7f002b1 Call tpl model build at the beginning of a GOP
The gop index 0 is default as kf / gf. The effective first coding
frame controlled by the current GOP rate allocation is indexed 1.
Call the tpl model build for the current GOP once at index 1
position. This would unify the calling system for single/multi-layer
ARF GOP structure.

Change-Id: I4ce69337e04646098d5513c0aa56b4e0b4483337
2018-10-10 14:52:30 -07:00
Yunqing Wang 71ff94d3d3 Merge "Use 4-tap interp filter in speed 1 sub-pel motion search" 2018-10-10 21:05:44 +00:00
Johann 33b289f519 subpel asm: fix whitespace
Change-Id: I7a3314a268cf6049a7260361043e76d4561085c6
2018-10-10 07:44:49 -07:00
Matt Oliver b719117b9f project: Add WinRT/UWP configurations. 2018-10-10 21:20:07 +11:00
Angie Chiang 1072267dc1 Simplify mode_estimation / tpl_model_store
1) Let mode_estimation() save the results into tpl_frame directly
2) In tpl_model_store(), replace copies of tpl_stats parameters by
   memset()

Change-Id: Ia5978d91cb60cf896bd53d3f27701ef9ae3ba09a
2018-10-09 13:40:47 -07:00
Angie Chiang 8ca61034bc Move [inter/intra]_cost change to mode_estimation
Change-Id: I3e3754f349d31d17554f02bd14cd34620057ddcb
2018-10-09 12:43:36 -07:00
Angie Chiang af1fb84dd3 Merge changes I67700eba,I9e8f8ed3,Id93565cc
* changes:
  Move feature_score into an independent for loop
  Add set_mv_limits()
  Move lambda into TplDepFrame
2018-10-09 18:33:32 +00:00
Yunqing Wang 50b91aff52 Use 4-tap interp filter in speed 1 sub-pel motion search
Added the 4-tap interp filter, and used it for speed 1 sub-pel motion
search. Speed 2 motion search still used bilinear filter as before.

Speed 1 borg test showed good bit savings.
        avg_psnr:  ovr_psnr:    ssim:
lowres:  -1.125    -1.179      -1.021
midres:  -0.717    -0.710      -0.543
hdres:   -0.357    -0.370      -0.342
Speed test at speed 1 showed ~10% encoder time increase, which was
partially because of no SIMD version of 4-tap filter.

Change-Id: Ic9b48cdc6a964538c20144108526682d64348301
2018-10-09 09:47:22 -07:00
Yunqing Wang afa4b9780a Add accurate sub-pel motion search
Added the accurate sub-pel motion search. In this patch, used the 8-tap
filter in sub-pel motion search, and this was enabled at speed 0.

Speed 0 borg test showed that
        avg_psnr:  ovr_psnr:    ssim:
lowres:  -1.363     -1.403     -1.282
midres:  -0.842     -0.849     -0.720
hdres:   -0.480     -0.488     -0.503
Speed test at speed 0 showed ~8% encoder time increase.

Change-Id: I194ca709681ea588f3f6381093940e22d03c4d7b
2018-10-09 16:16:30 +00:00
Yunqing Wang 676b9b00d0 Merge "Set up the unit scaling factor for motion search" 2018-10-09 16:08:06 +00:00
Angie Chiang a9c18f7b33 Move feature_score into an independent for loop
We aim to change the mv search order according to feature_score.
This is part of the change.

Change-Id: I67700eba014df92190eabc78060cf29adf0fc38b
2018-10-08 17:55:34 -07:00
Angie Chiang 3805232a69 Add set_mv_limits()
Change-Id: I9e8f8ed3eb150b3af1f465f595000bd05d43f3f6
2018-10-08 17:54:25 -07:00
Yunqing Wang 4af18a71d3 Set up the unit scaling factor for motion search
Set up the unit scaling factor used during motion search.

Change-Id: I6fda018d593b7ad4b7658d44c39be950a502d192
2018-10-08 15:59:32 -07:00
Supradeep T R 4da98b0cc3 Loopfilter Multi-Thread Optimization
Take the original loopfilter multi-thread optimization
(dafe064289) along with the fixes for bugs
1558 and 1562.

BUG=webm:1558
BUG=webm:1562

Change-Id: Ibbf6bd13f4ffff0e79184ccfd6b85a49e067a6d8
2018-10-08 14:59:09 -07:00
Angie Chiang 23d3633f40 Move lambda into TplDepFrame
Change-Id: Id93565cca41e00d4ab5de4c6de30accabf2adc52
2018-10-08 14:35:26 -07:00
Wan-Teh Chang 9addd7499c Reduce the cpi->scaled_ref_idx array size by 1.
The last element of the cpi->scaled_ref_idx array was not used, so
reduce the array size by 1.

The corresponding libaom CL is
https://aomedia-review.googlesource.com/c/aom/+/72445.

Change-Id: I9166f0fbe1a7898c8b611b1535fcc74b4f766997
2018-10-08 13:52:27 -07:00
Wan-Teh Chang 9e12d5a691 Merge "Avoid null checks related to pool->frame_bufs." 2018-10-08 20:51:28 +00:00
Wan-Teh Chang e00c00b48e Merge "Correct a for loop in init_ref_frame_bufs." 2018-10-08 19:07:02 +00:00
Hui Su 51b4601dd2 Merge "Turn on ml_var_partition_pruning for speed 1" 2018-10-08 17:25:23 +00:00
Wan-Teh Chang 51a47b0c72 Correct a for loop in init_ref_frame_bufs.
The cm->ref_frame_map and pool->frame_bufs arrays are of different sizes
(REF_FRAMES and FRAME_BUFFERS, respectively), so init_ref_frame_bufs()
cannot iterate over these two arrays using the same for loop.

Change-Id: Ica5bbd9d0c30ea3d089ad2d4bcf6cd8ae2daea64
2018-10-08 10:09:26 -07:00
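The issue this commit fixes can be shown with a small sketch: two arrays of different lengths each need their own loop bound. The struct, the function, and the sizes below are hypothetical stand-ins (the real constants are REF_FRAMES and FRAME_BUFFERS, whose values are not given here), and the initial values are chosen only for illustration.

```c
#include <assert.h>

#define SKETCH_REF_FRAMES 8     /* assumed size, for illustration only */
#define SKETCH_FRAME_BUFFERS 12 /* assumed size, for illustration only */

typedef struct {
  int ref_frame_map[SKETCH_REF_FRAMES]; /* one slot per reference */
  int ref_count[SKETCH_FRAME_BUFFERS];  /* one counter per frame buffer */
} SketchState;

/* Each array gets its own loop with its own bound, so neither loop
 * reads or writes past the end of the shorter array or leaves part
 * of the longer array uninitialized. */
static void sketch_init_ref_frame_bufs(SketchState *s) {
  int i;
  for (i = 0; i < SKETCH_REF_FRAMES; ++i) s->ref_frame_map[i] = -1;
  for (i = 0; i < SKETCH_FRAME_BUFFERS; ++i) s->ref_count[i] = 0;
}
```

A single loop bounded by the smaller size would skip the tail of the larger array, and one bounded by the larger size would overrun the smaller array.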
Wan-Teh Chang 34e191b2f1 Avoid null checks related to pool->frame_bufs.
It seems that null pointer checks such as the following may make clang
scan-build think pool->frame_bufs may be a null pointer:

    buf = (buf_idx != INVALID_IDX) ? &pool->frame_bufs[buf_idx] : NULL;
    if (buf != NULL) {

This "misinformation" may make scan-build warn about the ref_cnt_fb()
function's use of its 'bufs' argument (Dereference of null pointer) when
we pass pool->frame_bufs to ref_cnt_fb().

Rewriting the above code as:

    if (buf_idx != INVALID_IDX) {
      buf = &pool->frame_bufs[buf_idx];

not only is clearer but also avoids confusing scan-build.

Change-Id: Ia64858dbd7ff89f74ba1a4fc9239b0c4413592c8
2018-10-08 09:41:55 -07:00
Yunqing Wang ecc31d2878 Merge "Changes to facilitate accurate sub-pel motion search" 2018-10-08 15:41:09 +00:00
Angie Chiang 4a47ef814b Merge "Fix bug in prepare_nb_full_mvs" 2018-10-06 00:42:47 +00:00
Hui Su b25c7515a0 Turn on ml_var_partition_pruning for speed 1
Coding loss:
lowres 0.08%; midres 0.11%; hdres 0.07%

Average encoding speedup is about 6%.

Change-Id: I950291cf0f1d610bcdedeb150bcbefea2f5579bc
2018-10-05 18:45:49 +00:00
Hui Su 49b7e8cda4 ML_VAR_PARTITION: adjust model threshold
Make decisions more aggressively to improve encoding speed.

Coding gains(avg-psnr) after this change over baseline:
rtc       1.55% for speed 7;  2.89% for speed 8.
ytlivehr  2.20% for speed 6.

Change-Id: If6ac4a942a5b4708bcc6b0a49bd92fbc4d67c3f8
2018-10-05 11:10:12 -07:00
Yunqing Wang c5586bfa84 Changes to facilitate accurate sub-pel motion search
This patch included changes to facilitate accurate sub-pel motion
search. More patches will follow to turn on accurate sub-pel motion
search.

Change-Id: I224c28c338353fe5c7609372162f79885c54248f
2018-10-05 11:06:59 -07:00
Angie Chiang e49ef1476d Fix bug in prepare_nb_full_mvs
Previously, prepare_nb_full_mvs might construct nb_full_mv with
wrong mvs (from another ref frame).
The following changes fix the bug.
1) Make ready in TplDepStats an int array
2) Add parameter rf_idx
3) Use mv_arr instead of mv to build the nb_full_mv

Change-Id: I199798aec4c6762d54799562e142457cc26ee043
2018-10-04 15:17:02 -07:00
Jingning Han 936b59ef0a Merge "Clean up vp9_firstpass.h" 2018-10-04 15:26:31 +00:00
Angie Chiang 5062f3c182 Merge "Add mv_{dist/cost}_sum to TplDepFrame" 2018-10-04 00:48:42 +00:00
Jingning Han 3cb34cef28 Clean up vp9_firstpass.h
Remove unused functions and macros.

Change-Id: I46458a60f75637c66af0e18ad876a2634e5818bb
2018-10-03 16:01:26 -07:00
Marco Paniconi bc066684ca vp8: Increase rate threshold for overshoot-drop
Increase the rate threshold for dropping when
overshoot is detected during encoding. This helps
to prevent some unnecessary drops for hard content.

Change-Id: I258bf33883d46347efd44e1e192cb25c444d05fe
2018-10-03 15:29:17 -07:00
Angie Chiang 129e6fe524 Merge changes I66b35ef7,Ic9ed6ed6,Ie5818689
* changes:
  Add mv_dist/mv_cost to TplDepStats
  Change interface of motion_compensated_prediction
  Separate lambda from nb_mvs_inconsistency()
2018-10-03 01:05:21 +00:00
Chi Yo Tsai 6b191dc402 Merge "Change the frame used to set up encoder in tests to 0." 2018-10-03 00:07:55 +00:00
chiyotsai 3b946ed252 Change the frame used to set up encoder in tests to 0.
Change-Id: Ied460b6ff53a58050d53dec8d32b627de5de3f3a
2018-10-02 16:03:45 -07:00
Jingning Han d3d30fde93 Merge "Minor clean-up in tiny_ssim.c" 2018-10-02 21:16:35 +00:00
Jingning Han 77e389dca4 Merge "Force even arf group length where possible." 2018-10-02 21:16:19 +00:00
Jingning Han c230d7f6e6 Merge "Keep metric log only for the displayable frames" 2018-10-02 18:24:29 +00:00
Jingning Han b944b8becc Merge "Fix vpxenc per frame psnr and ssim print" 2018-10-02 18:24:25 +00:00
Jingning Han 794b033f78 Minor clean-up in tiny_ssim.c
Report the correct filename in error message.
Explicitly assign floating point value to double type.

Change-Id: I42fd2da6e16b1e3e7ec221d5d562a728a93c0196
2018-10-02 10:22:56 -07:00
Paul Wilkins 6804737ef5 Force even arf group length where possible.
This patch tweaks the calculation of the active maximum GF interval
and also the break out clause for the GF interval loop. The changes
force the maximum and, where possible, the break out value to be odd,
which in turn will result in an even length ARF group if ARF coding is
selected (vs GF only coding).

The primary aim was to improve coding with multi layer arf groups.
For the single layer case there are small net gains in 3 out of 4 sets
(low, mid, hd) and a small net drop for the NF2K set.

For multi-layer the gains (opsnr, ssim, psnr-hvs : -ve = better) were:-

Low res: -0.109, -0.038, -0.036
Mid res: -0.204, -0.171, -0.242
Hd res: -0.330, -0.471, -0.496
NF 2k: -0.165, -0.149, -0.157

Change-Id: I245f8561f5d1bd34312a0133c670c2154a0da23f
2018-10-02 16:52:37 +00:00
Jingning Han e5c304fb5a Keep metric log only for the displayable frames
The end-to-end reconstruction quality is represented only by the
displayable frames. Drop the coding stats from ARF frames.

Change-Id: Ib8241db448611f4b6477f107930eaa273f960e20
2018-10-01 15:31:56 -07:00
Jingning Han acd82ad67c Fix vpxenc per frame psnr and ssim print
Fix compiler error and make the encoder properly log the psnr and
ssim.

Change-Id: I7b35541131acaa60117bb1e458508b82a4b4677e
2018-10-01 15:31:07 -07:00
Hui Su 6b7848d4c9 Introduce the ml_var_partition_pruning feature
Add the ml_var_partition_pruning encoder speed feature that
uses a neural net model to prune partition-none and partition-split
search. The model uses prediction residue variance and quantization
step size as input features.

Encoding speed gain for speed 0(tested over 20 hdres clips):
            QP=30    QP=40
average     17.7%    18.3%
max        24.46%    26.6%

Coding loss:
lowres 0.071%;  midres 0.098%;  hdres 0.163%

Currently it is enabled for speed 0 low-bit depth only. It needs to be
tuned for other settings.

Change-Id: Ifb7417daa6bb6e7c97bb676269ce54ab0dc7b8c8
2018-10-01 13:48:39 -07:00
Jingning Han 2beb5c9f91 Merge "Remove deprecated get_arf_buffer_indices()" 2018-09-29 04:20:31 +00:00
Jingning Han dbd3b8210f Merge "Remove deprecated arf_update_index from GF_GROUP" 2018-09-29 04:20:27 +00:00
Hui Su 99081038e7 Merge "Add ml_var_partition experiment" 2018-09-29 02:00:16 +00:00
Angie Chiang d4789c7304 Merge changes I93308a09,If85c36b4,I918eb36a
* changes:
  Add vpx_clear_system_state to new mv search func
  Change mv color to red
  Call vp9_full_pixel_diamond_new in tpl mv search
2018-09-28 23:09:19 +00:00
Angie Chiang 7fc5f6739a Add mv_{dist/cost}_sum to TplDepFrame
Change-Id: Iacce1f88630ba93ff72d745a83dd4b853b6b61af
2018-09-28 16:06:14 -07:00
Angie Chiang 6f8861e293 Add mv_dist/mv_cost to TplDepStats
Change-Id: I66b35ef76c229d4eb3bf3c913619a0e219c4c2f9
2018-09-28 16:06:14 -07:00
Angie Chiang ebe10bcc33 Change interface of motion_compensated_prediction
Change the interface of vp9_full_pixel_diamond_new

Change-Id: Ic9ed6ed61c5178f3f445f40860ebaac7ea17f75d
2018-09-28 16:05:34 -07:00
Angie Chiang 33cc467047 Separate lambda from nb_mvs_inconsistency()
Change-Id: Ie5818689233ae01742ca595e2c8c3f3664bb426c
2018-09-28 14:23:23 -07:00
Jingning Han c3d330496e Remove deprecated get_arf_buffer_indices()
Change-Id: I6d0c8a1a61d861aa0353cde76a833c7c0b222279
2018-09-28 11:30:56 -07:00
Jingning Han b898e68c7b Remove deprecated arf_update_index from GF_GROUP
As we move to unify the GOP structure layout control, the variable
arf_update_idx and arf_ref_idx are deprecated.

Change-Id: Iadcb9e6033d419d4b2015fe747c23be59a7da787
2018-09-28 11:28:35 -07:00
Jingning Han 1794bac3fb Merge "Fix minor bug in calculation of max arf group length." 2018-09-28 17:38:26 +00:00
Jingning Han 4e61b2eec7 Merge "Adjustment of GOP intra factor for multi-layer." 2018-09-28 17:38:14 +00:00
Jingning Han b628161cbc Merge "Add MID_OVERLAY_UPDATE frame type" 2018-09-28 17:18:54 +00:00
Jingning Han 7a50129f65 Merge "Refactor gf_overlay frame type update" 2018-09-28 17:18:46 +00:00
Paul Wilkins 5cbfa48fe6 Merge "Revert "Merge "Adapt GOP size threshold to the allowed layer depth""" 2018-09-28 17:16:01 +00:00
Paul Wilkins 01ba4cf0ac Fix minor bug in calculation of max arf group length.
There is no valid last boosted Q available when estimating the maximum
group length for the first ARF group in a clip, so use a value based on
the current max q.

Change-Id: Ida0b4bfb7ce7433089ad808abed7f59c88527a81
2018-09-28 17:13:31 +01:00
Paul Wilkins ec7ab809c5 Adjustment of GOP intra factor for multi-layer.
This provides an alternative (still to be tuned for edge cases)
approach to adjusting the gop intra factor when multi-layer coding
is in effect that does not alter single layer coding.

Change-Id: Iba86d65a6e68e86aa031b7e1f0b6a4c55761b1b8
2018-09-28 17:10:53 +01:00
Hui Su a2cd017016 Add ml_var_partition experiment
Make partition decisions using machine learning models. The goal is to
achieve better coding quality than the variance-based partitioning
without much encoding speed loss.

To enable this experiment, use --enable-ml-var-partition for config.

When enabled, the variance-based partitioning is replaced by this ML
based partitioning for speed 6 and above in real time mode (except low
resolution or high bit-depth).

Current coding gains(average PSNR):
                speed 6      speed 7      speed 8
rtc              2.04%        2.65%        3.90%
ytlivehr         3.11%        4.53%       11.57%
hdres(rtc mode)  5.10%

Further testing and tuning is needed to see if the speed and quality
tradeoff is reasonable.

Change-Id: I0da5a2fbc22c3261832b32920ee36d9b19d417af
2018-09-28 09:07:54 -07:00
Hui Su 308454502c Merge "Fix a loophole in nonrd_pick_partition()" 2018-09-28 16:06:00 +00:00
Paul Wilkins d2641ff1df Revert "Merge "Adapt GOP size threshold to the allowed layer depth""
This reverts commit 5efde3914f, reversing
changes made to 3a29159372.

This is badly broken and may help somewhat for multi-layer but is hurting
massively in single layer encodes.

I ran this through this morning and while it often helps in SSIM it is badly down
for global PSNR and PSNR-HVS with some clips down by 35-40%. This is in line
with previous experiments where I have found that a bigger boost helps SSIM
but hurts PSNR and PSNR HVS.

I was also working on changes to the I factor that gave some improvements
in single layer though these were based upon the active Q mostly. I also have
looked at a bug for the first group where int_lbq is not properly defined and
will submit an interim patch for this while I look for a better solution.

In the meantime I think we should revert this.

The (Global PSNR, SSIM, PSNR-HVS) for the patch as is in my runs for
single layer vs a couple of days ago seem to be (-ve is better).

Low res 0.346, -1.475, 0.239
mid res  1.581, -1.300, 1.731 (worst result down by 30-40% in psnr)
hdres 0.665, -0.712, 1.043 (worst result down by 17-19% in psnr)
NF2k 0.927, 0.111, 1.3220 (Worst result down by 5-7% in psnr)

Change-Id: I55952b71b8cfc5a84484b3b659c5f8a530f3a755
2018-09-28 13:17:31 +01:00
Jingning Han c1b6e220e3 Add MID_OVERLAY_UPDATE frame type
Add a MID_OVERLAY_UPDATE abstract to support multi-layer
ARF-Overlay frame based approach. When setting the frame update
type to be USE_BUF_FRAME, the encoder will use show_existing_frame
to process the intermediate ARF frames. When setting the frame
update type to be MID_OVERLAY_UPDATE, the intermediate ARF frames
will go through an overlay frame for display.

Change-Id: Ia0c91452c09d39312ac22d855cdf681b7da851c5
2018-09-27 21:10:06 -07:00
Jingning Han 89c4ba1c77 Merge "Remove deprecated variables from GF_GROUP structure" 2018-09-28 03:52:38 +00:00
Jingning Han 0171529f86 Merge "Remove unused for-loop in multi-layer arf bit allocation" 2018-09-28 03:52:27 +00:00
Jingning Han 5efde3914f Merge "Adapt GOP size threshold to the allowed layer depth" 2018-09-28 03:52:18 +00:00
Jingning Han 0c84f1e458 Refactor gf_overlay frame type update
Factor out common code.

Change-Id: Ia548842557d85ab692fe658acf97d61f008e9588
2018-09-27 16:28:56 -07:00
Hui Su 2176d88fc3 Fix a loophole in nonrd_pick_partition()
In some rare cases, all possible partitions may be skipped during RD
search. The patch makes the encoder do rectangular partition search if
both partition-none and partition-split are not allowed.

Tested on the rtc and ytlivehr testsets with speed 5 and 7, no coding
stats changes were observed.

Change-Id: I8b6d8b62b6d2431be8e73317d113311c98f631d5
2018-09-27 13:37:40 -07:00
Jingning Han 28880c3a28 Remove deprecated variables from GF_GROUP structure
Change-Id: I8c02216a369be6a51af9872f3ce05045038fc481
2018-09-27 12:00:56 -07:00
Jingning Han 6d082201f1 Remove unused for-loop in multi-layer arf bit allocation
The for-loop is not taking effect any more.

Change-Id: Ief2763990a6d4f487a5eb4972012d86379573d55
2018-09-27 10:50:28 -07:00
Jingning Han 59993f1661 Adapt GOP size threshold to the allowed layer depth
Increase the total prediction error budget linearly with the
allowed ARF layer depth. This in general improves the compression
performance, but does hit corner cases on a few clips at very
low bit-rate range (corresponding to 26 - 28 dB range). To mitigate
this, we temporarily work around the problem by limiting
the first GOP size to ~8 so as not to drain the bit resource.

The overall compression performance improvements over the current
multi-layer ARF system in speed 0 are:

           overall PSNR      avg PSNR        SSIM
lowres     -0.47%            -0.13%         -1.51%
midres     -1.30%            -1.16%         -2.80%
hdres      -0.91%            -0.84%         -2.15%

Change-Id: Ia4880ab63e98e15a9db99aea6eabfd3d1da9270d
2018-09-27 10:45:26 -07:00
Johann Koenig 3a29159372 Merge "add cfi sanitizer" 2018-09-27 15:08:12 +00:00
Angie Chiang 12d6c66f50 Add vpx_clear_system_state to new mv search func
Change-Id: I93308a0906165b8fa56b59e199c3de29b572f666
2018-09-26 18:06:05 -07:00
Angie Chiang b9ce3304de Change mv color to red
Change-Id: If85c36b44b41e8cf025a5e08d7055ec32a14d26b
2018-09-26 18:04:40 -07:00
Johann 39ee898f89 add cfi sanitizer
Change-Id: I4262bb631c248ad188f09a37d774d1759695b0d7
2018-09-26 16:52:54 -07:00
Johann 97ff0463f4 CONFIG_WEBM_IO: include webmids.h
This was previously brought in with the examples. When building
with --disable-examples and --enable-codecs-srcs, this file
gets lost.

Change-Id: Id8bd67cb78c4f06647f34e85f425dfc701c640c0
2018-09-26 16:14:19 -07:00
Angie Chiang 2be8b384ab Call vp9_full_pixel_diamond_new in tpl mv search
The function is called in motion_compensated_prediction when
CONFIG_NON_GREEDY_MV is on.

The parameter lambda is used to adjust the importance of
mv consistency between neighbor blocks.

The lambda value is set to a random value for now, and still needs
to be tuned.

Change-Id: I918eb36a686eaa56b4009058f5f329e90c75870b
2018-09-26 14:34:40 -07:00
Angie Chiang 2d86965fee Merge changes If96a8a1c,Iaf535fde,Icbde9880
* changes:
  Add vp9_full_pixel_diamond_new
  Add vp9_refining_search_sad_new
  Add vp9_diamond_search_sad_new
2018-09-26 19:06:29 +00:00
Jingning Han 4788d62066 Merge "Use layer dependent gfu_boost factor" 2018-09-26 04:18:15 +00:00
Angie Chiang a7aca1b5af Add vp9_full_pixel_diamond_new
This function will call vp9_diamond_search_sad_new /
vp9_refining_search_sad_new accordingly.

Change-Id: If96a8a1c9c06b6b4ed3aac6d59bdb03f20c96df9
2018-09-25 17:05:06 -07:00
Angie Chiang 0b06f95d2a Add vp9_refining_search_sad_new
The new version of refining search function will take into account
neighbor motion vectors' inconsistency while doing mv search

Change-Id: Iaf535fde04805de3dc7dd9a32f1695bf454e2d63
2018-09-25 17:05:06 -07:00
Angie Chiang f3a2b935e5 Add vp9_diamond_search_sad_new
This new version of diamond search function will take into account
neighbor motion vectors' inconsistency while doing mv search

Change-Id: Icbde9880305cb8aea7937d6ddcef1597bf9be018
2018-09-25 17:05:06 -07:00
Jingning Han f6ddb92a33 Use layer dependent gfu_boost factor
When multi-layer ARF is enabled, use the corresponding gfu_boost
factor assigned to each ARF to compute the best_quality_index
adjustment. This on average improves the coding performance by
0.2% for lowres and hdres, 0.4% for ntflx2k. It seems this change
will only affect a small group of clips, e.g., pamphlet, bowing,
mobcal_720p, etc., which tend to gain 4-5%, whereas the remaining
clips keep largely identical coding statistics.

Change-Id: Ie19636a6cf32214aefd73e21ead2aea647ddbca8
2018-09-25 16:08:37 -07:00
Hui Su cdaca1f355 Merge "Remove redundant code" 2018-09-25 16:53:38 +00:00
Johann Koenig 5fecb59436 Merge "clang-format v6.0.1" 2018-09-25 14:25:52 +00:00
James Zern 4858c52437 Merge "vp9,encoder: check pointers before member access" 2018-09-25 03:40:00 +00:00
Johann 08e6fd2fbb clang-format v6.0.1
Change-Id: I83c7e64fe70f7c49aa2492ed2d640c6756b7ebaa
2018-09-24 18:31:35 -07:00
Johann Koenig 78f1ae5ffc Merge "sanitizer: sse2 - fix unaligned double stores" 2018-09-24 23:11:49 +00:00
Johann Koenig af2ba81b94 Merge "segfault: fix missing alignment declaration" 2018-09-24 22:30:46 +00:00
Johann Koenig 4060456c03 Merge "fix integer overflow caused by uninitialized memory" 2018-09-24 22:30:09 +00:00
Matthias Räncker 4fa0727fbc sanitizer: sse2 - fix unaligned double stores
Signed-off-by: Matthias Räncker <theonetruecamper@gmx.de>
Change-Id: I838c8678e62f7cff13387b84d4f3ea42710a67ea
2018-09-25 00:09:17 +02:00
Hui Su 626ca102f3 Remove redundant code
in set_rt_speed_feature_framesize_independent().

use_nonrd_pick_mode is already set for speed >= 5, so there is no need to
set it again for speed >= 6.

Change-Id: Idb0a4b36d21e305bd63f19e98a70f615ad76f514
2018-09-24 11:27:54 -07:00
Hui Su 4b8cb2e66b Merge "Improve subpel MV search for speed 1" 2018-09-24 18:15:49 +00:00
Matthias Räncker 8b9edfc528 fix integer overflow caused by uninitialized memory
Signed-off-by: Matthias Räncker <theonetruecamper@gmx.de>
Change-Id: I55ec2a803eff89b07376459e334d4e949bfcb2cc
2018-09-23 17:30:06 +02:00
Matthias Räncker a1d8aec6b4 segfault: fix missing alignment declaration
These variables are being fed to sse2 functions, that use aligned
loads.

Signed-off-by: Matthias Räncker <theonetruecamper@gmx.de>
Change-Id: I796c3483c6f3425d63d9262b02b19da59d536600
2018-09-23 17:04:14 +02:00
James Zern 3448987ab2 Revert "Revert "Revert "Loopfilter MultiThread Optimization"""
This reverts commit bf6299010e.

segfaults, causes an assertion failure with corrupt input:
get_uv_tx_size: Assertion `mi->sb_type < BLOCK_8X8 ||
ss_size_lookup[mi->sb_type][pd->subsampling_x][pd->subsampling_y] !=
BLOCK_INVALID

BUG=webm:1562

Change-Id: I05a711cad3d8e7f1a8e64422b4356bdf4edb3d12
2018-09-22 22:25:46 +00:00
Jerome Jiang d99b6af68f Merge "vp8: exit with bad fragment size in decoder." 2018-09-22 07:15:21 +00:00
Jingning Han 6a08df3d2a Merge "Rework is_compound_allowed logic at encoder" 2018-09-22 04:00:11 +00:00
Johann Koenig 51dac6d774 Merge "internal stats: fix mem leak and initialize memory" 2018-09-21 22:36:53 +00:00
Johann Koenig 0a6278b95a Merge "better-hw-compatibility: fix out of bounds access" 2018-09-21 22:30:52 +00:00
Johann Koenig d78b76557b Merge "Revert "third_party/googletest: update to v1.8.1"" 2018-09-21 22:20:29 +00:00
Johann Koenig 88cd62f8bd Revert "third_party/googletest: update to v1.8.1"
This reverts commit 7d777ce613.

Reason for revert: Generates build warnings on VS10/VS12

third_party\googletest\src\include\gtest/gtest-printers.h(1036): error C2770: invalid explicit template argument(s) for 'AddReference<const ::std::tr1::tuple_element<I,std::tr1::tuple<_Arg0,_Arg1>>::type>::type testing::internal::TuplePolicy<TupleT>::get(const std::tr1::tuple<_Arg0,_Arg1> &)' [C:\src\buildbot\test-libvpx\tests\i9vRsze8hQ\.build-x86-win32-vs10\test_libvpx.vcxproj]

Original change's description:
> third_party/googletest: update to v1.8.1
>
> BUG=webm:1559
>
> Change-Id: I7a0b16c7bf3f97db2d8650a190b93aae7e12a948

TBR=tomfinegan@chromium.org

Bug: webm:1559
Change-Id: Ia1a7354084c778a4c4e91b33fef6462e88986d1e
2018-09-21 22:20:17 +00:00
Matthias Räncker a439b3c977 better-hw-compatibility: fix out of bounds access
With --enable-better-hw-compatibility an access to array element -1
can be observed for VP9/ActiveMapTest.Test/0
../vp9/encoder/vp9_rdopt.c:3938:53: runtime error:
  index -1 out of bounds for type 'RefBuffer [3]'

There doesn't seem to be anything that would prevent ref_frame from being 0.
If there is no reference frame it can probably be assumed that it
isn't scaled.

Signed-off-by: Matthias Räncker <theonetruecamper@gmx.de>
Change-Id: I0a29cd0ffc9a19742e5e72203d5ec5d0a16eac7a
2018-09-21 23:35:42 +02:00
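The guard described in a439b3c977 can be sketched as follows. This is an illustration only, not the actual vp9_rdopt.c change: the type RefBufferSketch, the enum values, and the function sketch_ref_is_scaled are hypothetical names. It assumes reference buffers are stored at index ref_frame - 1, so a ref_frame of 0 (no reference) has to be handled before indexing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; names and layout are illustrative only. */
enum { SKETCH_NO_REF = 0, SKETCH_REFS_PER_FRAME = 3 };

typedef struct {
  bool is_scaled;
} RefBufferSketch;

/* ref_frame runs from 0 (no reference) to 3. Buffers live at index
 * ref_frame - 1, so 0 must be filtered out before indexing. */
static bool sketch_ref_is_scaled(const RefBufferSketch *refs, int ref_frame) {
  assert(ref_frame >= 0 && ref_frame <= SKETCH_REFS_PER_FRAME);
  if (ref_frame == SKETCH_NO_REF) return false; /* no reference: not scaled */
  return refs[ref_frame - 1].is_scaled;
}
```

Returning false for the no-reference case mirrors the commit's reasoning that a missing reference frame can be treated as unscaled.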
Johann Koenig c2bbe7154f Merge "sanitizer: fix unaligned loads" 2018-09-21 18:34:02 +00:00
Jerome Jiang e3522e0feb vp8: exit with bad fragment size in decoder.
BUG=webm:1555
Change-Id: Ie024c9f5a21f4ed05ab6b93f1677662eeef9e6d8
2018-09-21 11:18:17 -07:00
Angie Chiang 71b5897f39 Merge "Use different corner detection score" 2018-09-21 17:32:17 +00:00
Hui Su 1b14cb4e94 Improve subpel MV search for speed 1
Do one more subpel MV search each round. This improves coding
efficiency slightly:

lowres 0.12%
midres 0.11%
hdres  0.13%

Also renames the control flag for subpel MV search quality.

Encoding speed loss is less than 1%.

This only affects speed 1.

Change-Id: I3aecd25342f2dcacea6c143db494f7db6282cb92
2018-09-21 09:37:31 -07:00
Jingning Han 459905b93f Rework is_compound_allowed logic at encoder
Allow the encoder to fully utilize the decoder's capability to
handle both 1 fwd + 2 bwd case and 2 fwd + 1 bwd case.

Change-Id: I3f984d52552ddb701b80b042d979f8fe09dd3a80
2018-09-21 08:24:24 -07:00
Matthias Räncker e376f1d5d4 internal stats: fix mem leak and initialize memory
Without calloc valgrind reports usage of uninitialized data in
vpx_get_ssim_metrics.

Signed-off-by: Matthias Räncker <theonetruecamper@gmx.de>
Change-Id: I9cd38b8031ea3f22c1436894ddaf9e0ccf5a654e
2018-09-21 14:47:51 +02:00
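A minimal sketch of the idea in e376f1d5d4, assuming a hypothetical accumulator type (MetricSketch and both helper names are invented for illustration and are not libvpx code): allocating with calloc gives zero-filled storage, so later reads of untouched fields are well defined, and pairing the allocation with a free avoids the leak.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-frame metric accumulator. */
typedef struct {
  double sum;
  unsigned count;
} MetricSketch;

/* calloc zero-fills the storage; on IEEE-754 platforms an all-zero
 * double is 0.0, so every field starts from a defined value. */
static MetricSketch *metric_sketch_alloc(size_t n) {
  assert(n > 0);
  return (MetricSketch *)calloc(n, sizeof(MetricSketch));
}

/* Release the storage so it does not leak. */
static void metric_sketch_free(MetricSketch *m) { free(m); }
```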
Jingning Han e65f9e8bce Merge "Generalize encoder comp_var_ref setting" 2018-09-21 03:32:44 +00:00
Jingning Han 93edd79f0e Merge "Skip checking compound modes with same sign bias for sub8x8" 2018-09-21 03:32:08 +00:00
Jingning Han 796478a8c9 Merge "Update the comp_refs counts" 2018-09-21 03:32:01 +00:00
Jingning Han a1e87a5746 Merge "Skip RD check for compound modes that have same sign bias" 2018-09-21 03:31:57 +00:00
Jingning Han 711f0a6f93 Merge "Sync ref frame writing with decoder" 2018-09-21 03:31:50 +00:00
Jingning Han b1fad41aab Merge "Add frame_start/end to gf_group" 2018-09-21 03:31:45 +00:00
Matthias Räncker 9d3c5d33d1 sanitizer: fix unaligned loads
Another instance of unaligned 4-byte loads.

Signed-off-by: Matthias Räncker <theonetruecamper@gmx.de>
Change-Id: I06afc5405bb074384eec7a8c8123e5803e522937
2018-09-21 02:36:43 +02:00
Angie Chiang 4c6e284dc5 Use different corner detection score
This corner detection score is better at measuring the level of
details in each block.

Change-Id: I16327a7664144ddc463c29babd11d0ca2fbb54a0
2018-09-20 16:44:39 -07:00
Johann Koenig 6bf5b9bf63 Merge "third_party/googletest: update to v1.8.1" 2018-09-20 22:08:17 +00:00
Johann Koenig 282087a14c Merge "sanitizer: fix unaligned load/stores" 2018-09-20 21:56:21 +00:00
Matthias Räncker a93705f7f9 sanitizer: fix unaligned load/stores
When built with -fsanitize=address,undefined a number of tests,
such as ByteAlignmentTest.SwitchByteAlignment or
ByteAlignmentTest.SwitchByteAlignment produce runtime errors about
unaligned 4-byte loads/stores. While normally not really a problem,
this does technically violate the language and it is easy to fix in
a standard conforming way using memcpy which does not produce
inferior code.

Signed-off-by: Matthias Räncker <theonetruecamper@gmx.de>
Change-Id: Ie1e97ab25fe874f864df48b473569f00563181ae
2018-09-20 22:37:42 +02:00
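The memcpy technique mentioned in a93705f7f9 can be sketched like this. The helper names are hypothetical and not taken from libvpx. Copying through memcpy avoids dereferencing a misaligned uint32_t pointer, and compilers typically lower a fixed-size 4-byte memcpy to a single load or store.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 4 bytes from a possibly unaligned address. */
static uint32_t sketch_load_u32(const void *src) {
  uint32_t v;
  assert(src != NULL);
  memcpy(&v, src, sizeof(v));
  return v;
}

/* Hypothetical helper: write 4 bytes to a possibly unaligned address. */
static void sketch_store_u32(void *dst, uint32_t v) {
  assert(dst != NULL);
  memcpy(dst, &v, sizeof(v));
}
```

By contrast, a cast such as `*(const uint32_t *)src` on a misaligned address is what the undefined-behavior sanitizer reports.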
Johann 7d777ce613 third_party/googletest: update to v1.8.1
BUG=webm:1559

Change-Id: I7a0b16c7bf3f97db2d8650a190b93aae7e12a948
2018-09-20 13:37:24 -07:00
Johann Koenig ff0b9bb054 Merge "fix UB when initializing parameterized tests" 2018-09-20 19:52:33 +00:00
Angie Chiang 8f26f74ab9 Merge changes Ibbe7a1c1,I4333a207
* changes:
  Add feature score for each block
  Correct mv rows/cols bug in read_frame_dpl_stats
2018-09-20 18:15:48 +00:00
Jingning Han bc40b13f33 Generalize encoder comp_var_ref setting
Generalize the encoder comp_fixed_ref and comp_var_ref assignments.
Make it fully support 2 fwd + 1 bwd and 1 fwd + 2 bwd settings
that VP9 decoder allows.

Change-Id: Id74da9a66327189a3fdf382d447243003c431131
2018-09-20 10:05:42 -07:00
Jingning Han 7cc12cbe0e Skip checking compound modes with same sign bias for sub8x8
Drop the check of compound modes where the two reference frames
share the same reference frame sign bias in sub8x8 coding blocks.

Change-Id: I47b45256582b2b5ea1372c9130d8f28cd226a29c
2018-09-20 08:58:55 -07:00
Jingning Han 4770f851a8 Update the comp_refs counts
Generalize the comp_refs counts update to support the case where one
has 1 fwd and 2 bwd reference frames too.

Change-Id: I979216a95d45efef51026158f94612bef39d3c6d
2018-09-20 08:58:06 -07:00
Matthias Räncker ad228021b3 fix UB when initializing parameterized tests
When running tests built with
-fsanitize=undefined and --disable-optimizations
the sanitizer will emit errors of the following general form:

runtime error: member call on address 0xxxxxxxxx which does not
point to an object of type 'WithParamInterface'
0xxxxxxxxx: note: object has invalid vptr
 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 ...
              ^~~~~~~~~~~~~~~~~~~~~~~
              invalid vptr

This can be traced to calls to WithParamInterface<T>::GetParam before
the object argument has been initialized. Although GetParam only
accesses static data it is a non-static member function. This causes
that call to have undefined behaviour.
The patch makes GetParam a static member function.

upstream pull request:
https://github.com/google/googletest/pull/1830

The alternative - if the pull request is denied - would be to
modify all parameterized tests to have them derive from
::libvpx_test::CodecTestWith*Params as the first base class.

Signed-off-by: Matthias Räncker <theonetruecamper@gmx.de>
Change-Id: I8e91a4fba5438c9b3e93fa398f789115ab86b521
2018-09-20 11:51:26 +02:00
Angie Chiang 4c1434e88c Add feature score for each block
The feature score is used to indicate whether a block's mv is reliable
or not.
Now we use Harris Corner Detector method to compute the score.

Change-Id: Ibbe7a1c1f3391d0bf4b03307eaabb5cc3cfb1360
2018-09-19 17:41:16 -07:00
Angie Chiang bb0e75463c Correct mv rows/cols bug in read_frame_dpl_stats
When the frame size is not a multiple of the mv search bsize,
the fractional part will increment the mv rows/cols by 1

Change-Id: I4333a207406610c540059a9356a82084832ca85b
2018-09-19 15:43:54 -07:00
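The row/column count described in bb0e75463c amounts to a ceiling division. The sketch below is an illustration under that reading; sketch_blocks_needed is a hypothetical name, not the function changed by the commit.

```c
#include <assert.h>

/* Number of blocks of size block_dim needed to cover frame_dim pixels.
 * A partial block at the edge counts as one more row or column. */
static int sketch_blocks_needed(int frame_dim, int block_dim) {
  assert(frame_dim >= 0 && block_dim > 0);
  return (frame_dim + block_dim - 1) / block_dim;
}
```

For example, a 65-pixel dimension with 32-pixel blocks needs 3 blocks, whereas plain integer division would give 2 and drop the last partial block.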
Jingning Han d7ed861ce3 Skip RD check for compound modes that have same sign bias
The compound mode can only be run between two reference frames
with different sign bias flags. Skip the search over same sign
bias reference frames in the rate-distortion optimization.

Change-Id: I4a57feedea880883cf87200de51862beac108310
2018-09-18 15:58:00 -07:00
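The rule stated in d7ed861ce3 can be written as a small predicate. This is a sketch under assumptions: the array layout (slot 0 unused, slots 1 to 3 for the three references) and the name sketch_compound_allowed are illustrative, not the encoder's actual code.

```c
#include <assert.h>
#include <stdbool.h>

enum { SKETCH_NUM_REFS = 4 }; /* slot 0 is unused (no reference) */

/* A compound pair is only worth searching when the two reference
 * frames carry different sign bias flags. */
static bool sketch_compound_allowed(const int *sign_bias, int ref_a,
                                    int ref_b) {
  assert(ref_a > 0 && ref_a < SKETCH_NUM_REFS);
  assert(ref_b > 0 && ref_b < SKETCH_NUM_REFS);
  return sign_bias[ref_a] != sign_bias[ref_b];
}
```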
Jingning Han 1e1f78e34c Sync ref frame writing with decoder
Enable the encoder to produce compound reference frame writing
that supports both 2 fwd + 1 bwd and 1 fwd + 2 bwd cases.

Change-Id: I63d2141435e2de7d8115d52b974fc41c2e608405
2018-09-18 15:58:00 -07:00
Jingning Han 4b058f81fa Add frame_start/end to gf_group
Keep the start and end frame index for each group of pictures.

Change-Id: I23c0d22e643218cf7486b238c2986101282d3fbe
2018-09-18 15:58:00 -07:00
Johann Koenig 0aa83d61a1 Merge "Fix buffer overrun of postproc_state.limits" 2018-09-18 18:59:59 +00:00
Johann Koenig 2600f70c00 Merge "Fix stack corruption with x86 and --enable-pic" 2018-09-18 18:59:49 +00:00
Jingning Han 13055d915c Merge "Rename set_arf_sign_bias() to set_ref_sign_bias" 2018-09-18 18:45:13 +00:00
Jingning Han 2f3c1b521f Merge "Re-work set_arf_sign_bias()" 2018-09-18 18:45:07 +00:00
Jingning Han 8d97ba4a67 Merge "Update frame index per buffer at encoder" 2018-09-18 18:44:58 +00:00
Hui Su 734c3d2b66 Merge "Remove unnecessary code" 2018-09-18 18:36:59 +00:00
Hui Su 1988a2c417 Merge "Remove the SECOND_LEVEL_CHECKS_BEST macro" 2018-09-18 18:36:48 +00:00
Matthias Räncker 347d018115 Fix stack corruption with x86 and --enable-pic
x86inc.asm's cglobal macro is frequently used to declare more
arguments than the function actually has. Normally, this is
done to acquire an alias to a register that would correspond to
that positional function argument if it existed. This is safe
when used in this manner.
In the case fixed here, however, the alias is used to temporarily
store addresses obtained through the GOT in memory. Because those
extra arguments don't actually exist, those stores corrupt the
caller's stack frame.
SSE2/VpxHBDSubpelVarianceTest.Ref is a test that may fail as a
result.
As a simple fix, the space allocated to actual arguments that have
already been loaded into registers is reused.
This avoids having to allocate extra space for local variables.

Also removed duplicate code while at it.

Signed-off-by: Matthias Räncker <theonetruecamper@gmx.de>
Change-Id: I505281ecaa6be586185fe6a2d34d62bdf40c839f
2018-09-18 17:33:19 +00:00
Hui Su 627e217b85 Remove unnecessary code
in vp9_find_best_sub_pixel_tree().

Change-Id: I0677c05b3e402fc17dd1e7e6fae787d305e90f89
2018-09-18 09:53:23 -07:00
Hui Su 618b7c16ab Remove the SECOND_LEVEL_CHECKS_BEST macro
This macro is used only once and makes the code relatively harder to
read and modify.

Change-Id: I8f0344a7050758ed9770ffca211b0237fe7d8b34
2018-09-18 09:41:14 -07:00
Jingning Han d1607bc674 Rename set_arf_sign_bias() to set_ref_sign_bias
Properly reflect its functionality that assigns the reference
frame sign bias to all the reference frames.

Change-Id: I7b597feeb06acd4c3a004cd51e4b285357315360
2018-09-18 04:10:50 +00:00
Jingning Han f67b4665ff Re-work set_arf_sign_bias()
Make it support automatic checking and assigning the reference
frame sign bias for all the reference frames.

Change-Id: Ie82f8f872e742130a652b6d5bc109039ac46ae3b
2018-09-17 21:09:12 -07:00
Jingning Han dfbd724ef3 Update frame index per buffer at encoder
Update the frame index counting from key frame offset for all
the processed frames at the encoder. This would allow encoder to
automatically decide frame sign bias next.

Change-Id: Ibbdc2a29b7245be27422272e1fb539596eed63d1
2018-09-17 20:58:14 -07:00
Jingning Han 7323218a5e Merge "Add a frame_index entry to RefCntBuffer" 2018-09-18 03:53:16 +00:00
Jingning Han 98ea9241ca Merge "Assign GOP frame offset to all the coding frames" 2018-09-18 03:53:09 +00:00
James Zern 73dbf6aa56 Merge "cosmetics: normalize include guards" 2018-09-18 03:38:15 +00:00
James Zern 42e48983aa vpx_mem: allow VPX_MAX_ALLOCABLE_MEMORY to be overridden
this allows the define to be set by the build environment

Change-Id: Ib40111c5d9bae417b031b8b40a7bc135c6734044
2018-09-17 12:41:33 -07:00
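The override pattern described in 42e48983aa is the usual #ifndef guard around a default. The sketch below is hedged: the default value shown is an assumption made for this example and may not match libvpx, and sketch_checked_malloc is a hypothetical wrapper.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed default for this sketch only. A build can override it, for
 * example with -DVPX_MAX_ALLOCABLE_MEMORY=... on the compiler command
 * line, because the definition below is skipped when already defined. */
#ifndef VPX_MAX_ALLOCABLE_MEMORY
#define VPX_MAX_ALLOCABLE_MEMORY (UINT64_C(1) << 30)
#endif

/* Hypothetical wrapper: refuse empty or over-limit requests. */
static void *sketch_checked_malloc(uint64_t size) {
  if (size == 0 || size > VPX_MAX_ALLOCABLE_MEMORY) return NULL;
  return malloc((size_t)size);
}
```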
Jingning Han 6dcba16d7d Add a frame_index entry to RefCntBuffer
This entry will only be effectively used at the encoder side.
Adding it to the RefCntBuffer data structure would help make the
associated logic a lot simpler. Its effect on the decoder side
would be explicitly sent through the bit-stream.

Change-Id: I1660dce9e0bb6e28c3315d5e0df6dc4a9298f71f
2018-09-17 11:46:17 -07:00
Jingning Han 940a3c3834 Assign GOP frame offset to all the coding frames
Overload the use of arf_src_offset to account for the relative frame
offset for all the coding frames within a GOP.

Change-Id: Ia86dede37c6a93d9f23098c15dbd936acefd75dc
2018-09-17 11:11:58 -07:00
Paul Wilkins 2945e9ebfa Remove multi_arf_last_grp_enabled flag.
Delete flag and associated code.

Change-Id: I899d258a4cd7b84de9136ccfa27cf8a50108b130
2018-09-17 15:55:31 +01:00
Paul Wilkins f0f2ba17d2 Remove multi_arf_enabled.
Remove deprecated multi_arf_enabled flag and associated code.

Change-Id: I73f06362a10faa5b3bd91a78eedb201a96434f18
2018-09-17 15:53:56 +01:00
Paul Wilkins 32f86223ac Remove multi_arf_allowed variable.
Removes deprecated multi_arf_allowed variable and dependent code.

Change-Id: Ic1cf341f807c38207e728c48a4c4442387db93ff
2018-09-17 15:28:23 +01:00
Matt Oliver c71c81ae11 project: Change SMP license to LGPL v2.1. 2018-09-17 16:40:36 +10:00
James Zern 2a5805e2cb cosmetics: normalize include guards
use the recommended format [1] of:
<PROJECT>_<PATH>_<FILE>_H_

[1] https://google.github.io/styleguide/cppguide.html#The__define_Guard
"All header files should have #define guards to prevent multiple
inclusion. The format of the symbol name should be
<PROJECT>_<PATH>_<FILE>_H_."

Change-Id: I2e8ab0b32fb23c30fa43cff5fec12d043c0d2037
2018-09-15 12:25:43 -07:00
James Zern 54827e5782 vp9,encoder: check pointers before member access
verify pointers passed to vp9_cyclic_refresh_free() and
vp9_setup_pc_tree() before attempting to free members of the structs.

based on the change in libaom:
ie41de6b5a AV1FrameSizeTests.LargeValidSizes: avoid segfault.

Change-Id: Ib81759923cb442e19f42e6edb4b61171d8799ba6
2018-09-14 21:36:26 -07:00
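The check described in 54827e5782 follows a common pattern for free functions: test the struct pointer before touching its members. The sketch uses a hypothetical CyclicSketch type and helpers, not the real vp9_cyclic_refresh_free() code.

```c
#include <stdlib.h>

/* Hypothetical struct that owns one heap-allocated member. */
typedef struct {
  unsigned char *map;
} CyclicSketch;

static CyclicSketch *cyclic_sketch_alloc(size_t n) {
  CyclicSketch *cr = calloc(1, sizeof(*cr));
  if (cr == NULL) return NULL;
  cr->map = calloc(n, sizeof(*cr->map));
  if (cr->map == NULL) {
    free(cr);
    return NULL;
  }
  return cr;
}

/* Safe to call with NULL, e.g. after a failed allocation. */
static void cyclic_sketch_free(CyclicSketch *cr) {
  if (cr == NULL) return; /* check before accessing cr->map */
  free(cr->map);
  free(cr);
}
```

Without the early return, a NULL argument would be dereferenced when reading cr->map.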
Liu Peng cb671194c9 fix a bug of tiny_ssim to handle odd frame sizes
Change-Id: Id8ef0eb211517a8f8ec764ec398d16efb9320540
2018-09-14 21:51:40 +00:00
Liu Peng e196a6ae71 fix a bug of tiny_ssim when the bit depth is 8
Change-Id: I2563e661c71b474fe04b70cd9b713d478a27ac5f
2018-09-14 20:10:15 +00:00
Angie Chiang b3a837cbf7 Merge changes Ic6b9330f,Ibe14a023
* changes:
  Fix mv_arr assignment
  Dump tpl mvs for mv search block
2018-09-14 00:10:19 +00:00
Angie Chiang 8d95d2682a Fix mv_arr assignment
Change-Id: Ic6b9330ffb9b75b3a8441024fbf8ba53c134621b
2018-09-13 12:27:22 -07:00
Angie Chiang 7fccd0bef5 Dump tpl mvs for mv search block
Change-Id: Ibe14a02391b960e030c4a48e61718e43a5a65788
2018-09-13 12:27:15 -07:00
Angie Chiang 99f7cf4ce5 Merge "Dump ref frame when DUMP_TPL_STATS is on" 2018-09-13 16:58:43 +00:00
Jingning Han f9b28eab5b Merge "Initial step in deprecating previous dual arf code." 2018-09-13 16:57:52 +00:00
Jingning Han a9659f6392 Merge "Remove deprecated first_inter_index" 2018-09-13 16:19:46 +00:00
Jingning Han b6e4d9c0ba Merge "Remove unused variables from VP9_COMP" 2018-09-13 16:19:39 +00:00
Paul Wilkins f0841f0a40 Initial step in deprecating previous dual arf code.
Always use cpi->multi_layer_arf branch if enable_auto_arf >= 2.

Use enable_auto_arf value to indicate max number of ARF
levels to use in multi-arf case.

Further cleanup of old code to follow in separate patches.

Change-Id: I25cd1e4a119a2d482a15705f5126389054764f9f
2018-09-13 14:21:46 +01:00
Jingning Han da4d6a5e45 Remove deprecated first_inter_index
With the refactoring of the logic that determines whether a frame needs
re-code runs to adapt to the target bit-rate, the variable
first_inter_index is no longer in use. Hence remove it.

Change-Id: I045894ad1f8b1e00fa40d5a55d762bad0d31b27d
2018-09-12 22:51:19 -07:00
Jingning Han f05ba3960a Remove unused variables from VP9_COMP
Change-Id: I853e0925d29becb9c1f84e5c00d84649fb070a07
2018-09-12 22:49:07 -07:00
Angie Chiang 39c02a7eb1 Dump ref frame when DUMP_TPL_STATS is on
Also add a python script to parse the dumped results.

Change-Id: I1abea5a7c04d852ec40ce37d758af21960b6e589
2018-09-12 16:36:28 -07:00
Harish Mahendrakar edc203c4c7 Merge "Revert "Revert "Loopfilter MultiThread Optimization""" 2018-09-12 23:15:24 +00:00
Jingning Han 74b4dd480e Merge changes I7173f2fe,I460b6c4b,I5070657f,I2b3e1e16
* changes:
  Remove some deprecated FRAME_UPDATE_TYPE elements.
  Remove some deprecated constants.
  Remove unused rate control data elements
  Remove extra_arf_allowed.
2018-09-12 21:06:53 +00:00
Matthias Räncker a07707125f Fix buffer overrun of postproc_state.limits
Always allocate cpi->common.postproc_state.limits using unscaled width.

With ./configure --enable-pic --enable-decode-perf-tests
--enable-encode-perf-tests --enable-encode-perf-tests
--enable-vp9-highbitdepth --enable-better-hw-compatibility
--enable-internal-stats --enable-postproc --enable-vp9-postproc
--enable-error-concealment --enable-coefficient-range-checking
--enable-postproc-visualizer --enable-multi-res-encodin
--enable-vp9-temporal-denoising --enable-webm-io --enable-libyuv
segfaults tend to occur in VP9/DatarateOnePassCbrSvcSingleBR.* tests.

This is an analogue to issue
https://bugs.chromium.org/p/webm/issues/detail?id=1374
where a buffer allocated using a scaled width is reused after scaling
back to the original size. Unfortunately, in this case the unscaled
width doesn't appear to be known in the immediate context of the
allocation, so the signature of vp9_post_proc_frame needs to be
changed to provide that information in order to provide a similar fix
as in #1374.

Signed-off-by: Matthias Räncker <theonetruecamper@gmx.de>
Change-Id: I6f943aafbb3484ee94c5b38d7fcdd9d53fce3e5f
2018-09-12 16:57:46 +00:00
Paul Wilkins 8eea617d51 Remove some deprecated FRAME_UPDATE_TYPE elements.
Removal of some frame types relating to deprecated multi-arf work.

Added a dummy value for the USE_BUF_FRAME frame type in the
declaration of the rd_frame_type_factor[FRAME_UPDATE_TYPES] structure.

Change-Id: I7173f2fe33a53117e1bde6f9621efc1a5951240b
2018-09-12 13:16:15 +01:00
Paul Wilkins ec8be10518 Merge "Fix rate control bug with recode all." 2018-09-12 11:57:02 +00:00
Paul Wilkins 5f8ce092bd Remove some deprecated constants.
Removal of some # defines relating to deprecated multi-arf work.

Change-Id: I460b6c4bee9bf0ef588eddc47329c2b17f60e5ba
2018-09-12 12:37:05 +01:00
Paul Wilkins b02d08cc84 Remove unused rate control data elements
Removal of rate control structure elements related to Zoe's
deprecated multi-layer ARF work.

Change-Id: I5070657f91df7bd3f9137cf74016f737313417c8
2018-09-12 12:29:18 +01:00
Paul Wilkins f23391572f Remove extra_arf_allowed.
Removed VP9_COMP element that is no longer used.

Change-Id: I2b3e1e16244074e3510c1467b0e7532213c4ae05
2018-09-12 12:06:36 +01:00
Paul Wilkins 80997347ba Merge "Enable rectangular partition search for speed 1" 2018-09-12 11:03:50 +00:00
Paul Wilkins 3da4d03dd4 Merge changes I306c582c,Id5389285
* changes:
  Remove configure_multi_arf_buffer_updates()
  Remove update_multi_arf_ref_frames()
2018-09-12 11:01:38 +00:00
Paul Wilkins 8164923198 Merge changes I97111413,Idcd6dcbc
* changes:
  Clean up define_gf_group()
  Clean up deprecated gop structure code
2018-09-12 11:01:20 +00:00
Paul Wilkins 4797ed2edb Merge "Set GF frame layer depth to be 0" 2018-09-12 11:00:49 +00:00
Venkatarama NG. Avadhani bf6299010e Revert "Revert "Loopfilter MultiThread Optimization""
This reverts commit 753fd86e86.

This also has the fix for the DoS reported in bug 1558.

BUG=webm:1558

Change-Id: I65ea84e0c11d6bd40d8cb0587dfe934b3ac11dce
2018-09-12 12:04:55 +05:30
Jingning Han 48cbff1d81 Remove configure_multi_arf_buffer_updates()
The bit-stream syntax doesn't support lst2/3/bwd reference frame
update. Remove the deprecated function that relies on such an assumption.

Change-Id: I306c582c2efc63928e4231adef2ee549076a987c
2018-09-11 23:39:14 +00:00
Jingning Han 09aaba2e08 Remove update_multi_arf_ref_frames()
The bit-stream syntax doesn't support the use of lst2/3 frames.
Remove the update_multi_arf_ref_frames() function that assumes
such functionality.

Change-Id: Id5389285c84fe6c578c52d210aa47ef3cb789f8e
2018-09-11 16:38:52 -07:00
Jingning Han a62a64ba0c Merge "Simplify vp9_frame_type_qdelta()" 2018-09-11 23:32:34 +00:00
Angie Chiang b853762124 Merge "Store mv/inter_cost/recon_error/err for ref frames" 2018-09-11 22:52:18 +00:00
Jingning Han 84b2daa422 Clean up define_gf_group()
Remove deprecated extra_arf_allowed code.

Change-Id: I97111413e6465475e750106fddef8f344db53405
2018-09-11 14:27:16 -07:00
Jingning Han 617a021b44 Clean up deprecated gop structure code
Gradually integrate the single-/multi-layer ARF and dual ARF
encoder control. Remove deprecated code.

Change-Id: Idcd6dcbca3f8d7597878d83dec421e16be819f55
2018-09-11 14:07:25 -07:00
Jingning Han cb544f1990 Simplify vp9_frame_type_qdelta()
Make direct use of frame type in the available VP9_COMMON structure.
Eliminate the need to map through rf_level to fetch the frame type.
This change doesn't alter the coding stats. It simplifies the
vp9_frame_type_qdelta() function logic and removes unnecessary
reference to rf_level.

Change-Id: I1a7b2f5abcae39aa4a60d08a6011dde38ecf3b58
2018-09-11 21:02:03 +00:00
Jingning Han 0a14a65f3f Set GF frame layer depth to be 0
Set the golden frame layer depth as 0 - the base layer in temporal
domain.

Change-Id: If63e1524a567fcff6162f4283811298551516be5
2018-09-11 10:02:11 -07:00
Jingning Han 302fb36649 Remove unused constant definition
ARF_DECAY_BREAKOUT is no longer used.

Change-Id: I553f8a3087389f0343444e2551581e9de02d3427
2018-09-11 09:27:33 -07:00
Jingning Han b8b471103c Localize variable definitions
Localize variable definitions in setup_frames() and
two_pass_first_group_inter().

Change-Id: I66e842791d679be6d22cef50e0b395b5aa380eac
2018-09-11 09:24:27 -07:00
Jingning Han a2192701b1 Merge "Rework two_pass_first_group_inter()" 2018-09-11 04:04:31 +00:00
Jingning Han e407b1a616 Merge "Separate frame context index for GOP layers" 2018-09-11 04:04:14 +00:00
Jingning Han 6bc958713f Merge "Assign layer depth for all coding frames" 2018-09-11 04:03:52 +00:00
Hui Su 6665540ed2 Merge "Refactor block_rd_txfm()" 2018-09-11 02:18:18 +00:00
Jingning Han 9868499e4b Rework two_pass_first_group_inter()
This function is used in part to decide whether to trigger the recode loop
for the first normal P frame in a GOP. Rework its design logic to
support the GOP with multi-layer ARF. Allow recode when there is
a transition from ARF/OVERLAY/USE_BUF to normal P frame.

The overall coding performance for multi-ARF gets slightly better
(less than 0.1% for show_existing_frame case). Tested on a few
clips, the encoding speed remains similar too. This change primarily
serves to help integration of multi-layer ARF and dual-ARF systems.

Change-Id: Ia44e44526b05029b1546985b3eb649e767d5444f
2018-09-10 12:00:08 -07:00
Hui Su d4600d4a64 Refactor block_rd_txfm()
Merge two identical if branches.

Change-Id: Ie012ba9c116a30ef6fa2e7868c7a4ba886b99bc6
2018-09-10 11:53:04 -07:00
Angie Chiang 7378c04420 Store mv/inter_cost/recon_error/err for ref frames
This information will help with making better mv search decisions

Add functionality to dump tpl_stats for offline analysis

Change-Id: Ic2ec34368499c9bccb4d1f21a12b66453847fcf2
2018-09-10 10:39:40 -07:00
Jingning Han 740083a97b Separate frame context index for GOP layers
Use separate frame context index to code frames at different layers.
The maximum index cap is set as 3. This improves the compression
performance of multi-layer ARF by 0.15% across the test sets.

The overall coding gains from multi-layer ARF are

         avg PSNR       overall PSNR        SSIM
lowres   -3.9%            -3.7%            -3.2%
midres   -3.5%            -3.2%            -2.3%
nflx2k   -4.3%            -4.6%            -3.0%

Change-Id: I8a0b345fdd47823c018544a6b4748753faf89dc1
2018-09-10 09:19:46 -07:00
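One way to read the "Separate frame context index for GOP layers" message above is that the context index follows the frame's layer depth, clamped to the stated cap of 3. The sketch below assumes exactly that mapping; the function and macro names are hypothetical and are not taken from libvpx.

```c
#include <assert.h>

/* Cap quoted in the commit message: "The maximum index cap is set as 3". */
#define MAX_FRAME_CONTEXT_INDEX 3 /* hypothetical macro name */

/* Assumption: the frame context index equals the layer depth, clamped to
 * the cap, so frames in deeper layers share the last context slot. */
static int frame_context_index_for_layer(int layer_depth) {
  assert(layer_depth >= 0);
  return layer_depth < MAX_FRAME_CONTEXT_INDEX ? layer_depth
                                               : MAX_FRAME_CONTEXT_INDEX;
}
```

With this mapping, layers 0 through 3 each get their own context, and any deeper layer reuses index 3.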
James Zern 96e1c6b7ce fix vp9_svc_adjust_frame_rate signature
match the const in the header; quiets a visual studio warning.

since:
04b3d49ba vp9-svc: Allow for setting framerate per spatial layer.

Change-Id: I0a216eb8fe1a689fe6822bbfac70f7c98e9b1a70
2018-09-08 13:55:25 -07:00
Paul Wilkins bb58dfade7 Fix rate control bug with recode all.
This patch fixes a rate control bug that can manifest if the recode
loop is activated for all frame types. Specifically things go wrong when the
recode loop is used on an overlay frame that has a rate target of 0 bits.

The patch prevents adjustment of the active worst quality and repeat recode
loops for overlay frames.

The bug showed up during artificial experiments on re-distribution of bits in
ARF groups but does not activate in any current encode profile, as even best
quality does not currently allow recodes for all frames.

Change-Id: I80872093d9ebd3350106230c42c3928e56ecb754
2018-09-08 02:45:08 +00:00
Jingning Han 2d65fc2ce6 Merge "Fork auto-alt-ref control" 2018-09-08 02:26:52 +00:00
Jingning Han d02748a607 Merge "Extend auto-alt-ref parameter range" 2018-09-08 02:26:47 +00:00
Angie Chiang 166916d9cf Merge "Add non-greedy-mv experimental flag" 2018-09-08 01:02:21 +00:00
Jingning Han b068888028 Assign layer depth for all coding frames
Assign layer depth for the base layer ARF and the normal frames.

Change-Id: I81cbb2846c3176336622f9006701c0219652905a
2018-09-07 16:03:51 -07:00
Jingning Han 7bbc2ca3fe Merge "Add NORMAL_BOOST macro" 2018-09-07 22:27:23 +00:00
Hui Su 9bd2bde10d Enable rectangular partition search for speed 1
This patch enables rectangular partition search on speed 1. The encoding
speed loss is reduced thanks to recently added speed features.

This only affects speed 1 low bit-depth encoding.

Coding gains:
           avg_psnr     ovr_psnr    ssim
lowres      0.577%       0.621%    0.665%
midres      1.147%       1.215%    1.148%
hdres       0.758%       0.790%    0.769%

Tested encoding speed on 15 midres and 15 hdres clips, average speed
loss:
           QP=30       QP=40        QP=50
midres     4.43%       3.72%       -1.05%
hdres      4.41%       5.65%        3.77%

Change-Id: Ifc0712becccc69f7498796359ff12dbfa63fd7b3
2018-09-07 10:37:59 -07:00
Jingning Han 74a334da7a Fork auto-alt-ref control
Temporarily fork the auto-alt-ref control meaning. When it is set
to be 1, use single layer ARF as baseline. The value 2 would enable
dual ARF system. Any number above it would trigger automatic multi-
layer ARFs.

We would gradually refactor and integrate dual ARF and multi-layer
ARF systems next, and eventually make auto-alt-ref directly control
the layer depth.

Change-Id: I292d27111ae8a596b97444afecf4b896043e543f
2018-09-07 10:31:25 -07:00
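The temporary meaning of the auto-alt-ref value described in "Fork auto-alt-ref control" above can be sketched as a small mapping. The enum and function names are hypothetical, and treating 0 as "ARF disabled" is an assumption, since the commit message only describes the values 1, 2, and above.

```c
#include <assert.h>

/* Hypothetical names for illustration; not libvpx identifiers. */
typedef enum {
  ARF_MODE_OFF,          /* assumption: 0 disables alt-ref frames */
  ARF_MODE_SINGLE_LAYER, /* 1: single layer ARF as baseline */
  ARF_MODE_DUAL,         /* 2: dual ARF system */
  ARF_MODE_MULTI_LAYER   /* above 2: automatic multi-layer ARFs */
} ArfMode;

/* Map the auto-alt-ref control value to the ARF mode it selects. */
static ArfMode arf_mode_from_auto_alt_ref(int auto_alt_ref) {
  assert(auto_alt_ref >= 0);
  switch (auto_alt_ref) {
    case 0: return ARF_MODE_OFF;
    case 1: return ARF_MODE_SINGLE_LAYER;
    case 2: return ARF_MODE_DUAL;
    default: return ARF_MODE_MULTI_LAYER;
  }
}
```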
Jingning Han 8db47dfee5 Extend auto-alt-ref parameter range
Extend the upper limit from 2 (dual ARFs) to maximum ARF layers.
This would later allow --auto-alt-ref to directly control the
ARF layer depth.

Change-Id: I6324fe980122e73dc98f81c8d7de1193a1a16e51
2018-09-07 10:20:56 -07:00
Jingning Han 6c5f88a280 Add NORMAL_BOOST macro
Normal frame boost factor is set to be 100 as the baseline for
ARF boost. Replace the hard coded number with a macro.

Change-Id: I81ce30138f7819844e7a2d811de9e1ccbeb85da5
2018-09-07 10:03:33 -07:00
Marco Paniconi 4ddfa331c4 Merge "vp9-svc: Allow for setting framerate per spatial layer." 2018-09-06 16:17:50 +00:00
Paul Wilkins 9f493870d0 Merge "Fix short first kf bug." 2018-09-06 14:24:05 +00:00
Paul Wilkins bb582b50fc Merge "Revert "Revert "Prevent double application of min rate in two pass.""" 2018-09-06 14:23:54 +00:00
Jingning Han 4c06c02ad5 Merge "Adaptive ARF factor decision" 2018-09-06 03:45:22 +00:00
Jingning Han 6ec984e440 Merge "Recursive rate allocation for multi-layer ARF coding" 2018-09-06 03:45:13 +00:00
Jingning Han 509b27e4a0 Merge "Enable adaptive rate allocation for multi-layer ARFs" 2018-09-06 03:45:08 +00:00
Hui Su c0e9273f40 Merge "Initialize the best partition before partition RDO" 2018-09-05 22:02:22 +00:00
Hui Su 2b30f33e2d Initialize the best partition before partition RDO
This fixes the multi-thread encoder test failure.

Change-Id: I0c1845922068e71097a387db0969ca419accb3ed
2018-09-05 13:00:56 -07:00
Angie Chiang f2e126394c Add non-greedy-mv experimental flag
The experiment aims at making non-greedy mv search decision

Change-Id: I3d77048ce106771fe003f250d07b7ddf0112536f
2018-09-05 11:46:45 -07:00
Marco Paniconi 04b3d49bac vp9-svc: Allow for setting framerate per spatial layer.
Add duration to set_svc_ref_frame_config.

BUG=b/113346831

Change-Id: I63613aed6b1183f98d04831600a6bdd645c740df
2018-09-05 08:55:57 -07:00
Jingning Han bf3f26a935 Adaptive ARF factor decision
Re-count the factors to decide bit boost factor for the
intermediate layer ARFs. Make the gfu_boost factor assigned to
each ARF adapt to its local factors.

This and the recursive change 5bfe9eb together improve the
multi-layer ARF compression performance:

          avg_psnr      ovr_psnr     ssim
lowres    -0.39%        -0.54%       -1.6%
midres    -0.98%        -1.26%       -2.3%
hdres     -0.95%        -1.13%       -2.3%

Change-Id: I5fec3ea75cae58825787dc88dadc7e8697a041ea
2018-09-05 08:52:56 -07:00
Jingning Han c929797b68 Recursive rate allocation for multi-layer ARF coding
Recursively calculate the rate boost for the ARF frames at the
given layer depth from the remaining available bit resource after
the prior layer ARFs consumption.

Change-Id: I0e31bac4f87b895ca20605dc1307a8fc0d2a516d
2018-09-05 08:38:21 -07:00
Jingning Han 20ed66478a Enable adaptive rate allocation for multi-layer ARFs
Increase the bit allocation for the intermediate layer ARFs. The
current strategy assigns higher offset to the lower layer ARFs.
The needed budget is borrowed from the base layer ARF allocation.

Change-Id: I16b6e9cce4dab8e73e7b097674d1a8504205026e
2018-09-05 07:31:52 -07:00
Jingning Han 00207bc812 Merge "Increase encoder buffer for multi-layer ARFs" 2018-09-05 14:30:19 +00:00
Jingning Han 018ffd385f Merge "Structure the multi-layer ARF locations" 2018-09-05 14:30:13 +00:00
Hui Su ba1c053df9 Merge "Move partition search ML models to a seperate file" 2018-09-05 04:31:58 +00:00
Jingning Han 9987c48455 Merge "Assign target bits for multi-layer ARF system" 2018-09-04 18:29:25 +00:00
Hui Su 88cd871418 Move partition search ML models to a seperate file
Clean up vp9_encodeframe.c.

Change-Id: I4035fee94da746c74d72f71ca8334f91c5d10116
2018-09-04 10:54:29 -07:00
Marco Paniconi 57038c687d vp9-svc: Fix to first_spatial_to_encode for pattern constraint.
Change-Id: I876f69acf9420b3b013cb3048bbfa8ff059e2e50
2018-09-04 10:18:37 -07:00
Jingning Han 6367818c50 Increase encoder buffer for multi-layer ARFs
When multi-layer ARF mode is enabled, increase the encoder buffer
to account for the situation where several ARFs are coded together
in a frame packet.

Change-Id: I4e53095f6b6ac5a3c8d79414411ac39880bf1523
2018-09-04 10:00:26 -07:00
Jingning Han 30c22d842c Structure the multi-layer ARF locations
Fine tune the multi-layer ARF location decisions. Support deeper
layer structure.

Change-Id: I3e44cf52b6813f6267bcd7266f9aa1b7ded57f8e
2018-09-04 09:46:47 -07:00
Hui Su 7554416063 Merge "ML based rectangular partition search pruning" 2018-09-04 16:38:12 +00:00
Paul Wilkins 1f46c31844 Fix short first kf bug.
This change is in response to quality issue in b/112953058

The quality regression observed is a result of a bug that manifested
because of a very short key frame group at the start of a chunk.
The group was so short that it was less than the minimum allowed
length of an ARF group, so the initial group was coded as a GF only
group. However, group length was not set correctly and the result
was a frame coded with a target of 0 bits.

This causes two problems:

Firstly one very poor frame that caused the issue to be raised.

Secondly that one frame obviously overshoots its 0 target very heavily
and this has the effect of moving the needle significantly in terms of the
adaptive rate control (specifically the estimate of bits per macro block
used to estimate the active Q range). Consequently there is undershoot
for most of the rest of the chunk and the overall rate ends up much lower
than the target (14Mb/s vs a target of 22Mb/s). (The sharp drop in the
overall rate is also noted in the issue.)

BUG=b/112953058

Change-Id: Ide9cce57acd3dee0f9496b752902e7b4735f2c7f
2018-09-04 16:24:17 +00:00
Jingning Han ae7d53202e Assign target bits for multi-layer ARF system
Keep the ARF and P frame rate allocation distribution. All the
intermediate ARFs are treated same as regular P frames.

Change-Id: I7807b8e6a8f19b6e1b09b9b7d119b3c88ef90b67
2018-09-04 07:28:33 -07:00
Jingning Han 7b6b6ac16a Merge "Properly update the raw_src_frame for psnr calculation" 2018-09-04 14:10:34 +00:00
Jingning Han c19db84816 Merge "Build arf index stack" 2018-09-04 14:08:45 +00:00
Marco Paniconi 7a32bc8f3a Merge "vp9-svc: Add bypass flag to constrain inter_layer." 2018-09-04 04:07:53 +00:00
Marco Paniconi a2f78c7c97 vp9-svc: Add bypass flag to constrain inter_layer.
The additional constraint imposed on inter-layer
prediction should only be done for non-bypass (fixed)
svc mode.

Change-Id: Ia22cdb7bc21684776c9a13397e177a1e1c3d55a2
2018-09-03 10:18:01 -07:00
Paul Wilkins ce29b59282 Revert "Revert "Prevent double application of min rate in two pass.""
This rate control bug in the original patch is not the underlying cause
of the quality regression but simply unmasked a problem which stems
from applying 0 bits to the last frame in a short KF group at the start
of a chunk.

This reverts commit d10b1f2336.

Change-Id: I32c91a24a14d013853bb8e5587aa69600e6a0063
2018-09-03 16:19:58 +01:00
Marco Paniconi a8d8f37d3c vp9-svc: Fix condition for pattern constraints
For fixed/non-flexible SVC mode: on non-key spatial
enhancement layers modify constraint on the inter-layer
prediction to include the first_spatial_layer_to_encode.

Change-Id: I6a59174976ad72d555653704dcd3b03c52e31b6f
2018-09-02 22:24:09 -07:00
Jingning Han cb338d452f Properly update the raw_src_frame for psnr calculation
Update the raw_src_frame to be the current input source frame in
the show_existing_frame mode.

Change-Id: Ia8edf49ca948c45ffe6c60556756b36124ab092a
2018-09-01 03:49:50 +00:00
Jingning Han fc905edb3a Build arf index stack
Stack the ARF frame indexes. Use the most recent one as the ARF
reference frame for frame coding.

Change-Id: I88a2202fa5deb2587d861b434d27ab8de0642cf7
2018-08-31 20:48:33 -07:00
Marco Paniconi d748cfdad2 vp9-svc: Add first_spatial_layer_to_encode per superframe
VP9E_SET_SVC_LAYER_ID sets the first spatial layer to
encode per superframe, so add this parameter to the svc encoder.
This is needed, for example, to properly set is_key_frame for
spatial layers when the base spatial layer is skipped.

Change-Id: Ifd4ac77f539197ec021e62f4c624a6cc79d64f43
2018-08-31 15:45:56 -07:00
Hui Su 2713dba538 ML based rectangular partition search pruning
Add an ML model to predict if rectangular partition search can be skipped
without much coding loss. This model is enabled for speed 0 low bitdepth
only.

Impact on coding performance is minor:
             avg_psnr     ovr_psnr     ssim
lowres       -0.005%       0.005%     0.017%
midres        0.100%       0.114%     0.134%
hdres         0.048%       0.083%     0.074%
jvet480p      0.035%       0.027%     0.044%
jvet720p      0.094%       0.090%     0.174%

Tested encoding speed over 20 midres and hdres clips, average speed
gain is about 8%; maximum speed gain is 23%.

Change-Id: I5d4029dec7134c53ac68ab6cf0c8077dc0b767ed
2018-08-31 11:50:01 -07:00
Jingning Han b1b0074051 Merge "Fix arf_src_offset calculation" 2018-08-31 18:19:47 +00:00
Jingning Han 75c416d02c Merge "Set minimum frame size to be 1 byte" 2018-08-31 18:19:42 +00:00
Jingning Han 7387fc532a Merge "Prepare multi-layer ARF coding structure" 2018-08-31 18:19:32 +00:00
Jingning Han 80e6b12d47 Merge "Build up multi-layer ARF processing order" 2018-08-31 18:19:13 +00:00
Jingning Han 3548366920 Merge "Add element stack operations for arf index control" 2018-08-31 18:19:07 +00:00
Jingning Han 97e9cbe88b Fix arf_src_offset calculation
The offset should be computed with respect to the current coding
process standing.

Change-Id: I63fc303eb062d5fd68b8d1faa3b4172cdfcce168
2018-08-31 16:27:26 +00:00
Jingning Han bffa4a68b0 Set minimum frame size to be 1 byte
The show_existing_frame mode still needs to be sent to the decoder.
Account for this as 1 byte. This would make the encoder properly
update its state.

Change-Id: I32a59ccb5d0e02cc6367c1a264b2de72dc1432a7
2018-08-31 16:27:04 +00:00
Jingning Han 583ea4d07b Prepare multi-layer ARF coding structure
Build the frame processing order and type queue for multi-layer
ARF coding structure.

Change-Id: I5e14c60279020dc65a883d2997ca1ca9ce739488
2018-08-31 09:26:02 -07:00
Jingning Han 430d4567a5 Build up multi-layer ARF processing order
Use DFS to build the multi-layer ARF processing order.

Change-Id: Iba4b20476eb5c8a3db49a24b2b0dec325fade65b
2018-08-31 08:46:51 -07:00
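A depth-first construction of a multi-layer ARF processing order, as named in "Build up multi-layer ARF processing order" above, might look like the sketch below. It assumes each interval places an ARF at its midpoint, codes that ARF first, recurses into the left half, displays the ARF through show_existing_frame, and then recurses into the right half. All type names, function names, and parameters are hypothetical, and the split rule is an assumption, not the code from the commit.

```c
#include <assert.h>

enum { MAX_ORDER = 64 }; /* hypothetical capacity of the order list */

typedef enum { NORMAL_FRAME, ARF_FRAME, SHOW_EXISTING } FrameKind;

typedef struct {
  int offset;     /* display-order offset of the frame within the GOP */
  FrameKind kind; /* how the frame is handled at this processing step */
} OrderEntry;

typedef struct {
  OrderEntry entries[MAX_ORDER];
  int count;
} CodingOrder;

/* Append one step to the processing order. */
static void emit_entry(CodingOrder *order, int offset, FrameKind kind) {
  assert(order->count < MAX_ORDER);
  order->entries[order->count].offset = offset;
  order->entries[order->count].kind = kind;
  order->count++;
}

/* Depth-first traversal over the display interval [start, end]. */
static void build_order(CodingOrder *order, int start, int end, int depth,
                        int max_depth, int min_interval) {
  int i, mid;
  if (start > end) return;
  /* Leaf case: too deep or too short, code the frames in display order. */
  if (depth > max_depth || end - start < min_interval) {
    for (i = start; i <= end; ++i) emit_entry(order, i, NORMAL_FRAME);
    return;
  }
  mid = start + (end - start + 1) / 2;
  emit_entry(order, mid, ARF_FRAME); /* code the ARF ahead of its frames */
  build_order(order, start, mid - 1, depth + 1, max_depth, min_interval);
  emit_entry(order, mid, SHOW_EXISTING); /* display the already coded ARF */
  build_order(order, mid + 1, end, depth + 1, max_depth, min_interval);
}
```

For a caller that zeroes `count` and invokes `build_order(&order, 1, 7, 1, 1, 2)`, the traversal yields the ARF at offset 4, then frames 1 to 3, then the show-existing step for offset 4, then frames 5 to 7.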
Jingning Han 5512dbb889 Add element stack operations for arf index control
Support arf index stack operation.

Change-Id: Ifcf521ffc95a520344824ffc159883b71e8fc7a0
2018-08-31 08:42:19 -07:00
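The stack operations mentioned in "Add element stack operations for arf index control" above can be illustrated with a fixed-capacity integer stack. This is a minimal sketch with hypothetical names and capacity, not the libvpx implementation. The top of the stack stands for the most recent ARF index, which a later commit ("Build arf index stack") describes as the ARF reference used for frame coding.

```c
#include <assert.h>

#define ARF_STACK_CAPACITY 16 /* hypothetical capacity */

typedef struct {
  int items[ARF_STACK_CAPACITY];
  int size; /* number of valid entries */
} ArfIndexStack;

static void arf_stack_init(ArfIndexStack *stack) { stack->size = 0; }

/* Push a new ARF frame index; it becomes the most recent one. */
static void arf_stack_push(ArfIndexStack *stack, int arf_index) {
  assert(stack->size < ARF_STACK_CAPACITY);
  stack->items[stack->size++] = arf_index;
}

/* Remove and return the most recent ARF frame index. */
static int arf_stack_pop(ArfIndexStack *stack) {
  assert(stack->size > 0);
  return stack->items[--stack->size];
}

/* Peek at the most recent ARF frame index without removing it. */
static int arf_stack_top(const ArfIndexStack *stack) {
  assert(stack->size > 0);
  return stack->items[stack->size - 1];
}
```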
James Zern 94ec37097e Merge "cosmetics,lf threading: normalize struct member names" 2018-08-31 03:22:42 +00:00
Johann 00238b7c01 silence c++ abi warning
Linking c++ libraries built with gcc 6 and gcc 7 on arm
generates some warnings because of incompatibilities between those
compilers:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77728

libvpx does not generate a c++ library. C++ is only used for examples and tests.

Change-Id: I3d5d5ef3fb66743bff26a833d6641898975e9f71
2018-08-30 14:28:20 -07:00
Marco Paniconi e5d2860216 vp9: Fix rate control stats for bypass mode in sample encoder
Allow rate control stats to work for bypass mode
in vp9_spatial_svc_encoder.c

Change-Id: I66764a006a73b1fd13c07b4fc4e0c88b2bb2a035
2018-08-30 13:01:34 -07:00
James Zern 753fd86e86 Revert "Loopfilter MultiThread Optimization"
This reverts commit dafe064289.

Corrupted files may cause the decoder to hang as row progress in the
loopfilter is used to progress each thread.

BUG=webm:1558

Change-Id: I0674ce9af14d3fb7b2da8124e7b600616c8e734a
2018-08-30 09:59:10 -07:00
Johann Koenig 0bfab06084 Merge "rtcd: fix --required flag" 2018-08-29 20:17:49 +00:00
Johann 36840e95ca rtcd: fix --required flag
Always parse --required options. Previously they were only parsed for
x86_64.

Make entries passed in additive if there are existing required flags.

Mark 'neon' as required for armv8/aarch64.

BUG=chromium:876548

Change-Id: I55c6aad4536a9d8423e223e5616f3aa26d6b2941
2018-08-29 12:10:28 -07:00
Hui Su 30d29529c9 Merge "Skip unnecessary motion search" 2018-08-29 15:39:31 +00:00
Hui Su 545bd0ca0e Skip unnecessary motion search
If a ref frame is masked out, we do not need to do motion search for it.
It makes speed 0 a little faster.

Change-Id: I68f71255b2798b24fd1d5b28ed24a2ef87251413
2018-08-28 10:30:43 -07:00
Jerome Jiang 5de95cb09f Merge "vp9: Fix ref frame update in denoiser in bypass mode." 2018-08-28 17:27:57 +00:00
Hui Su d1272e9e5e Merge "Revert "Prevent double application of min rate in two pass."" 2018-08-28 17:04:58 +00:00
Jingning Han 0d203054b3 Rework enc/dec mismatch detection
The previous enc/dec mismatch detection assumes the previously
reconstructed frame would always stay at frame buffer pool index
at 0. It could hence cause certain delay in enc/dec mismatch
detection when the immediate reconstruction frame is not yet
propagated to index 0 in the buffer map pool.

This change always keeps the latest decoded show frame buffer
index and directly gets the reconstructed frame from encoder and
decoder buffer pools to check for mismatch.

Change-Id: If53092cbc42ab78d55af5b83f12a489fc362f3ae
2018-08-27 16:40:45 -07:00
Jerome Jiang 26169380d8 vp9: Fix ref frame update in denoiser in bypass mode.
BUG=b/112292577
Change-Id: I8fc5711e44d0317e299aa49f781e9c438bba9d82
2018-08-27 15:32:16 -07:00
Marco Paniconi b2f9b627e3 vp9-svc: Change default pattern for bypass mode
For sample encoder: keep default pattern for bypass
mode to example#0.

Change-Id: Icddc4600d750a23a44b26517a327b546fd8eb412
2018-08-27 12:08:29 -07:00
Jerome Jiang d85c8515e3 Merge "SVC: extend api to specify temporal id for each spatial layers." 2018-08-27 17:54:52 +00:00
Hui Su 8d56220599 Merge "Rework the ref_frame_skip_mask feature in RDO" 2018-08-23 22:00:50 +00:00
Hui Su d10b1f2336 Revert "Prevent double application of min rate in two pass."
This reverts commit 416b7051d7.

Reason for revert: it causes visual quality drop as described in b/112953058.

Original change's description:
> Prevent double application of min rate in two pass.
> 
> The initial allocation of bits in the two pass code to each frame
> should be within the min max limits on the command line. However,
> when forming an ARF group the cost of the ARF is shared by frames
> in that group such that the residual bits for a frame could drop below
> the min value. This change prevents the minimum being re-applied
> after the cost of the ARF has been deducted as this may otherwise
> cause low rate sections to overshoot their target.
> 
> Test runs comparing to a baseline run with min and max section pct
> 0-2000% vs one closer to the YT use case (50-150%) suggest that
> this fix not only results in better rate control but also gives a better
> rd outcome.
> 
> For example the HD set vs 0-2000% baseline (opsnr, ssim).
> Old code (50-150):  +0.751, +1.099
> New code(50-150): +0.241, -0.009
> 
> Change-Id: I715da7b130bf53ba8aa609532aa9e18b84f5e2ef

TBR=yaowu@google.com,paulwilkins@google.com,debargha@google.com,builds@webmproject.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Change-Id: Ic9849e4e0db64e9d92bbb9df9cc923230a15c4df
2018-08-23 14:47:18 -07:00
Jingning Han 85d8fd18d0 Merge "Sync prev_frame/last_show_frame update with decoder" 2018-08-23 17:21:27 +00:00
Jingning Han 0bbf508457 Merge "Skip update prev_mi frame in show_existing_frame mode" 2018-08-23 17:21:23 +00:00
Jingning Han b2453114c7 Merge "Refactor encoder frame count update" 2018-08-23 17:21:18 +00:00
Jingning Han 1f4c8dcf14 Sync prev_frame/last_show_frame update with decoder
Make the encoder side handling of prev_frame and last_show_frame
update synchronized with the decoder behavior.

Change-Id: I0f265391cba182d7cc266a1c327fe6b92e24ab17
2018-08-22 22:11:10 +00:00
Jingning Han 8b9b38bd5e Skip update prev_mi frame in show_existing_frame mode
When the current frame is coded by directly using a reference
frame in buffer, no need to update the prev_mi frame information
for next frame encoding control.

Change-Id: I33fda8e70cdb31eb5b13b63e3dbd6e96ff85154d
2018-08-22 22:10:37 +00:00
Jingning Han aab1b2912b Refactor encoder frame count update
This refactoring allows the encoder to skip frame count update in
the show_existing_frame mode.

Change-Id: Id69707976ccdad144cba93a8f5d36b6947611f91
2018-08-22 15:09:18 -07:00
Jerome Jiang dbcb89be24 Merge "Revert "vp8: Fix memory address overflow in decoder."" 2018-08-22 20:20:15 +00:00
Hui Su fa37ac13b5 Rework the ref_frame_skip_mask feature in RDO
Previously we often skip all compound inter prediction modes,
causing large coding loss. This patch modifies how we set the
ref_frame_skip_mask so that compound modes are considered in RDO.

This affects speed>=1.

Coding gains(overall psnr):
          lowres       midres     hdres     average
speed 1    0.54%       0.43%      0.64%      0.53%
speed 2    0.59%       0.48%      0.60%      0.56%

Tested encoding speed on 10 HD sequences, average speed loss is
5% for speed 1; 2% for speed 2.

Change-Id: Ib8758af7ee7c9812022bd21c5fe61631e2bb8e5c
2018-08-22 11:18:39 -07:00
Jerome Jiang ca9ab3fc46 Revert "vp8: Fix memory address overflow in decoder."
This reverts commit 45cf384738.

BUG=875626,875680,webm:1496

Change-Id: I78037b5e57dbf6cfe326b29beaad1128868f09f2
2018-08-22 11:03:32 -07:00
Jingning Han fe47e82929 Set refresh_frame_context flag off in show_existing_frame mode
To match the decoder's expectation, turn off the refresh_frame_context
flag in show_existing_frame mode.

Change-Id: I5258635b715ea04f41a4a087178709f707449b71
2018-08-22 09:36:12 -07:00
Jingning Han 836d383de2 Drop empty line in vp9_get_compressed_data()
Change-Id: Iadb043128e0f813c75cc726e5a41ce94b9d1de24
2018-08-21 20:32:23 -07:00
Jingning Han 7e1137a5dc Merge "Allow codec to skip temporal filter for intermediate ARFs" 2018-08-22 02:47:57 +00:00
Jingning Han 247c5a117a Merge "Control reference frame refresh flags for USE_BUF_FRAME" 2018-08-22 00:08:47 +00:00
Jingning Han a81a341557 Merge "Safely swap the show frame buffer pointer in show_existing mode" 2018-08-21 23:44:28 +00:00
Jingning Han 36e8d47427 Merge "Skip loop filter operation in show_existing_frame mode" 2018-08-21 23:44:16 +00:00
Jingning Han fe6ae7d53c Merge "Point show frame buffer towards existing frame buffer" 2018-08-21 23:44:10 +00:00
Jingning Han 503cbe35ed Merge "Skip frame encoding when show_existing_frame is on" 2018-08-21 23:43:37 +00:00
Jingning Han 0b5acf0f66 Merge "Add USE_BUF_FRAME enum to FRAME_UPDATE_TYPE" 2018-08-21 23:43:28 +00:00
Jingning Han 1b6a8f12e2 Merge "Unify set_arf_sign_bias function" 2018-08-21 23:43:18 +00:00
Jingning Han f774e0d540 Allow codec to skip temporal filter for intermediate ARFs
Allow the encoder to skip temporal filter for intermediate ARFs
that are later used in show_existing_frame mode.

Change-Id: Ieed635bf7672b62f5c287bde43765f80362a345e
2018-08-21 16:42:41 -07:00
Jingning Han 7a3c9b578d Control reference frame refresh flags for USE_BUF_FRAME
The enum USE_BUF_FRAME makes use of show_existing_frame. In
this setting, all the reference frame buffer conditions will stay
unchanged.

Change-Id: I5b7b28488dbd94982f721667128f004e4e6a00d8
2018-08-21 12:31:21 -07:00
Jingning Han c87895b144 Safely swap the show frame buffer pointer in show_existing mode
Point the current frame buffer towards the existing reference frame.
In the meantime, release the original new_fb pointer.

Change-Id: Ic83a698cac5cdaaabdf61acffb936ec130a84d1c
2018-08-21 11:14:24 -07:00
Jingning Han 28cd84af76 Skip loop filter operation in show_existing_frame mode
Skip the loop filtering for frame coding in show_existing_frame
mode. This matches the decoder operation for show_existing_frame
mode.

Change-Id: I96f275cf5384eb5fe8c0404ec4142cf5b580ac16
2018-08-21 10:50:03 -07:00
Jingning Han 1ec8fc9da7 Point show frame buffer towards existing frame buffer
When the show_existing_frame mode is on, directly point the new
frame pointer towards the existing reference frame buffer entry.

Change-Id: Ic50b25655fe95ea702fb529afacb7701ec17adcb
2018-08-21 10:29:04 -07:00
Jingning Han 3b0d06b599 Skip frame encoding when show_existing_frame is on
No need to process through the frame encoding stage when a current
frame is coded using show_existing_frame.

Change-Id: I36c6f04e344326fa6ecc95cd0a4e4fd6f467fdcb
2018-08-21 10:05:33 -07:00
Jingning Han dcbdab221e Add USE_BUF_FRAME enum to FRAME_UPDATE_TYPE
This enum indicates the use of show existing frame, and conducts
no reference frame buffer update.

Change-Id: I8bf3121376640baf24b580ebea58e9ccbdd641da
2018-08-21 09:47:56 -07:00
Jingning Han 73bb421ec0 Merge "Clean up var define in apply_temporal_filter()" 2018-08-21 03:50:31 +00:00
Harish Mahendrakar dce9e8a563 Merge "Loopfilter MultiThread Optimization" 2018-08-21 03:40:50 +00:00
Jingning Han 0de8be5a77 Unify set_arf_sign_bias function
Determine if an ARF is on the future side by checking if its
offset meets the gop frame length. This unifies the support to
single- and multiple-layer ARF cases.

Change-Id: I5ab26f54311c345a9b574ffca5ff0a8dbcf4c031
2018-08-20 17:07:59 -07:00
Jingning Han ba1a52b9cb Remove unneeded frame_till_gf_update_due assignment
This will get updated after define_gf_group() is called and returned.
No need to update it inside.

Change-Id: Ia42c6f7ef16bca3f1ee88392f3b90b9ebe409da8
2018-08-20 12:59:15 -07:00
Jingning Han 0936ce931a Add multi_layer_arf flag
This flag will control the use of multiple layer arf + show
existing frames.

Change-Id: Ic6b9e8e67b2db7d32706bdf0a14663a39f57295f
2018-08-20 12:47:16 -07:00
Jingning Han a5494e3376 Add a comment in init_gop_frames()
Make the meaning of the operations therein clearer.

Change-Id: I0dce92a4c14218307df098e3da7a1c7cc45008a7
2018-08-20 09:13:06 -07:00
Jingning Han 51d933769f Merge "Skip frame bit-stream writing for show-existing frame" 2018-08-20 15:59:31 +00:00
Jingning Han ff94446775 Merge "Support code show_existing_frame in bit-stream header" 2018-08-20 15:59:27 +00:00
Jingning Han 9d578ad63d Merge "Refactor init_gop_frame()" 2018-08-20 15:59:07 +00:00
Supradeep T R dafe064289 Loopfilter MultiThread Optimization
Adding LPF within the tileworker hook. This means that LPF will be done
immediately after decode, without waiting for all threads to sync.

Performance Improvement -

Platform        Resolution      2 Threads       4 Threads
X86             720p            7.24%           22.04%
                1080p           5.29%           17.02%
ARM             720p            4.61%           8.75%
                1080p           5.55%           12.03%

x86 Improvement measured on Intel Core i7-6700 CPU @ 2.10GHz set
in performance with turbo mode off
ARM Improvement measured on Nexus 6 Snapdragon 805 Quad-core  @ 2.65 GHz

Change-Id: Ifa73c71b40db3fa7fa16f54f4e3aa06d1258caae
2018-08-20 12:07:37 +05:30
Jingning Han 3356cbfe0e Skip frame bit-stream writing for show-existing frame
Make the bit-stream writer match the decoder behavior, when the
show existing frame feature is used.

Change-Id: Ibc8153f8668da0f9a2ed8af3b42dae91a5ac08c7
2018-08-17 16:18:32 -07:00
Jingning Han 21694259e0 Support code show_existing_frame in bit-stream header
Allow the bit-stream writer to support potential use of
show_existing_frame. At this point, cm->show_existing_frame is
always 0.

Change-Id: I64fed1d72db6d4902d56774854ce24fb7a082e0c
2018-08-17 16:18:32 -07:00
Jingning Han eb327e418e Refactor init_gop_frame()
Remove implicit dependency on overlay frame update to break the
gop initialization loop.

Change-Id: I6a6d070cdf22a0e30c298523707bd746fd03f450
2018-08-17 16:18:32 -07:00
Jingning Han a57301cfa4 Clean up var define in apply_temporal_filter()
Change-Id: Iffb90d1ce61a70de52196247e18a31485038e6dd
2018-08-17 22:05:30 +00:00
Jingning Han 98efa1c43f Add inline to mod_index()
Change-Id: I97c03e8d7c0907e728dc183f47564f206388118b
2018-08-17 21:50:56 +00:00
Hui Su 5226940ef2 Merge "Improve enhanced_full_pixel_motion_search" 2018-08-17 15:29:35 +00:00
Tom Finegan 501b2306aa Merge "external_frame_buffer_test: rm duplicate protected:" 2018-08-17 15:15:40 +00:00
Jerome Jiang 917fba2a07 Merge "Refactor: move svc example files to from vpx/ to examples/" 2018-08-17 07:03:54 +00:00
Jerome Jiang 80f9a57843 Merge "Refactor some swapping code with a local func." 2018-08-17 07:02:29 +00:00
James Zern e03ac81de4 external_frame_buffer_test: rm duplicate protected:
Change-Id: I8e41a3c39bb243c0c4d212b5ce36d1179923d783
2018-08-16 23:51:39 -07:00
Hui Su ba2cd73376 Improve enhanced_full_pixel_motion_search
Do full pixel MV search around all 3 MV candidates.

Coding gains for speed 0:
         avg_psnr   ovr_psnr   ssim
lowres   -0.088%    -0.095%   -0.117%
midres   -0.175%    -0.177%   -0.148%
hdres    -0.115%    -0.146%   -0.146%

Coding gains for speed 1:
         avg_psnr   ovr_psnr   ssim
lowres   -0.089%    -0.104%   -0.124%
midres   -0.151%    -0.171%   -0.195%
hdres    -0.110%    -0.105%   -0.132%

Tested encoding speed with speed 1 QP=30,40 over 10 midres sequences,
average speed loss is about 1%.

Change-Id: I9e6de035f4ed2e814e6494aefc2f84aae333a6b4
2018-08-16 16:13:18 -07:00
Jingning Han daa8482fa7 Use YUV components to build the temporal filter
Use both luma and chroma components simultaneously to estimate the
non-local mean kernel and build the temporal filter. It improves
the compression performance primarily for chroma components. Tested
in speed 0 and vbr mode, the coding gains are:

          Overall PSNR   SSIM     PSNR_U      PSNR_V
low        -0.10%       -0.12%    -0.48%      -0.49%
mid        -0.13%       -0.16%    -0.58%      -0.88%
720p       -0.31%       -0.24%    -0.75%      -0.72%
hd         -0.09%       -0.10%    -0.59%      -0.79%
nflx2k     -0.30%       -0.13%    -0.53%      -0.50%

Change-Id: I24d39997818322b0d69bd9dbeda02c60cd2b2e1b
2018-08-16 14:47:31 -07:00
Jerome Jiang ed8f189ccc Refactor: move svc example files from vpx/ to examples/
svc_encodeframe.c and svc_context.h are only used by the example
encoder.

Change-Id: Idb41a5a9d6a229a0bc7d2bc8dbe6575a74efc54c
2018-08-16 14:44:12 -07:00
Jingning Han ca6bfe2cd4 Unify the YUV plane temporal filter operation
Unify the temporal filter operations for the luma and chroma
components. Handle them in a single loop over the pixels in the
processing block.

Change-Id: I9ea1946f3a6fb37da6867aa78140d45cad0facf0
2018-08-16 13:02:45 -07:00
Jerome Jiang a5c17a689f SVC: extend api to specify temporal id for each spatial layers.
BUG=b/112294545

Change-Id: I5be230c8969d69af3ad87068fdf3834ef1af11d9
2018-08-16 10:28:34 -07:00
Jingning Han 8839cffe2d Merge changes I6ab0ece8,I5f878e9b
* changes:
  Temporarily revert to vp9_temporal_filter_apply_c
  Simplify temporal filter strength calculation
2018-08-16 16:53:30 +00:00
Marco Paniconi 6c62530c66 Merge "vp9: Add flatness metric to cyclic refresh setup." 2018-08-16 05:09:05 +00:00
Jerome Jiang 10fb9b94f4 Refactor some swapping code with a local func.
Change-Id: Ic55c156ab12703b05ec0d54d83ed16d40e640abe
2018-08-15 21:12:02 -07:00
Marco Paniconi 2b2a757199 vp9: Add flatness metric to cyclic refresh setup.
For screen-content with aq-mode = 3: identify spatially
flat superblocks in the setup stage and don't mark them as
candidates for refresh. Spatially flat blocks are already
removed from refresh at a later stage in the encoding (in pick_mode),
but doing this at the setup stage of cyclic refresh (before encoding)
allows refresh to more quickly hit the text areas. Only drawback is
an extra source variance calculation for a set of superblocks on
each frame.

Adjust the refresh rate: lower it to reduce overshoot since
more texture areas are hit faster with this change.

Change-Id: I88fa20e52fdbf1a938ae814f9b48c887f1f909d2
2018-08-15 20:53:54 -07:00
Jerome Jiang 27106e507b svc: Force the quantizer to be same as that in encoder config.
Change-Id: I0377ca2ebf63792d7a27de4b8e7e08b38659ecde
2018-08-15 17:56:28 -07:00
Jerome Jiang 47808645a8 Merge "vp9: Remove good mode and speed 0-4 from some datarate tests." 2018-08-15 23:15:57 +00:00
Jingning Han 557fab3678 Merge "Add vpxenc control to turn on/off tpl model" 2018-08-15 04:02:04 +00:00
Jerome Jiang fd1de451cd Merge "vp9: fix memory alloc for adaptive_rd_thresh_row_mt." 2018-08-14 18:58:19 +00:00
Jerome Jiang b4e783da57 vp9: fix memory alloc for adaptive_rd_thresh_row_mt.
When the feature is enabled and the memory is not available, allocate
it. There was a case where speed feature changed in the middle of stream
but the number of tiles stayed the same, memory was not re-allocated.

Another case is where speed for base layer is different than that of
higher quality layers (same resolution). Removed the speed constraints
forcing base layer using same speed setting.

Thus the memory for adaptive_rd_thresh_row_mt stayed NULL but the
feature was enabled.

Add an end to end test to cover this case.

Change-Id: I2f1f802ef98a554571b30094d3600b9439228457
2018-08-14 10:55:31 -07:00
James Bankoski 7b925825a1 Merge "Make Sharpness parameter affect visual sharpness" 2018-08-14 16:01:05 +00:00
Jingning Han d20d9259a5 Add vpxenc control to turn on/off tpl model
The default is set to turn on the temporal dependency model at
speed 0. Use --enable-tpl to control turning it on/off when calling
vpxenc.

Change-Id: I61614cd8100ae57dc01fd46b2a69c5b67287f18a
2018-08-14 08:54:48 -07:00
Jingning Han 56735e5532 Merge "Fix potential encoder failure case in tpl model" 2018-08-14 15:50:50 +00:00
Jim Bankoski 944ede0285 Make Sharpness parameter affect visual sharpness
1: Lower rdmult used in trellis optimization

2: Shut off the end of block optimization that tries end of block
at every sub position if any of the coefficients are > 1.

3: Change the rounding and zbin factor according to sharpness.

4: Disable the skip block check that calculates RD using SSE from
predictor.

Change-Id: I247b61a26fa22f12f8b684e7cd6d4e368de7c3e4
2018-08-14 07:35:00 -07:00
Jingning Han 058046c880 Fix potential encoder failure case in tpl model
When the group of picture runs over 24 in length, skip the use of
temporal dependency model, since the model assumes maximum 25
lookahead frames.

Change-Id: I6386dd33bcdaf1229fae978130b4c3b43d071918
2018-08-13 16:54:23 -07:00
Jerome Jiang 2a17bb0717 vp9: Remove good mode and speed 0-4 from some datarate tests.
Low speeds in good mode are too slow.

Move CBR large tests to non-'Large' ones such that they can run in
Jenkins per commit.

Change-Id: I1da73ca96ee89abcf3566d51ff52f1f2e904a048
2018-08-13 16:35:13 -07:00
Marco Paniconi c70552c01e Merge "vp9-svc: Fixes for cyclic refresh for SVC." 2018-08-13 23:20:58 +00:00
Jerome Jiang a4cc75cd75 Merge "vp9: don't release buffer for current frame." 2018-08-13 22:25:53 +00:00
Marco Paniconi 62261d5eb8 vp9-svc: Fixes for cyclic refresh for SVC.
Add metrics that are being updated per-frame to
the layer struct, so each layer using the cyclic
refresh has the correct update. This is more consistent
for the rate control and refresh rate.

Some improvement in screen content clips.
Neutral for SVC on rtc set.

Change-Id: I0a9862fb6b6a79e894e2ff30c120dc4aa26fcda5
2018-08-13 14:53:41 -07:00
Marco Paniconi 25ca4edf74 vp9: Small refactor on overshoot detection, for cbr real-time.
Change-Id: I70997d35a2371bb4614d716ef0c587fa12ea0f4a
2018-08-13 09:40:47 -07:00
Marco Paniconi f1d44c1f45 vp9-svc: Unittest for screen mode with quality layers
Add datarate unittest for SVC screen content mode,
with 2 quality layers.

Change-Id: I9c8ad5462fd046698052bea6d7343c2b7e16668f
2018-08-12 21:31:33 -07:00
Marco Paniconi b8642738c9 vp9-svc: Fix to updated SET_SVC_REF_FRAME_CONFIG control
Add flag to separate two cases of bypass (flexible) SVC mode:
use of the SET_SVC_REF_FRAME_CONFIG control vs. passing in the
frame_flags in vpx_encode (only used for temporal layers).

This fixes failures in Datarate Temporal layer test,
introduced in commit: a66da31

Change-Id: Ie62f933987c20792d1f963d645e98c1903bdd423
2018-08-11 13:17:42 -07:00
Jerome Jiang 7de10a5f92 vp9: don't release buffer for current frame.
when resync is needed, we flush all frame buffers on key frame.

BUG=b/112406540
BUG=oss-fuzz:9722

Change-Id: Ie53feb12126f25877436eba40317400bf69c6207
2018-08-11 00:00:31 +00:00
James Zern 7c326d7ad0 Merge "loop_filter_rows_mt: use sb_rows to limit workers" 2018-08-10 23:24:18 +00:00
Jerome Jiang f8d4492d22 Refactor: Move code updating ref frames for svc & denoiser.
Make new functions and move them to vp9_denoiser.c and
vp9_svc_layercontext.c

Change-Id: Ia34266ee2831d0f1316b7a641cbbf40fe64e1a0c
2018-08-10 11:27:04 -07:00
Jerome Jiang b626452402 Merge "vp9-svc: Update to SET/GET_SVC_REF_FRAME_CONFIG api" 2018-08-10 16:46:20 +00:00
Jerome Jiang 9a9c01217b Merge "Fix frame drop threshold in vp9 datarate test." 2018-08-10 16:43:31 +00:00
Hui Su 6cd224f358 Merge "Use the pred_mv feature for speed 0" 2018-08-10 16:08:09 +00:00
Jerome Jiang dd601bb067 Fix frame drop threshold in vp9 datarate test.
Change-Id: Ifbd753183782e680a9ae77c55d75f4d9b3fb2477
2018-08-10 08:26:48 -07:00
Marco Paniconi ada6a428f0 Merge "vp9: Allow for overshoot detection for non-screen CBR mode." 2018-08-10 02:24:43 +00:00
Jerome Jiang 538b40699f Merge "Change target bitrate in vp9 datarate test." 2018-08-10 00:54:45 +00:00
Marco Paniconi 97848890a9 vp9: Allow for overshoot detection for non-screen CBR mode.
For CBR real-time mode: refactor usage of speed feature to
handle overshoot on slide/scene change. Add 2 modes to indicate
how slide/scene change is processed for re-setting Q/rate control.
Keep the speed setting to 1 for speed >= 5, otherwise set to 0.

Video content and screen content are now handled in similar way,
though with different thresholds.

Some fixes to thresholds and reset: correct the reset of the buffer
level to optimal level for each temporal layer, if scene change
frame will be encoded at max_q.

Also increase the min_thresh for video mode (non-screen content):
this is to avoid scene change detection in cases like large
lighting changes or camera focus changes. The increase in min_thresh
also makes it more robust to a sudden increase in noise level.

Change-Id: I256d350da6e92d2ddc09f100fc06ac147cbc1e49
2018-08-09 17:38:20 -07:00
Jerome Jiang e70b8ccd52 Change target bitrate in vp9 datarate test.
Due to the change of clip.

Change-Id: Ibbe1a865df0837c349770287d148081230d10aaa
2018-08-09 16:59:00 -07:00
Jingning Han 80d1063c7b Temporarily revert to vp9_temporal_filter_apply_c
The logic inside will be changed through a set of experiments
coming up next.

Change-Id: I6ab0ece8534a796b96a10ee5a9690b19c878a664
2018-08-09 16:54:39 -07:00
Marco Paniconi 69e9a39498 Merge "vp9-svc: Fix for scene detection for SVC" 2018-08-09 22:30:07 +00:00
Hui Su f7148bbdde Use the pred_mv feature for speed 0
Before this patch, pred_mv is used only when the
adaptive_motion_search speed feature is on (speed >= 1).
This patch enables pred_mv for speed 0 as well.

Coding gains:
         avg_psnr   ovr_psnr   ssim
lowres   -0.31%     -0.32%    -0.38%
midres   -0.37%     -0.41%    -0.42%
hdres    -0.30%     -0.31%    -0.29%

Tested encoding speed over 18 midres sequences with QP=40. The
overall speed loss is about 0.6%.

Change-Id: I8987e9efb5a70d2bf8779fc2a43838009f9bbd8a
2018-08-09 14:35:22 -07:00
Jerome Jiang a66da31380 vp9-svc: Update to SET/GET_SVC_REF_FRAME_CONFIG api
Add update_buffer_slot to SVC API to allow for refreshing
any of the 8 reference buffers. Remove frame_flags from
the struct.

Remove svc tests from vp8 build.

BUG=b/112292577
Change-Id: I0551c349d2b311227245a8ed1639cdbbaf5bc5db
2018-08-09 13:56:46 -07:00
Marco Paniconi 45ce790711 vp9-svc: Fix for scene detection for SVC
For spatial layers: use the correct mi_cols/rows in the
scene detection. The scene detection for spatial layers
is only called once per superframe, but we were using wrong
mi_cols/rows (those for base spatial were being used).

Also increase frame_since_key threshold to account for spatial
layers.

Change-Id: I2731da49684a798c4718693a0468eda7db82d2bd
2018-08-09 13:49:42 -07:00
Jerome Jiang 7c4aed2ccc replace video clips used in vp9 datarate CBR tests.
replace hantro_collage with niklas vga clip.

Change-Id: I79b89ce12823095a5ee75025b2ddce9e8ef1452a
2018-08-09 12:58:08 -07:00
Jingning Han 15d041e1e2 Simplify temporal filter strength calculation
Change-Id: I5f878e9b6581bcb427ecc29ce490feb68378f8af
2018-08-08 09:24:16 -07:00
James Zern 0acc7365ab loop_filter_rows_mt: use sb_rows to limit workers
Previously if the number of tiles decreased within a clip and there were
fewer super block rows than workers the mi_row calculation would cause
rows to be skipped. The num_workers stored is the max allocated amount,
use sb_rows to limit the active ones if the row count is smaller as
additional threads will provide no benefit.

Change-Id: I1750296c8c21082de2594afecc4d6a3929db1f12
2018-08-07 20:14:42 -07:00
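The worker-limiting idea in commit 0acc7365ab above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the libvpx code: the helper names active_lf_workers and rows_covered are hypothetical, and the row assignment (worker w takes rows w, w + n, w + 2n, ...) is a simplified stand-in for the real mi_row calculation.

```c
#include <assert.h>

/* Hypothetical helper: the allocated worker count is a maximum, so cap the
 * number of workers actually launched at the number of superblock rows. */
static int active_lf_workers(int num_workers_allocated, int sb_rows) {
  assert(num_workers_allocated > 0 && sb_rows > 0);
  return sb_rows < num_workers_allocated ? sb_rows : num_workers_allocated;
}

/* Hypothetical check: count the superblock rows visited when worker w
 * handles rows w, w + n, w + 2n, ... with n active workers. Every row
 * should be visited exactly once, so the result should equal sb_rows. */
static int rows_covered(int active_workers, int sb_rows) {
  int covered = 0;
  int w, row;
  for (w = 0; w < active_workers; ++w) {
    for (row = w; row < sb_rows; row += active_workers) ++covered;
  }
  return covered;
}
```

With 8 allocated workers and 3 superblock rows, the sketch launches 3 workers, since the extra threads would have no rows to process.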
Hui Su aab2aff9aa Merge "Add enhanced_full_pixel_motion_search feature" 2018-08-07 23:38:37 +00:00
Scott LaVarnway a0b2ff6644 Merge "VPX: Improve HBD vpx_hadamard_32x32_sse2()" 2018-08-07 23:37:31 +00:00
Wan-Teh Chang a03465e44c Merge "Fix typos in the comment for size_group_lookup." 2018-08-07 23:28:04 +00:00
James Zern e7cf9fd444 Merge "vpx_highbd_d153_predictor_4x4_sse2: reduce load size" 2018-08-07 21:53:17 +00:00
James Zern de3bea6683 Merge "test/stress.sh: switch req. for 100 threads to 64" 2018-08-07 21:52:22 +00:00
James Zern 54270e6844 test/stress.sh: switch req. for 100 threads to 64
>64 is invalid for vp9 currently so no testing would be done.

Change-Id: Ic0ccd606d5e76258adb27b7c44dcbd82e94c84d1
2018-08-07 11:37:33 -07:00
Hui Su 763c9e08dd Add enhanced_full_pixel_motion_search feature
Do some extra full pixel search to improve motion vector quality.
Currently it is enabled for speed 1 only; disabled for real time mode.

Coding gain for speed 1:
         avg_psnr   ovr_psnr   ssim
lowres   -0.23%     -0.23%    -0.35%
midres   -0.33%     -0.35%    -0.38%
hdres    -0.28%     -0.29%    -0.28%

Tested encoding time over 10 HD sequences. Overall speed overhead is
1.5% for QP=30; 0.6% for QP=40.

Change-Id: Ic2ea4d78c4979de9d5090c9d7c702944f155f8af
2018-08-07 11:10:57 -07:00
Jerome Jiang d619f02861 Merge "vp9 test: Enable aq 3 and error resilient in realtime for layers." 2018-08-07 18:06:56 +00:00
James Zern 64220915dc vpx_highbd_d153_predictor_4x4_sse2: reduce load size
this avoids reading 4 pixels into another block, which may be operated
on by a different thread. quiets a tsan warning.

Change-Id: Id27ad9d61819b0e5de0230647b4b510f7c265a71
2018-08-07 11:06:08 -07:00
James Zern c47fd31b5a cosmetics,lf threading: normalize struct member names
VP9LfSync / VP9RowMTSync: remove trailing '_' on mutex and condition
variable member names

Change-Id: Iac6bb8fb7c271ae5429d41688e485bc58ea40f23
2018-08-06 20:17:45 -07:00
Johann Koenig 89b1c07344 Merge "vp9: address integer sanitizer warning" 2018-08-07 00:33:03 +00:00
Hui Su 1907d91c42 Merge "Remove unnecessary calls to load_pred_mv()" 2018-08-06 23:00:35 +00:00
Johann a532c243bb vp9: address integer sanitizer warning
Comparing the size values with subtraction requires casting. Sort in
descending order.

(a < b) - (a > b)
If a is greater, this is 0 - 1 = -1
If the values are equal, this is 0 - 0 = 0
If b is greater, this is 1 - 0 = 1

Change-Id: I5c20fd10fbc97c391c6858235c44d25d7db57f0e
2018-08-06 15:55:07 -07:00
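The comparison idiom described in commit a532c243bb above can be shown as a qsort comparator. This is a minimal sketch, assuming the values being sorted are size_t; the names compare_size_desc and sort_sizes_desc are hypothetical and are not taken from the libvpx source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical comparator for descending order. Subtracting two unsigned
 * sizes can wrap around, so compare instead of subtract:
 * (a < b) - (a > b) is -1 when a is greater, 0 when equal, 1 when b is
 * greater, which makes qsort place larger values first. */
static int compare_size_desc(const void *pa, const void *pb) {
  const size_t a = *(const size_t *)pa;
  const size_t b = *(const size_t *)pb;
  return (a < b) - (a > b);
}

/* Hypothetical wrapper: sort an array of sizes, largest first. */
static void sort_sizes_desc(size_t *sizes, size_t count) {
  assert(sizes != NULL);
  qsort(sizes, count, sizeof(sizes[0]), compare_size_desc);
}
```

Because each comparison yields 0 or 1, the result always fits in an int without casting, which is the kind of conversion the integer sanitizer flags in a subtraction-based comparator.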
Jerome Jiang 52829e79ac vp9 test: Enable aq 3 and error resilient in realtime for layers.
aq mode 3 is never properly tested in non realtime mode.

BUG=webm:1553

Change-Id: I0663c9724ee57ba5c528a20b31ef8b6df0e03f6c
2018-08-06 15:36:23 -07:00
James Zern e1acea5a28 Merge "vpxenc: replace uint16 with uint16_t" 2018-08-06 19:24:31 +00:00
Hui Su bc37fc0fec Remove unnecessary calls to load_pred_mv()
This improves compression performance slightly:
Speed 1:
         avg_psnr   ovr_psnr   ssim
lowres   -0.03%     -0.03%    -0.05%
midres   -0.16%     -0.20%    -0.30%
hdres    -0.13%     -0.13%    -0.16%

Speed 2:
         avg_psnr   ovr_psnr   ssim
lowres   -0.02%     -0.02%    -0.03%
midres   -0.08%     -0.06%    -0.10%
hdres    -0.08%     -0.08%    -0.10%

Change-Id: Id357c1f98042f3c7af56f99e534bc81ea9a7cf36
2018-08-06 17:15:30 +00:00
Jerome Jiang 4c7e8b55a6 Merge "vp9: new struct BEST_PICKMODE containing search results." 2018-08-06 16:57:00 +00:00
Mirko Bonadei 2f9aec0e65 vpxenc: replace uint16 with uint16_t
libyuv r1714 disables non-POSIX types by default:
55f5d91f Disable old int types by default.

Change-Id: Ia7086516b0d53d0ff3974e545d41f8b502aaec0d
2018-08-04 11:03:08 -07:00
Wan-Teh Chang 26d47e076a Fix typos in the comment for size_group_lookup.
"b_width_log2" and "b_height_log2" should be "b_width_log2_lookup" and
"b_height_log2_lookup", respectively.

Change-Id: I3ad49e45007cd9fcf5dd463c7d01e22745939231
2018-08-03 14:19:29 -07:00
Marco Paniconi 6fd9d0244c vp9: Add screen-content mode to overshoot detection.
For real-time 1 pass mode: overshoot detection and max_Q
reset should only be for screen-content mode.
This fixes some failures in the 1 pass VBR tests, from
the commit: 2fae9991

Change-Id: I70cbe4e6fd83cfe0c7662f13b779551bf4f319cb
2018-08-03 11:00:08 -07:00
Marco Paniconi e802f3a87d Merge "vp9: Adjust qp_thresh on slide change overshoot detection" 2018-08-03 17:28:57 +00:00
Hui Su 3f3d0adec1 Merge "Refactor vp9_full_pixel_search()" 2018-08-03 17:11:15 +00:00
Hui Su c3a3d62a11 Merge "Handle partition cost better in RD search" 2018-08-03 17:11:00 +00:00
Marco Paniconi 89d23138d2 vp9: Adjust qp_thresh on slide change overshoot detection
For real-time screen-content mode: increase the
qp_thresh for max_Q setting on slide changes.
This will make bitrate spikes less likely on slide changes.

Change-Id: Ie13524a06490214456b1c9c042a864ea0d0750c5
2018-08-03 09:29:16 -07:00
Marco Paniconi 5f914dd902 vp9: Add zero_temp_sad count to scene detection.
For real-time screen-content mode: makes the
scene/slide change detection more robust.

Change-Id: I28d8d28b42bb92d527811f814bf14bbbbb53ab25
2018-08-02 21:23:11 -07:00
Marco Paniconi d22b1d6957 Merge "vp9: Disable re_encode_overshoot feature for speed >= 6." 2018-08-03 03:21:12 +00:00
Marco Paniconi bea5c7e48e vp9: Increase min_thresh for slide change detection
For real-time screen-content mode: increase min_thresh
to avoid some false detection.

Change-Id: I3e93dea63cbd65e3ad5d0af7eabf0d3686fe9943
2018-08-02 17:29:04 -07:00
Jerome Jiang 981488754e vp9: new struct BEST_PICKMODE containing search results.
One of steps to refactor the nonrd pickmode. Simplify arguments list for
functions.

Change-Id: I5f23375caa36be2ae0fbd2ff851b303150a7aa8f
2018-08-02 17:03:30 -07:00
Hui Su 91e2bcdc0c Refactor vp9_full_pixel_search()
Code cleanup; add some comments.
Also remove a redundant call to vp9_get_mvpred_var() at the end when
method is MESH.

Change-Id: I4b58e7e1c42161642708f8b0342ab3c0ce39ed7d
2018-08-02 16:02:38 -07:00
Marco Paniconi 2fae99911e vp9: Disable re_encode_overshoot feature for speed >= 6.
For real-time screen content mode: for speed >= 6 disable
the re_encode_overshoot feature. This means for speed >= 6
the Q and rate control is reset on slide changes based on
the scene/slide detection and the current Q (and not on a
first pass encoded frame at current Q).

This reduces encode time on slide changes, but may be less
accurate in deciding when to reset/max-out the Q.

Change-Id: Id0fdcafd55bc43bd8b3afee211e524f37c8ddce6
2018-08-02 11:36:13 -07:00
Jerome Jiang a3ff9370ae Merge "vp9: Refactor nonrd pickmode: new mv search" 2018-08-02 16:50:44 +00:00
Hui Su 25d6542251 Handle partition cost better in RD search
Take partition cost into consideration during rectangular partition
mode search.

Compression change is neutral. Encoding speed can be a little faster
at low quality settings. With QP=55 at speed 0, average speed up over
15 midres sequences is about 2.7%.

Change-Id: I6d423459675b5f1e4e1475dbbf6f67ab970a4832
2018-08-01 20:29:13 -07:00
Jerome Jiang 9a156855c9 Merge "vp9 svc: Adjust overshoot threshold in datarate test." 2018-08-02 02:55:58 +00:00
Jingning Han 29064a6069 Merge "Use mesh full pixel motion search to build the source ARF" 2018-08-02 02:36:48 +00:00
Jingning Han 88cc292b22 Merge "Add frame pointer to support recon frames in tpl model" 2018-08-02 02:36:41 +00:00
Jingning Han 7b5553e139 Use mesh full pixel motion search to build the source ARF
Append mesh search to the diamond shape search to refine
the full pixel motion estimation for source ARF generation.
It improves the average compression performance.

Speed 0
        avg PSNR     overall PSNR     SSIM
mid      -0.18%        -0.18%        -0.22%
hd       -0.25%        -0.23%        -0.36%
nflx2k   -0.22%        -0.23%        -0.37%

Speed 1
       avg PSNR     overall PSNR      SSIM
mid     -0.10%         -0.08%        -0.11%
hd      -0.25%         -0.27%        -0.38%
nflx2k  -0.20%         -0.20%        -0.34%

The additional encoding time is close to the sample noise
range. For bus_cif at 1000 kbps, the speed 0 encoding time
goes from 83.0 s -> 83.6 s.

Change-Id: I48647f50ec3e8f7ae4550a4bde831f569f46ecf3
2018-08-01 18:15:08 -07:00
Jerome Jiang 340aa428a5 vp9 svc: Adjust overshoot threshold in datarate test.
BUG=webm:1554
Change-Id: I69f9353266a290ae3c6ac9e51c960fff6e1af205
2018-08-01 17:44:57 -07:00
Jerome Jiang 82a38538e0 vp9: Refactor nonrd pickmode: new mv search
Move new mv search to a separate function.

Change-Id: I6ef22d03ccad7b87cb5cd611094de204d508f63e
2018-08-01 13:13:31 -07:00
Jerome Jiang 319c93f20d vp9: Refactor nonrd pickmode: interp filter.
Move interp filter search to new function.

Change-Id: I6ac57d5b3800c9944732a84a4d4a825a6c0f4c35
2018-08-01 05:01:14 +00:00
Jerome Jiang d552d88ea8 vp9: Refactor nonrd pickmode: tx_size.
Remove duplicated code to calculate transform size.

Change-Id: Id71772607eea911f24b59168c0629ba5ff891afb
2018-08-01 05:01:00 +00:00
Jerome Jiang e1ef0a2709 Merge "vp9 svc: Fix the scaling factor in intra only test for 1 SL." 2018-08-01 04:53:15 +00:00
James Zern 9a1c70166d Merge "vp9_encoder: make setup_tpl_stats() static" 2018-08-01 03:53:02 +00:00
Jerome Jiang 013014f81b vp9 svc: Fix the scaling factor in intra only test for 1 SL.
Change-Id: I7f71c165f6d3a6d02229798286269389c3c5528c
2018-07-31 20:27:38 -07:00
Marco Paniconi 98e134acec Merge "vp9: Clamp tx_size in model_rd_large" 2018-08-01 02:18:47 +00:00
James Zern a8157ce3e8 vp9_encoder: make setup_tpl_stats() static
Change-Id: If96519fb1cb4963cb6548c803253359a35621eb0
2018-07-31 18:19:02 -07:00
Marco Paniconi cf74866767 vp9: Clamp tx_size in model_rd_large
For nonrd_pickmode: add clamp/check to make
sure tx_size is not set to lower than 8X8,
for the model_rd_large function (which is only
called for big block sizes).

No change in behavior.

Change-Id: I9c6093068e406ac16cfd6784ba75868906225378
2018-07-31 15:30:38 -07:00
James Zern 7d750e27e3 vp9: enable tpl model in high-bitdepth w/8-bit output
this keeps the output between CONFIG_VP9_HIGHBITDEPTH=0/1 the same when
targeting 8-bit.

Change-Id: I5290681fdd3e0c1620578e5f804f68010c6dd210
2018-07-31 15:26:16 -07:00
Jerome Jiang ff9e455fe7 Merge "vp9: Disable aq mode for some datarate tests." 2018-07-31 19:50:01 +00:00
Jerome Jiang e3b45f180e vp9: Disable aq mode for some datarate tests.
It caused failure on vp9 datarate tests for temporal layers.

Change-Id: Id6e260efa33b3b08070391a91a013efef2706fb5
2018-07-31 10:50:08 -07:00
Marco Paniconi 79415f7e34 vp9: Remove assert from model_rd in non-rd pickmode.
The assert checks for tx_size >= 8x8, but 4x4 can
be set in some cases.

Change-Id: I8bf9683e1add768becaa1208e1709ad0470e3850
2018-07-31 09:35:15 -07:00
Jingning Han 6b22c999ca Add frame pointer to support recon frames in tpl model
Add frame pointer to re-use spare frames to store the reconstructed
frames.

Change-Id: I870aa048fc9b7d8b356aa73df3a92b4670425f95
2018-07-30 21:13:55 -07:00
Jingning Han 3ff77503c9 Merge changes Ibafb6157,Ibebced5d
* changes:
  Move frame pointer assignment outside block loop in tpl model
  Refactor tpl_model_store input parameters
2018-07-31 04:11:45 +00:00
James Zern a3c2796039 Merge "test/stress.sh: add --token-parts coverage for vp8" 2018-07-31 03:08:55 +00:00
Marco Paniconi 12781ef3cc Merge "vp9: Add scene change detection flag to cyclic refresh setup" 2018-07-31 03:06:42 +00:00
Jingning Han d06839dd27 Move frame pointer assignment outside block loop in tpl model
Change-Id: Ibafb61577a6293c6ad32bda484a786602afda2e6
2018-07-31 02:54:49 +00:00
Jingning Han 2372837df8 Refactor tpl_model_store input parameters
Simplify the pass-in data structure. Use a reference TplDepStats
pointer to replace multiple data sent in.

Change-Id: Ibebced5d7f411d2c4a8a34a9b7eb87453fb78d13
2018-07-30 19:48:23 -07:00
Marco Paniconi 2be0118dc9 vp9: Add scene change detection flag to cyclic refresh setup
Disable cyclic refresh on slide/scene change frame. It was already
disabled on the re-encode for the slide change, but this change
makes sure it's always disabled on a detected slide change (which
may not be re-encoded at high Q).

Change-Id: I1195c855bca25985d4d41e5b657adf124e901760
2018-07-30 19:08:37 -07:00
Jingning Han 597f56efdf Merge "Use diamond search to build tpl model and arf frames" 2018-07-31 00:57:14 +00:00
Jerome Jiang 871a4b69ac Merge "Enable aq mode 3 for all datarate tests." 2018-07-30 23:56:06 +00:00
Jerome Jiang f0c57a3f74 Merge "vp8: Fix memory address overflow in decoder." 2018-07-30 23:27:59 +00:00
Jerome Jiang b0ee2beff9 Enable aq mode 3 for all datarate tests.
Change-Id: I4e9c73d6d1d9ea560f04cc37aaf99d58ec2ab551
2018-07-30 15:51:00 -07:00
Jingning Han 2eba086685 Merge "Remove unused variables from VP9_COMP" 2018-07-29 13:58:13 +00:00
Martin Storsjo 510ae7b5a5 arm: Consistently use unified syntax for asm
The ".syntax unified" directives in a few source files aren't valid
ADS assembly directives, and they break compilation for windows,
since ads2armasm_ms.pl doesn't handle them.

Explicitly add them via ads2gas.pl and ads2gas_apple.pl instead,
and tweak one instruction to be valid unified syntax.

Change-Id: I37f1709f163d11474597161fe02eb433859cb9b8
2018-07-28 11:56:49 +03:00
James Zern 0f2ef13d16 test/stress.sh: add --token-parts coverage for vp8
Change-Id: I46f39cbc0441d09f5ad0b3887d2372b0be9abd4f
2018-07-27 23:37:38 -07:00
Angie Chiang 2d79df4940 Merge "Remove an extra vp9_encode_frame call" 2018-07-28 01:06:58 +00:00
Jingning Han b8a66f8c45 Remove unused variables from VP9_COMP
Change-Id: I3bdd44e65b56c7600b9faadd2c117138c3911c14
2018-07-27 15:25:56 -07:00
Jingning Han dbe4d9c4eb Use diamond search to build tpl model and arf frames
Use diamond search for full pixel motion estimation to build
the temporal dependency model and the source arf frame. This gives
better full pixel motion estimation accuracy. It improves the
compression performance.

In speed 0,
         avg PSNR     overall PSNR     SSIM
midres    -0.32%        -0.30%        -0.65%
hdres     -0.88%        -0.91%        -1.31%
nflx2k    -0.47%        -0.48%        -0.81%

In speed 1,
        avg PSNR      overall PSNR     SSIM
midres    -0.24%        -0.28%        -0.50%
hdres     -0.82%        -0.83%        -1.18%
nflx2k    -0.58%        -0.60%        -0.89%

The encoding speed change is minor due to the fact that such motion
estimation is triggered once at the beginning of each group of
picture coding.

Change-Id: Ib25c0ff4f7450c85fd7a38d24319bd7ae1b9dac8
2018-07-27 11:16:50 -07:00
Harish Mahendrakar 9d782631ab Merge "Add New Neon Assemblies for Motion Compensation" 2018-07-27 17:48:21 +00:00
Angie Chiang 46c0cc11b6 Remove an extra vp9_encode_frame call
The coding performances drop slightly in speed 0
lowres 0.021%
midres 0.043%
hdres 0.087%

The speedups in speed 0 are observed as follows
city_cif.y4m 4.5% speedup
pamphlet.y4m 6.9% speedup

Change-Id: I2f6209964ffdf7a93919b79033d8e6f9bc44d824
2018-07-27 10:47:00 -07:00
Jerome Jiang 8dfb7caeb5 Merge "vp9: release frame buffer on key frame." 2018-07-27 17:31:35 +00:00
Marco Paniconi 187fac45a4 Merge "vp9: 4x4 tx_size for nonrd-pickmode for screen content" 2018-07-27 06:02:34 +00:00
Marco Paniconi d22ea1d3f6 vp9: 4x4 tx_size for nonrd-pickmode for screen content
Force 4x4 transform size under some conditions for real-time
screen-content mode. Improvement on text in some screen clips.

Change-Id: I77cafa23ea1060ef4334dc07eac53189bf80e0ec
2018-07-26 22:35:25 -07:00
Jerome Jiang 8c82bda0b8 vp9: release frame buffer on key frame.
Add tests with corrupted frames and periodic key frames.

BUG=webm:1545

Change-Id: Ic0684bdafd01507036f56465387b9d2187b1458e
2018-07-26 21:51:05 -07:00
Hui Su d0ad2e25d1 Merge "Fix multi-thread encoder result test" 2018-07-27 02:19:58 +00:00
Hui Su 9775236ca1 Fix multi-thread encoder result test
Fix multi-thread encoder result test induced by
the prune_ref_frame_for_rect_partitions speed feature.

BUG=webm:1552

Change-Id: Idc3b3759651f76285ffd90059c6a2846c4d91a00
2018-07-26 16:10:08 -07:00
Venkatarama NG. Avadhani 090b3b02c2 Add New Neon Assemblies for Motion Compensation
Commit adds neon assemblies for motion compensation which show an improvement
over the existing neon code.

Performance Improvement -

Platform        Resolution      1 Thread        4 Threads
Nexus 6         720p            12.16%          7.21%
@2.65 GHz       1080p           18.00%          15.28%

Change-Id: Ic0b0412eeb01c8317642b20bb99092c2f5baba37
2018-07-26 15:33:46 -07:00
Tom Finegan 6a908db22a Move CONFIG_SIZE_LIMIT check in yv12config.c.
Avoids a C90 compile error.

BUG=webm:1551

Change-Id: Iee0f208de053c2a399aafa015d370c0496878816
2018-07-26 13:59:21 -07:00
Harish Mahendrakar e55e3f8031 Merge "vpxdec: only call row-mt control for vp9" 2018-07-26 19:57:53 +00:00
Harish Mahendrakar fb71f98165 vpxdec: only call row-mt control for vp9
BUG=webm:1549
Change-Id: Ib31b22f0d982e3a7c6a200274582cda7528d1ec9
2018-07-26 12:07:18 -07:00
James Zern 3b921d49b0 Merge "vp9: fix OOB read in decoder_peek_si_internal" 2018-07-26 05:15:34 +00:00
Marco Paniconi cb1868f9c3 Merge "vp9: Modify condition for force test of intra" 2018-07-26 04:47:30 +00:00
Jingning Han 9c13add33a Merge "Clean up get_overlap_area function" 2018-07-26 03:53:22 +00:00
Jingning Han 5de72718b2 Merge "Factor out mode estimation process in tpl model build" 2018-07-26 03:53:11 +00:00
Marco Paniconi 16b0322022 vp9: Modify condition for force test of intra
For real-time/nonrd_pickmode: under some conditions
force check of intra modes for flat blocks with motion.
Reduces artifacts for screen-content mode.

Change-Id: If320f41a90982b14c48d91150f59f048a62982b1
2018-07-25 20:48:28 -07:00
Marco Paniconi 23fc20e363 vp9: Avoid early breakout on slide change
For real-time screen content: don't allow early
breakout in nonrd-pickmode on slide change.
Avoid artifacts.

Change-Id: I09c6927a5d85b46ce059ea5954a3719a7362fb99
2018-07-25 17:30:38 -07:00
James Zern 0681cff1ad vp9: fix OOB read in decoder_peek_si_internal
Profile 1 or 3 bitstreams may require 11 bytes for the header in the
intra-only case.

Additionally add a check on the bit reader's error handler callback to
ensure it's non-NULL before calling to avoid future regressions.

This has existed since at least (pre-1.4.0):
09bf1d61c Changes hdr for profiles > 1 for intraonly frames

BUG=webm:1543

Change-Id: I23901e6e3a219170e8ea9efecc42af0be2e5c378
2018-07-25 17:18:30 -07:00
Marco Paniconi a365f3fc2b Merge "Revert "vp9: Adjust reset segment for real-time screen-content"" 2018-07-25 21:19:41 +00:00
Hui Su 125f2f062f Merge "Fix typos in txfm_rd_in_plane()" 2018-07-25 19:29:06 +00:00
Marco Paniconi 904589ba93 Revert "vp9: Adjust reset segment for real-time screen-content"
This reverts commit d72cd51d83.

Reason for revert:
Doesn't seem to really remove the artifact that was the cause for this change. Reverting for now.

Original change's description:
> vp9: Adjust reset segment for real-time screen-content
> 
> For real-time screen content mode when the short_circuit
> flat_blocks feature is enabled: reset segment to 0 for
> coding block if it's flat, regardless of temporal source_sad.
> Reduces some artifacts on flat areas.
> 
> Change-Id: I9620e424bedc5a13f87cc4f66af7c0e86043c89c

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com

# Not skipping CQ checks because original CL landed > 1 day ago.

Change-Id: I83ee9fd75bfb621a4f3e9afbcc07e7c6ca5c51d6
2018-07-25 18:23:25 +00:00
Yaowu Xu 16c80212bd Merge "Improve help message for arnr-type" 2018-07-25 17:21:26 +00:00
Jingning Han 2294a2e995 Clean up get_overlap_area function
Remove unneeded variable definition.

Change-Id: Ifc8097b249acee86301e2040df8d39ecaca5ab17
2018-07-25 08:14:43 -07:00
Scott LaVarnway 36ea670e3c VPX: Improve HBD vpx_hadamard_32x32_sse2()
BUG=webm:1546

Change-Id: I48224f047547b666c519e0cc23706dd0bda5df20
2018-07-25 05:39:52 -07:00
Scott LaVarnway c14f1abef2 Merge "VPX: avg_intrin_sse2.c, avg_intrin_avx2.c cleanup" 2018-07-25 10:45:46 +00:00
Paul Wilkins 199de0bb7e Merge "Limit min Q for normal frames." 2018-07-25 08:23:33 +00:00
Jingning Han 2ca41bd903 Factor out mode estimation process in tpl model build
Use standalone function to process the mode search and rd cost
estimate for a given coding block.

Change-Id: I77cdbded43966c4546e5407ae318129d89d888a4
2018-07-24 23:03:44 -07:00
Marco Paniconi 9416e222e0 Merge "vp9: Fix to the segment weight for cyclic refresh." 2018-07-25 03:55:51 +00:00
Marco Paniconi 6ef6268387 Merge "vp9: Modify logic for flat blocks in nonrd-pickmode." 2018-07-25 03:33:02 +00:00
Yaowu Xu 14fc43825e Improve help message for arnr-type
BUG=webm:1346

Change-Id: Ia6c1cee3704a6b44515d883b4d0632ac567bc9a2
2018-07-24 17:18:08 -07:00
Hui Su e99779cfa2 Fix typos in txfm_rd_in_plane()
Change-Id: I1c62e51f5ccd33ff74abc3385410525bcae2fedd
2018-07-24 16:55:46 -07:00
Marco Paniconi af6d8456a3 vp9: Modify logic for flat blocks in nonrd-pickmode.
For real-time screen content mode: when slide change
is detected, for spatially flat blocks (source_variance = 0) on
the re-encoded frame, skip inter modes (so force intra) if
non-zero temporal variance is detected for the coding block.
Add flag to keep track of re-encoded frame at max Q.
Reduces artifacts on slide change.

Change-Id: I28151f412aba6ab8cb03f30087c7ce16d443654b
2018-07-24 15:21:09 -07:00
Wan-Teh Chang 94a65e8fba Check size limit in vpx_realloc_frame_buffer.
If CONFIG_SIZE_LIMIT is defined, vpx_realloc_frame_buffer should fail if
width or height is too big.

This carries over commit ebc2714d71a834fc32a19eef0a81f51fbc47db01 of
libaom: https://aomedia-review.googlesource.com/c/aom/+/65521

Change-Id: Id7645c5cefbe1847714695d41f506ff30ea985f6
2018-07-24 12:14:54 -07:00
Scott LaVarnway d4d9fc13cd VPX: avg_intrin_sse2.c, avg_intrin_avx2.c cleanup
Change-Id: I710b66dc571a6bd38fbcc2528486d5e028a68b37
2018-07-24 05:29:55 -07:00
Scott LaVarnway c934d9d65c Merge "VPX: Improve HBD vpx_hadamard_32x32_avx2()" 2018-07-24 12:11:37 +00:00
Paul Wilkins bbfc3290a4 Limit min Q for normal frames.
This patch limits the active min Q for normal frames based on the previous
KF/GF/ARF. In a few cases, especially at the end of a clip where there
has been systemic underspend, (as is often the case with slide shows),
this prevents the encoder rapidly dropping Q on normal frames (just to
try and use up bits), such that they end up with a lower Q than the key
frame / GF / ARF off which they key.

Change-Id: Ic8def5c0d1e37ca2202e007ec1d13e501c0a91dd
2018-07-24 12:49:13 +01:00
Marco Paniconi 49434a1201 Merge "vp9: Adjust reset segment for real-time screen-content" 2018-07-24 06:47:54 +00:00
Marco Paniconi d72cd51d83 vp9: Adjust reset segment for real-time screen-content
For real-time screen content mode when the short_circuit
flat_blocks feature is enabled: reset segment to 0 for
coding block if it's flat, regardless of temporal source_sad.
Reduces some artifacts on flat areas.

Change-Id: I9620e424bedc5a13f87cc4f66af7c0e86043c89c
2018-07-23 22:27:17 -07:00
Hui Su 8991d1c18d Merge "Add prune_ref_frame_for_rect_partitions feature" 2018-07-24 02:41:25 +00:00
Jingning Han 260965646d Merge "Pass in block size for motion search function" 2018-07-24 02:36:37 +00:00
Jingning Han 9488dfced0 Merge "Make the tpl model update operated in 8x8 block unit" 2018-07-24 02:36:27 +00:00
Jingning Han 623471efa3 Merge "Refactor overlap area computation" 2018-07-24 02:36:22 +00:00
Jingning Han f99b7bc647 Merge "Map coding block size to transform block size" 2018-07-24 02:36:15 +00:00
Jingning Han b935b61874 Merge "Refactor tpl model update function" 2018-07-24 02:36:10 +00:00
Jingning Han 53980fd3bc Merge "Scale the distortion mectric with tx size" 2018-07-24 02:36:04 +00:00
Scott LaVarnway dd2c6cc34a VPX: Improve HBD vpx_hadamard_32x32_avx2()
~14% improvement.

BUG=webm:1546

Change-Id: I0b25f62f053e13c2185e4e8bd54e52250251efd0
2018-07-24 00:31:38 +00:00
Jingning Han 4cb688fc9d Pass in block size for motion search function
Use parameter block size to control the motion estimation function
in tpl model building.

Change-Id: I4d9ec28aa15d0fb51a94aacd9bd50810add7ce29
2018-07-23 14:30:22 -07:00
Jingning Han 92733261a8 Make the tpl model update operated in 8x8 block unit
Store and update the temporal dependency model in the unit of
8x8 block.

Change-Id: Ic580495242b51db9beaf38dae67968cbd212be4d
2018-07-23 14:13:10 -07:00
Scott LaVarnway 9e6fa9bfb8 Merge "VPX: Add vpx_hadamard_32x32_avx2" 2018-07-23 21:09:38 +00:00
Jingning Han e650974202 Refactor overlap area computation
Account for the variable operating block sizes.

Change-Id: I4eac4d0b84cf55fbf5c693007c991afe6171ca6a
2018-07-23 14:03:29 -07:00
Scott LaVarnway a83d11f9c4 VPX: Add vpx_hadamard_32x32_avx2
BUG=webm:1546

Change-Id: I64629ed83cb7acd0f2ac49b9c31f369d17a1aed2
2018-07-23 12:49:50 -07:00
Hui Su b54cdcc3de Add prune_ref_frame_for_rect_partitions feature
Add a speed feature to prune reference frames for rectangular
partitions. Rectangular partition RD search happens after square
partition RD search. With this feature, we keep record of the ref
frames picked by square partitions, and only consider those ref
frames during rect partition RD search.

With this feature on, the computation cost of rect partition RD
search is greatly reduced, so we can afford to skip rect partition
RD search less aggressively.

Overall, both compression and encoding speed are improved. Only
speed 0 is affected.

Coding gains:
              lowres    midres    hdres
ovr psnr      0.00%    -0.36%    -0.37%
avg psnr      0.00%    -0.36%    -0.36%

Tested encoding speed with QP=40 on about 30 sequences.
Speed gains:
              lowres    midres    hdres
average       13.4%      7.1%     6.1%
max           28.0%     12.0%     9.8%

Change-Id: Id5f36dd2ac75028ae98550d67b0a524aa251b692
2018-07-23 10:22:32 -07:00
Paul Wilkins 1ca82d2ead Merge "Fixed "MAX" boost for static kf sections." 2018-07-23 13:50:57 +00:00
Paul Wilkins 1842f17e78 Merge "Fix issue with short static KF groups." 2018-07-23 13:50:51 +00:00
Paul Wilkins 5dab9c6ce9 Merge "Limit Max GF boost for slide shows" 2018-07-23 13:50:44 +00:00
Paul Wilkins c98308a40d Merge "Tweaks to determination of slide show groups." 2018-07-23 13:50:34 +00:00
Paul Wilkins 4e8e7975e1 Merge "Improved coding on slide show content." 2018-07-23 13:50:13 +00:00
Scott LaVarnway e858863dda Merge "VPX: Add vpx_hadamard_32x32_sse2" 2018-07-22 23:10:12 +00:00
Scott LaVarnway 31f5369808 Merge "VPX: Improve HBD vpx_hadamard_16x16_sse2()" 2018-07-22 23:09:42 +00:00
Jingning Han 5fb8d15a12 Map coding block size to transform block size
Change-Id: I89e18262a2736c0e86f7c30513179806a926827e
2018-07-22 07:33:42 -07:00
Jingning Han 73b65e763c Refactor tpl model update function
Fill up all the blocks inside an operating unit with the provided
statistics.

Change-Id: I93556e0daf9f08cbe62d3c12cf38b5e26ad7c799
2018-07-22 07:33:37 -07:00
Jingning Han 6dba2fb777 Scale the distortion mectric with tx size
Properly scale the distortion metric according to the transfer
function gain of the transform block size.

Change-Id: I8e3539d8936f5db78c1352f902f72ef19fc09ed8
2018-07-22 07:33:37 -07:00
Jingning Han 2013915a95 Merge "Replace hard coded numbers in tpl model" 2018-07-22 14:32:52 +00:00
Scott LaVarnway 94b96e4d16 VPX: Add vpx_hadamard_32x32_sse2
BUG=webm:1546

Change-Id: Ide5828b890c5c27cfcca2d5e318a914f7cde1158
2018-07-21 18:10:05 +00:00
Harish Mahendrakar c4f943d7d7 Merge "Add Flag to Enable Row Based MultiThreading" 2018-07-21 00:42:56 +00:00
Venkatarama NG. Avadhani 37dc766570 Add Flag to Enable Row Based MultiThreading
This commit adds a command line argument "--row-mt". Passing "--row-mt=1" will
set the row_mt flag in the decoder context. This flag will be used to
determine whether row-wise multi-threading path is to be taken when the
row-wise multi-threading functions are added.

Change-Id: I35a5393a2720254437daa5e796630709049e0bc2
2018-07-20 18:18:29 +00:00
Jingning Han cecf1d17d0 Replace hard coded numbers in tpl model
Change-Id: I1adedfccf9aa874d0980f1181066b3682614a8cb
2018-07-20 09:43:23 -07:00
Scott LaVarnway 3e4e4f85a3 VPX: Call vpx_hadamard_16x16_c() in vpx_hadamard_32x32_c()
instead of vpx_hadamard_16x16().

Change-Id: Ie16aacad39d7f429e282dd4c93e57c07000d0f29
2018-07-20 09:17:13 -07:00
Paul Wilkins 59bbb33134 Fixed "MAX" boost for static kf sections.
Apply a fixed maximum boost for static key frame
groups / slide show content (if > 8 frames long).
This ensures sufficient boost on shorter sections
whilst preventing excessive boost on longer sections.

Change-Id: I5b857dab023d674cfd55bced3437f3bce3b4f1cb
2018-07-20 16:39:55 +01:00
Paul Wilkins 7409225a3c Fix issue with short static KF groups.
Where a KF group is very short but static make sure
it is coded as a single GF group. Previously there was a
bug where such groups could be coded as an arf group
with the arf in the next scene.

Change-Id: I4504ae2b03c4877fcecfa58dd503879aa4eefac4
2018-07-20 16:38:32 +01:00
Paul Wilkins e70fa980b8 Limit Max GF boost for slide shows
Set an upper limit on the maximum boost for a static
GF only group such as in slide shows as part of tweaks
to quality / rate trade off.

Change-Id: Ic72575328419cdcf82ad3a20a1d9b947538c25c6
2018-07-20 16:36:05 +01:00
Paul Wilkins d5630bfb14 Tweaks to determination of slide show groups.
Slight adjustment to rules for defining static groups.
Adjustment of small bias towards 0,0 motion in first pass.

Change-Id: Id1d3753979ad54622f983f4de08472738317ec8e
2018-07-20 16:35:41 +01:00
Paul Wilkins 31228b3595 Improved coding on slide show content.
This patch adds in detection of slide show content and allows
for coding of long GF only groups up to a length of 240 frames rather
than coding a large number of shorter ARF groups that gradually
lower the Q.

In test samples this patch gave rise to a substantial improvement in
overall psnr and a drop in data rate. In some cases the average psnr
fell, however, with the boost and minQ values set as they are.
This is to be expected because average psnr is dominated by the
best frames in the sequence and previously a relatively poor key frame
could be followed by progressively better alt refs. For example a key
frame at q7.5 but subsequent alt refs improving it to lossless.

For slides displayed for several seconds, savings of >= 20% (or
commensurate quality gains) are likely.

This patch allows for long GF groups in static sections before and after
complex transitions (e.g. fades) with one or more normal ARF groups
during the transition. However, it enforces a single "normal" length
GF group after the transition before any extended group is allowed.
The reason for this is that the ARF that spans the transition may not have
a very high quality and hence may not be a good GF for the long static
section that follows.

Change-Id: I66cc404c3b85e87dae9829b49d9d631cbf04e037
2018-07-20 16:34:20 +01:00
Scott LaVarnway 17ffb22d4d VPX: Improve HBD vpx_hadamard_16x16_sse2()
~12% improvement.

Change-Id: Ieca4d870a4c1c5ea2c689e27fc4550fcbab9f867
2018-07-20 04:11:04 -07:00
Scott LaVarnway e09e99aa14 Merge "VPX: Add Hadamard32x32Test" 2018-07-20 10:48:56 +00:00
Jingning Han 03e1bd3972 Merge "Refactor transform calls in tpl model build" 2018-07-19 20:02:19 +00:00
Jerome Jiang 45cf384738 vp8: Fix memory address overflow in decoder.
Ref frame buffer is corrupted but it's not checked before it's used to
compute the reconstructed previous frame buffer.

BUG=webm:1496
Change-Id: Ief0e85b91b19576632685d17c8176c8d29158028
2018-07-19 11:34:40 -07:00
Scott LaVarnway d28965a4d5 VPX: Add Hadamard32x32Test
Change-Id: Idad619e963cb2f9bf8c62acac0e061639ec7e0b4
2018-07-19 07:47:58 -07:00
James Zern 3156a28f61 Merge "vpx_sum_squares_2d_i16_neon(): Make |s2| a uint64x1_t." 2018-07-19 05:47:27 +00:00
Jingning Han 985cf2142d Refactor transform calls in tpl model build
Support multiple transform block size. Prepare for more accurate
prediction search.

Change-Id: I845f5cf909ed2cba12cfc3627816cc4b37eddbe0
2018-07-18 21:05:50 -07:00
Tom Finegan 61a87b3628 Merge "shell tests: Drop incorrect uses of readonly." 2018-07-19 00:07:21 +00:00
Marco Paniconi fba011507b vp9: Screen-content after slide-change: increase refresh rate
For screen-content real-time CBR mode: on a detected slide change
that is encoded at max Q (to prevent excessive overshoot), increase
the perc_refresh in the cyclic refresh following the slide change.
Use counter to increase refresh up to some #frames from slide change.

This is an attempt to increase quality ramp-up after slide change without
causing too much excess overshoot.

Change-Id: Ie4ec4361082803a522f4a8794b3bb0178c9cf307
2018-07-18 14:53:24 -07:00
Jingning Han b1284dffdb Reland "Enable tpl model for speed 0"
This is a reland of 9c2c234a0b

Threaded mismatch has been addressed.

Original change's description:
> Enable tpl model for speed 0
>
> Enable adaptive Lagrangian multiplier for arf in speed 0, AQ mode 0,
> and low bit-depth settings. This improves the peak compression
> performance:
>
>           avg PSNR       overall PSNR       SSIM
> low       -0.462%         -0.535%          -0.358%
> mid       -0.780%         -0.857%          -0.868%
> hd        -0.914%         -1.017%          -0.471%
> 720p      -0.624%         -0.671%          -1.553%
> nflx2k    -0.764%         -0.784%          -0.908%
>
> The encoding time at speed 0 is slightly changed to be faster or
> slower:
>
> city_cif 1000 kbps
> 78.2 seconds -> 78.1 seconds
>
> bus_cif 1000 kbps
> 98.6 seconds -> 98.8 seconds.
>
> Change-Id: I18e7337bb61d985cbd3cf29e56439a6cdf675389

BUG=webm:1547

Change-Id: I025a21683ceed23d5f7147e200555b58b791315c
2018-07-17 18:20:46 +00:00
Jingning Han de302686b2 Merge "Fix 32-bit build for tpl model" 2018-07-17 18:10:28 +00:00
Jingning Han 81cd335bb8 Fix 32-bit build for tpl model
Clear system state to avoid encoding failure in 32-bit build.

BUG=webm:1547

Change-Id: Ia74c789d1993da09bc400baf24e971e19752e3c3
2018-07-17 09:13:26 -07:00
Marco Paniconi 2c45cd174a Merge "vp9: Force hybrid_intra on scene change" 2018-07-17 04:04:08 +00:00
Raphael Kubo da Costa bc30e6e39c vpx_sum_squares_2d_i16_neon(): Make |s2| a uint64x1_t.
This fixes the build with at least GCC 7.3, where it was previously failing
with:

sum_squares_neon.c: In function 'vpx_sum_squares_2d_i16_neon':
sum_squares_neon.c: note: use -flax-vector-conversions to permit conversions between vectors with differing element types or numbers of subparts
     s2 = vpaddl_u32(s1);
     ^~
sum_squares_neon.c: incompatible types when assigning to type 'int64x1_t' from type 'uint64x1_t'
     s2 = vpaddl_u32(s1);
        ^
sum_squares_neon.c: incompatible types when assigning to type 'int64x1_t' from type 'uint64x1_t'
     s2 = vadd_u64(vget_low_u64(s1), vget_high_u64(s1));
        ^
sum_squares_neon.c: incompatible type for argument 1 of 'vget_lane_u64'
   return vget_lane_u64(s2, 0);
                        ^~

The generated assembly was verified to remain identical with both GCC and
LLVM.

Bug: chromium:819249
Change-Id: I2778428ee1fee0a674d0d4910347c2a717de21ac
2018-07-17 03:52:11 +00:00
Jingning Han 2fd01596a5 Merge changes Iee11abf6,I8acbc718,Ia9a84311
* changes:
  Account for quantization effect in the tpl model
  Assign estimate qp for overlay frame
  Use the estimate qp to set motion search control
2018-07-17 03:49:47 +00:00
Marco Paniconi 3e09814a02 vp9: Force hybrid_intra on scene change
For real-time screen content mode: when scene/slide change
is detected and re-encode is decided, force hybrid_intra
mode search if slide change is big and a lot of intra modes
were used. hybrid_intra mode will use rd-based intra mode
search for small blocks.

Overall better PSNR on clip with slide changes, with similar
encoded frame size. Encode time is slightly higher on average with
this change.

Change-Id: I503835253b777b9f98d74e75a52a8000b76c310c
2018-07-16 20:02:13 -07:00
Jingning Han d3c316a3e4 Account for quantization effect in the tpl model
Account for the likely quantization effect in the temporal
dependency model.

Change-Id: Iee11abf651353098494e57cccf0ac26ce7535924
2018-07-16 14:33:42 -07:00
Jingning Han 5b6203aa59 Assign estimate qp for overlay frame
Assign the estimated qp for the overlay frame too. Cap the minimum
quantization parameter to be 1 to avoid lossless coding in the
temporal dependency model setup.

Change-Id: I8acbc7182045dbf3017b6712a119b18407b76ab0
2018-07-16 14:31:51 -07:00
Johann bc7c99e7ec Revert "Enable tpl model for speed 0"
This reverts commit 9c2c234a0b.

Causes multithreading test failures in 32 bit configurations.

BUG=webm:1547

Change-Id: Idb480b206a87b7cd6affbafffde8d8e1b6aee621
2018-07-16 12:28:16 -07:00
Paul Wilkins 2f7e0c32c4 Merge "Delete invalid assert." 2018-07-16 16:49:20 +00:00
Jingning Han 07507c830b Use the estimate qp to set motion search control
Set the multiplier for motion estimation using the estimate frame
quantization parameter in the temporal dependency model.

Change-Id: Ia9a843111c1504d7ae8b12113374831ee79c85b8
2018-07-13 16:50:53 -07:00
Jingning Han f130666dcf Set the estimate frame qp in tpl_frame
Assign the estimate frame quantization parameter in the tpl_frame
data structure.

Change-Id: I6149bdb1e15dbdae348f06ff61bf814004462232
2018-07-13 16:05:16 -07:00
Jingning Han b073af5925 Estimate the frame qp in a gop
Gather the available statistics to estimate the frame level
quantization parameter set in a group of pictures. This will be
called in the tpl model construction. No visible coding stats
change would occur.

Change-Id: Ic412e4afd9a60f1317a5f8eab6a4f6d5e48c4c07
2018-07-13 15:17:54 -07:00
Jingning Han 2718de54ca Merge "Enable tpl model for speed 0" 2018-07-13 18:59:25 +00:00
Jingning Han b2bedbda46 Merge "Refactor rc_pick_q_and_bounds_two_pass parameters" 2018-07-13 16:10:12 +00:00
Marco Paniconi 7ccf953b69 vp9: Enforce intra search on scene_change
For real-time non-rd pickmode: force check of
intra modes on INTER frames for scene changes.
Reduces artifacts on scene changes.

Change-Id: I5ae80869072db156791ace554c0a470f3785e9c6
2018-07-12 23:18:35 -07:00
Jingning Han ede9cbe617 Refactor rc_pick_q_and_bounds_two_pass parameters
Send the gf_group index as argument into the function. This
prepares later re-use of this function in the tpl model.

Change-Id: Id6203105629e687172c651a013d38c207b60ace7
2018-07-12 21:43:17 -07:00
Wan-Teh Chang f3904a9b96 Merge "Backport libaom bug fixes." 2018-07-13 01:40:35 +00:00
Wan-Teh Chang 26ed1ed4f0 Backport libaom bug fixes.
libaom commit 80a5b09337a80093e1e7ae5eb540020a22949805:
dec_free_mi: Reset cm->mi_alloc_size.

libaom commit fb0dd0bb80fc95ef016f1421b105a52fffa32816:
Clear cm->width and cm->height on alloc failure.

libaom commit ccb27264089a8cfa1334391ebbcb6a11b8dff442:
Misc. resize fixes along with the resize test
Note: only the change to enc_free_mi in av1/encoder/encoder.c
is merged.

Change-Id: I602813230d40125e59608fa013085dca3e160c33
2018-07-12 14:50:04 -07:00
Jingning Han 33df6acb2b Merge "Use regular filter type for tpl model motion compensation" 2018-07-12 17:30:05 +00:00
Jingning Han 73ecc9c5ff Merge "Clean up mc_flow_dispenser()" 2018-07-12 17:29:54 +00:00
Jingning Han 70dc980703 Merge "Add 32x32 Hadamard transform" 2018-07-12 17:29:43 +00:00
Jingning Han 9f28197d7c Merge "Relax multiplier adjustment limit" 2018-07-12 17:00:23 +00:00
Jingning Han abfa03ab93 Merge "Change the tpl model operating block size to 32x32" 2018-07-12 17:00:07 +00:00
Jingning Han 9c2c234a0b Enable tpl model for speed 0
Enable adaptive Lagrangian multiplier for arf in speed 0, AQ mode 0,
and low bit-depth settings. This improves the peak compression
performance:

          avg PSNR       overall PSNR       SSIM
low       -0.462%         -0.535%          -0.358%
mid       -0.780%         -0.857%          -0.868%
hd        -0.914%         -1.017%          -0.471%
720p      -0.624%         -0.671%          -1.553%
nflx2k    -0.764%         -0.784%          -0.908%

The encoding time at speed 0 is slightly changed to be faster or
slower:

city_cif 1000 kbps
78.2 seconds -> 78.1 seconds

bus_cif 1000 kbps
98.6 seconds -> 98.8 seconds.

Change-Id: I18e7337bb61d985cbd3cf29e56439a6cdf675389
2018-07-12 09:30:42 -07:00
Jingning Han 8dbf0dcc67 Use regular filter type for tpl model motion compensation
This slightly improves the compression performance by 0.05%.

Change-Id: Ice0b1f5e1f24a77008b093f7830e51fcd6cbfa8e
2018-07-12 08:50:31 -07:00
James Zern 829d1b2098 Merge changes Ibcc2f6fa,Id54818a8
* changes:
  test-data.sha1: update crbug-1539.rawfile
  test-data.mk: add missing crbug-1539.rawfile entry
2018-07-11 22:24:46 +00:00
Jerome Jiang 014e49a851 Merge "vp9 svc: Add test for intra-only for 1 SL." 2018-07-11 19:54:03 +00:00
James Zern 0d471fbfb1 Merge "decode_test_driver: break decompress loop on error" 2018-07-11 19:53:52 +00:00
James Zern c6209537ca test-data.sha1: update crbug-1539.rawfile
Use a valid frame rather than the one from the bug to avoid dealing with
trailing data. The decode would fail on x86 due to read size differences
in the entropy decoder.
The updated file was created from the first frame in:
vp90-2-02-size-08x08.webm

BUG=webm:1539

Change-Id: Ibcc2f6fa435bcf360a40fc9a202a8baba42b24da
2018-07-11 12:47:10 -07:00
James Zern 87ac7276a6 test-data.mk: add missing crbug-1539.rawfile entry
missed in:
d95d82b15 vpxdec,raw_read_frame: fix eof return

BUG=webm:1539

Change-Id: Id54818a838c0215457c3eb82f83bd4f3a791199b
2018-07-11 12:47:01 -07:00
Jerome Jiang 6dc668eed6 vp9 svc: Add test for intra-only for 1 SL.
In this case, verify that a key frame is inserted.

Change-Id: I70aa1974de956e657e413a34fd8bbcddf5d20c2c
2018-07-11 10:41:55 -07:00
Tom Finegan d587bca38d shell tests: Drop incorrect uses of readonly.
Change-Id: I0a01e1a7c04bbc026a1db0ba90d516548a1eaaed
2018-07-11 10:13:02 -07:00
Jingning Han db1222bf3c Clean up mc_flow_dispenser()
Remove unneeded statements.

Change-Id: Ic7a3079eb36e1ec6988390958565e13d5965b30d
2018-07-11 09:35:27 -07:00
Jingning Han 1d5380787a Add 32x32 Hadamard transform
Add 32x32 Hadamard transform in C implementation. Replace the
forward 32x32 2D-DCT in tpl model with Hadamard transform. This
would reduce the overhead encoding time due to running tpl model
by ~3x.

Change-Id: I1c743dab786b818d89f14928cc3998d056830aa9
2018-07-11 09:26:49 -07:00
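As a rough illustration of why the Hadamard transform in the commit above is cheaper than a forward DCT: it needs only additions and subtractions. The sketch below is a generic, unnormalized fast Walsh-Hadamard transform in Python, not the libvpx C code; the names `fwht`, `hadamard_2d` and `satd` are hypothetical, and the coefficient ordering and scaling used by the real 32x32 C function may differ.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform of a power-of-two length list."""
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly stage: only additions and subtractions, no multiplies.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def hadamard_2d(block):
    """Apply the 1-D transform to every row, then to every column."""
    rows = [fwht(row) for row in block]
    cols = [fwht(list(col)) for col in zip(*rows)]
    # Transpose back so the result is indexed as [row][column].
    return [list(row) for row in zip(*cols)]


def satd(block):
    """Sum of absolute transformed coefficients, a common cost proxy."""
    return sum(abs(c) for row in hadamard_2d(block) for c in row)
```

For a flat 32x32 block all the energy ends up in the single DC coefficient, which is why a cost built on the transformed coefficients is a cheap stand-in for a DCT-based one.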
Jingning Han 8bd4377ba6 Relax multiplier adjustment limit
Relax the Lagrangian multiplier adjustment limit from 1/4 to 1/2
fluctuation. This allows the temporal dependency model to take more
effect on changing the rate allocation across blocks.

Change-Id: Ida59ad628d35f196a1299d96e21bb684c20b0143
2018-07-11 09:23:14 -07:00
Jingning Han fdfec4c7be Change the tpl model operating block size to 32x32
Increase the temporal dependency model operating block size from
8x8 to 32x32.

Change-Id: I26b13493fe957d67c8646575370e651584b56ea5
2018-07-11 09:21:24 -07:00
James Zern 2c677a2afe decode_test_driver: break decompress loop on error
avoids duplicate errors should DecompressedFrameHook fail and a
potential endless loop should dec_iter fail to advance.

Change-Id: Ifb2673d02188a8aad75cda8bb960bb56fe70d218
2018-07-10 20:52:22 -07:00
Jingning Han f13a37cbf7 Fix the denominator in tpl model
The factor mc_dep_cost already includes the intra_cost addition. Hence
no need to add it again in the denominator.

Change-Id: I750ae86e1d3019b4a3aebd03dec8db362589619e
2018-07-10 20:45:23 -07:00
Jingning Han 3ef16b45e6 Enable tpl model only for ARFs
Currently only enable the temporal model for ARFs.

Change-Id: I6e7fd7bba54c3e0cf56147f049fc3ead85542d04
2018-07-10 20:45:23 -07:00
Jingning Han 006dc3a224 Properly set the is_valid flag in tpl_frame
Use this flag to indicate the temporal dependency model for the
given frame is properly set up.

Use the pointer address to decide if the tpl_stats_ptr array needs
to be released.

Change-Id: I541fe098f51981010011ae0af2535d8a5762d254
2018-07-10 19:58:43 -07:00
James Zern 9364fc04f1 Merge "vp9_encoder: only alloc tpl stats if enabled" 2018-07-10 21:49:02 +00:00
Marco Paniconi 6a07d918d8 Merge "vp9: Initialize source variance in nonrd-pickmode." 2018-07-10 18:12:46 +00:00
James Zern 726dbfee5d Merge "vpxdec,raw_read_frame: fix eof return" 2018-07-10 17:56:32 +00:00
Marco Paniconi 51f4f9257d vp9: Initialize source variance in nonrd-pickmode.
It is already initialized at superblock level, but since
it is computed per coding block, based on some speed features,
better to initialize it in pick_inter.

No change in behavior, as currently the speed features
that enable use of source_variance in pick_inter are fixed
at the frame-level.

Change-Id: Ic787ac2f389ba1bced98716096e7b5cffba856a7
2018-07-10 10:06:22 -07:00
James Zern a01896354d vp9_encoder: only alloc tpl stats if enabled
defer the allocation to post speed feature setup

Change-Id: I20713a2b1856fd5479c883d50772a2b54bcbb3bc
2018-07-09 22:56:54 -07:00
James Zern d95d82b15b vpxdec,raw_read_frame: fix eof return
fixes an endless loop caused by successful read return on eof.

since:
00a35aab7 vpx[dec|enc]: Extract IVF support from the apps.

BUG=webm:1539

Change-Id: I64dbb94189ea6a745d53a4bacc033f5f58eafb37
2018-07-09 22:32:19 -07:00
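The failure mode fixed in the commit above (a frame reader that reports success at end of file, so the caller never stops looping) can be sketched in Python. This is an illustration only, not the vpxdec code: the helper names are hypothetical, and the layout of a 4-byte little-endian size followed by the frame payload is an assumption made for the example.

```python
import io
import struct


def read_raw_frame(stream):
    """Return the next frame payload, or None on EOF or a truncated frame."""
    header = stream.read(4)
    if len(header) < 4:
        # EOF (or a short header) must be reported as failure; returning
        # "success" here is what lets a caller spin forever.
        return None
    (size,) = struct.unpack("<I", header)
    payload = stream.read(size)
    if len(payload) < size:
        return None
    return payload


def read_all_frames(stream):
    """Read frames until the reader signals that no more data is available."""
    frames = []
    while True:
        frame = read_raw_frame(stream)
        if frame is None:
            break
        frames.append(frame)
    return frames
```

The loop in `read_all_frames` terminates only because `read_raw_frame` distinguishes "no more data" from "frame read"; that distinction is the point of the sketch.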
Luca Barbato cdba956e1b Merge "[VSX] Add support to Power9-only vec_absd" 2018-07-09 20:13:44 +00:00
Marco Paniconi fc47d14892 vp9: Fix to the segment weight for cyclic refresh.
For screen-content mode with aq-mode=3: use the proper
segment weight (remove division by 2).

Change-Id: I747575062c644df7ead3fa41525fb6d6bac04f4d
2018-07-09 10:28:31 -07:00
Marco Paniconi de5a4fbb10 vp9-svc: Intra-only frame for spatial layers.
Use case is for layered (SVC) coding to allow higher
resolution layers to continue decoding with temporal references,
while the base spatial layer is an intra-only frame.

Made encoder changes to real-time path for encoding intra-only
frame. The intra-only frame will be followed by the overlay/copy
frame (with both packed in the same superframe).

Use existing control to enable intra_only frame.
Intra only is only applied to base spatial layer, and only
allowed under fixed/non-flexible SVC mode, and only for
1 < number_spatial_layers < 4.

Added svc datarate unittest for inserting intra_only frame
as sync frame. Added svc end to end tests to check mismatch.

Change-Id: I2f4f0106b2c4f51ce77aa2c1c6823ba83ff2f7a0
Signed-off-by: Marco Paniconi <marpan@google.com>
2018-07-09 09:36:35 -07:00
Paul Wilkins 8631eb4506 Delete invalid assert.
Delete assert that is not valid in all cases.

This can occur if the last group in a clip is a GF only
group. Here the frame count reflects the nominal
positioning of the "next" GF (were it to exist) one
frame beyond the end of the clip.

Change-Id: I0d36b83de0ab478dab032599ee7df7fff4a35cd5
2018-07-09 14:30:15 +01:00
Luca Barbato 73962af371 [VSX] Add support to Power9-only vec_absd
~5% gain for SAD.

Change-Id: Ief7d7691f837474f5b6b582129628276fdcce319
2018-07-08 16:03:47 +02:00
Zoe Liu 4745bc2ff3 Merge "Add hierarchical structure based ref frame update" 2018-07-06 21:47:28 +00:00
Sergey Silkin c94dacc231 vp9-svc: add more command line options to test app.
This adds the following command line options to
vp9_spatial_svc_encoder test app:
--drop-frame=<arg>        Temporal resampling threshold (buf %)
--tune-content=<arg>      Tune content type default, screen, film
--inter-layer-pred=<arg>  0 - 3: On, Off, Key-frames, Constrained

Change-Id: I653d1924fb6e525edb2d1e84739be0b88e773e1c
2018-07-04 08:03:54 +00:00
Sergey Silkin 2f8a126f0a Merge "vp9-svc: fix strings concatenation in test app." 2018-07-04 08:02:05 +00:00
Jingning Han 388755f99f Merge "Add enable-tpl-model guard" 2018-07-02 23:57:25 +00:00
Zoe Liu a6874985e2 Add hierarchical structure based ref frame update
Change-Id: I23559110bae8fa2328fe9bdb6672c7b1da84e17f
2018-07-02 14:53:12 -07:00
Marco Paniconi 03abd2c8f3 vp9: Adjust segment weight for cyclic refresh.
For screen-content: use the previous actual number of seg
blocks for the segment weight, used in the rate control
for setting frame-level Q.

Small overall increase in psnr on several screen-content clips.

Change-Id: Id414fb7f1b0ba578d464437d7f9c1783a0cad310
2018-07-02 13:08:53 -07:00
Jingning Han 9697a65804 Add enable-tpl-model guard
Skip operations that exercise the tpl model values if the model
is turned off.

Change-Id: I9ab3b56950f6b5a40ae4670a570885aaaadf8382
2018-07-02 12:34:21 -07:00
Jingning Han 1bb29e2455 Merge "Exploit the spatial variance in temporal dependency model" 2018-07-02 17:01:45 +00:00
Marco Paniconi 24b16ce7c9 vp9: Fix to screen content artifact for real-time.
Reset segment to base (segment#0) on spatially flat
stationary blocks (source_variance = 0). Also increase
dc_skip threshold for these blocks.

Reduces artifacts on flat areas in screen content mode.

Change-Id: I7ee0c80d37536db7896fa74a83f75799f1dcf73d
2018-07-01 19:28:48 -07:00
Jerome Jiang 5786379401 Merge "vp9: copy source on sync frame in denoiser." 2018-06-30 02:09:29 +00:00
Marco Paniconi 26792d95b3 Merge "vp9: Reset params for cyclic refresh on slide change" 2018-06-30 02:02:35 +00:00
Marco Paniconi 914d5abb5c Merge "vp9: Reduce quality artifact for real-time scene-content." 2018-06-30 00:04:39 +00:00
Marco Paniconi d9d93140c4 vp9: Reset params for cyclic refresh on slide change
Reset the last_coded_q_map and the sb->index in the cyclic_refresh
on a re-encode for slide change, so the refresh can start again
right after slide change.

Change-Id: I10cbc8354de8f7c2863b4212e6793b58a048b330
2018-06-29 16:01:19 -07:00
Hui Su 8e01faee05 Merge "Add partition breakout models for 720p resolution" 2018-06-29 22:59:51 +00:00
Jerome Jiang 884c4cae73 vp9: copy source on sync frame in denoiser.
Refresh all denoiser buffers on sync frame.

Add sync frame test with denoiser enabled.
Change-Id: I562a5ef5614b92a97565e6181a79eda51d9aeb99
2018-06-29 15:47:02 -07:00
Marco Paniconi e92a68c226 vp9: Reduce quality artifact for real-time scene-content.
Add scene detection flag to choose_partitioning to force split
of 64x64 block partition. This reduces artifacts on slide changes.

Bug:b/110978869

Change-Id: I9cc79a7c03f3aa2edeb28656b09a2177b72d59a8
2018-06-29 15:39:40 -07:00
Hui Su e51daf90fd Add partition breakout models for 720p resolution
Add partition search breakout models for 720p resolution,
currently enabled only for speed 0.

Compression performance change is neutral.
Tested encoding speed over 20 720p clips:

Speed gain(%)   QP=55   QP=45   QP=35
max             22.1    20.3    29.8
average         10.3     9.1    11.4

Change-Id: I07499728bbc5b80035fc66fad882ea556c8d07f2
2018-06-29 10:45:09 -07:00
Jingning Han 3df55cebb2 Exploit the spatial variance in temporal dependency model
Adapt the Lagrangian multiplier based on the spatial variance in
the temporal dependency model. The functionality is disabled by
default. To turn on, set enable_tpl_model to 1.

Change-Id: I1b50606d9e2c8eb9c790c49eacc12c00d3d7c211
2018-06-29 09:15:55 -07:00
Jingning Han a2d35b234f Refactor to use unified multiplier for partition search
Change-Id: I26ced25ff2e20ec414d5ecaa7d26f4a69175896c
2018-06-29 07:58:47 -07:00
Paul Wilkins 4400b0137e Merge "Enhanced partition experiment." 2018-06-29 14:16:52 +00:00
Sergey Silkin c5b8a93ce0 vp9-svc: fix strings concatenation in test app.
Change-Id: I292a1a5c19fd4f23b332e346d0ccac1a9c8455fc
2018-06-29 09:41:20 +02:00
Jingning Han 4d2ec89de0 Merge "Skip temporal dependency build when the speed feature is off" 2018-06-29 05:02:40 +00:00
Jingning Han d26b6b5b9a Merge "Avoid operation on INT64_MAX value" 2018-06-29 04:18:08 +00:00
James Zern f4e6bf6e05 Merge "libyuv: disable AVX512 in clang" 2018-06-29 04:03:20 +00:00
Marco Paniconi 32235a77b7 vp9-svc: Adjust threshold for early exit on golden
Use the avg_frame_low_motion to reduce/turn off this
early exit for higher motion content. Get some quality
back for higher motion clips and keep the same exit
thresh for low motion clips.

Change-Id: I95daf754dc0048b3e935d1a753f7f1101e6ffb77
2018-06-28 16:43:56 -07:00
Jingning Han 112652fb09 Skip temporal dependency build when the speed feature is off
Change-Id: I888761193882cc92720e0efaea5229a04a6ed67f
2018-06-28 15:26:27 -07:00
Johann bb286cd851 libyuv: disable AVX512 in clang
ARGBToRGB24Row_AVX512VBMI fails to compile on Mac:
row_gcc.cc: instruction requires: AVX-512 VBMI ISA AVX-512 VL ISA

BUG=libyuv:789

Change-Id: Ibd584e8c82e3ce86ec5460b4243f84f5dbdf4c81
2018-06-28 13:17:42 -07:00
Jingning Han 3e3ca7faa6 Avoid operation on INT64_MAX value
If the rate cost returns as INT_MAX, directly set the rdcost as
INT64_MAX.

Change-Id: I3ea1963aff10040dd9cef805beed9aebeedb93bc
2018-06-28 12:06:44 -07:00
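
The guard described in the commit above can be sketched roughly as follows. This is a minimal illustration and not the libvpx code: the function name `rd_cost_guarded`, the 8-bit rate shift, and the cost formula itself are assumptions, chosen only to show why a rate of INT_MAX is handled before any arithmetic takes place.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: combine a rate and a distortion into one RD cost.
 * The formula ((rate * rdmult) >> 8) + dist is an illustrative assumption. */
int64_t rd_cost_guarded(int rate, int64_t dist, int rdmult) {
  assert(rate >= 0 && rdmult >= 0 && dist >= 0);
  /* A rate of INT_MAX marks an invalid candidate: return the sentinel
   * directly instead of doing arithmetic on it. */
  if (rate == INT_MAX) return INT64_MAX;
  return (((int64_t)rate * rdmult) >> 8) + dist;
}
```

Returning the sentinel early keeps INT64_MAX from being fed into later additions, which is the kind of operation the commit title refers to.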
Paul Wilkins b99749b540 Enhanced partition experiment.
This patch relates to motion artifacts as described in Issue 73484098

The aim of this patch is to promote the use of smaller partition
sizes in places where some of the sub blocks have very low
spatial complexity and some have much higher complexity.
The patch can have a small impact on encode speed, but much
less than alternative approaches such as lowering the rd thresholds
that limit the partition search when distortion is low.

The patch also applies a similar sub block strategy for AQ1.

Metrics results for our standard sets over typical YT rates.
(Overall PSNR, SSIM, PSNR HVS) % -ve better.

Low Res -0.274, -0.303, -0.330
Mid Res 0.001, -0.128, -0.100
Hd Res -0.236, -0.371, -0.349
N 2K -0.663, -0.798, -0.708
N 4K -0.488, -0.588, -0.517

Change-Id: Ice1fc977c1d29fd5e401f9c7c8e8ff7a5f410717
2018-06-28 10:05:03 +01:00
Marco Paniconi ed4a7d5880 Merge "vp9-svc: Fix to early golden exit nonrd-pickmode" 2018-06-28 05:33:59 +00:00
Marco Paniconi 2916a49c87 Merge "vp9-svc: Set avg_frame_low_motion for lower layers." 2018-06-28 05:33:50 +00:00
Luca Barbato 0c46742f56 Merge "Support Power8/Power9 tuning" 2018-06-27 22:27:11 +00:00
Marco Paniconi 8fd78299b9 vp9-svc: Set avg_frame_low_motion for lower layers.
The avg_frame_low_motion metric is only computed on the
top spatial layer, and since it's part of the layer context
struct, it needs to be written to all lower spatial layers for
consistency.

Small/minor change in metrics.
Change-Id: I92a001c37aeb332e613212288b13a2ed9745af88
2018-06-27 13:44:34 -07:00
Marco Paniconi 2899a9d438 vp9-svc: Fix to early golden exit nonrd-pickmode
For SVC: apply the sse_zeromv early exit also to
the case where golden is second temporal reference.
Set the thresh_svc_golden threshold for this case.

This reduces the encode time for the case where golden
is the second temporal reference for SVC.
Change-Id: I8c0c87dd746579d3c4f5e983c7f9dd0a1e1476e0
2018-06-27 13:42:25 -07:00
Luc Trudeau 31a7f65cd6 Add Speed Tests to Trans32x32Test
Speed tests are disabled by default.

Change-Id: I49f8da3d3e1e4d9c72b17fc47c098284e7d84236
2018-06-27 20:34:39 +00:00
Luca Barbato 201acea7fe Support Power8/Power9 tuning
Change-Id: I50b32f37f77224ebf0470545152c83ae2ed3cfa3
2018-06-27 21:12:39 +02:00
Luca Barbato 3c20815a40 Merge "[VSX] Drop the clang-4 workaround for vec_xxpermdi" 2018-06-27 19:04:49 +00:00
Zoe Liu eca56ea29a Merge "Add reference frame update flags for hierarchical" 2018-06-27 17:09:32 +00:00
Luc Trudeau 244b16685f Merge changes Ic2183e8b,If906ec9b
* changes:
  [VSX] Replace vec_pack and vec_perm with single vec_perm
  VSX Version of fdct32x32_rd
2018-06-27 16:09:48 +00:00
Hui Su f7b368dc5a Merge "Turn on ML partition search breakout on speed 0" 2018-06-27 15:38:45 +00:00
Luc Trudeau b0adbb6c22 [VSX] Replace vec_pack and vec_perm with single vec_perm
vpx_quantize_b:
VP9QuantizeTest Speed Test (POWER8 Model 2.1)
32x32 Old VSX time = 8.1 ms, new VSX time = 7.9 ms

vp9_quantize_fp:
VP9QuantizeTest Speed Test (POWER8 Model 2.1)
32x32 Old VSX time = 6.5 ms, new VSX time = 6.2 ms

Change-Id: Ic2183e8bd721bb69eaeb4865b542b656255a0870
2018-06-27 14:32:14 +00:00
Luc Trudeau dc93b6298b VSX Version of fdct32x32_rd
Low bit depth version only. Passes the Trans32x32Test test suite.

Trans32x32Test Speed Test (POWER9 Model 2.2)
32x32 C time = 212.7 ms (±0.1 ms), VSX time = 82.3 ms (±0.0 ms) [2.6x]

Change-Id: If906ec9b56ce3818cae0cc462c7277284ab29859
2018-06-27 09:59:35 -04:00
Johann Koenig 95e2c8399c Merge "third_party/libyuv: update to a37e7bfe" 2018-06-27 13:10:19 +00:00
James Zern f708c15a30 Merge "BUG FIX: Initialize AverageTestBase members" 2018-06-27 01:06:17 +00:00
Scott LaVarnway afb1114d53 BUG FIX: Initialize AverageTestBase members
bit_depth_ was not initialized (used in FillRandom)
and caused valgrind errors.

BUG=webm:1542

Change-Id: I09a9acd54de0dfa4f9006304f45eb20883c9908c
2018-06-26 17:16:41 -07:00
Jerome Jiang 5c8351bdd6 Merge "vp9 svc: Add tests for sync on 2nd & 3rd spatial layers." 2018-06-26 23:46:00 +00:00
Jerome Jiang 2b9d94189c Merge "vp9 svc: Move CheckLayerRateTargeting into class." 2018-06-26 23:34:50 +00:00
Jerome Jiang 9fb1eec444 Merge "vp9 svc: Fix uninitialized data members in frame sync tests." 2018-06-26 22:53:57 +00:00
Jerome Jiang 3f785f712c vp9 svc: Move CheckLayerRateTargeting into class.
No need to pass arguments that are already members of the class.

Change-Id: I887d33d6037b561dee5dd8d49bb112d9120cd2a7
2018-06-26 15:38:30 -07:00
Jerome Jiang b6dde59e50 vp9 svc: Add tests for sync on 2nd & 3rd spatial layers.
Change-Id: I4d8b6d114d9a407f5bb879ab059a66425976f1df
2018-06-26 15:35:55 -07:00
Jerome Jiang 4777112764 vp9 svc: Fix uninitialized data members in frame sync tests.
BUG=webm:1542

Change-Id: If3e0b32a6832740b9af2f5c2d9418a6664297f57
2018-06-26 15:03:11 -07:00
Hui Su c398d5a770 Turn on ML partition search breakout on speed 0
Enable ML based partition search breakout on speed 0 when frame
resolution is less than 720p and bitdepth is 8.

Compression performance change is neutral.
Tested encoding speed over 20 480p sequences:

Speed gain(%)   QP=30   QP=40   QP=50   QP=60
max             14.4    18.6    17.8    24.4
average          4.6     9.0     8.0    13.2

Change-Id: Ia0d2947030ac774dc1533eb27ffc57f5b788a6ce
2018-06-26 10:20:10 -07:00
Marco Paniconi dd3d08f0c2 vp9: Add lower Q limt to cyclic refresh usage.
Disable the cyclic refresh for very low average Q.
This reduces encoded bitrate for static slides after the
quality has ramped up well enough (low Q). And as the
cyclic refresh is not needed at low Q in most cases, this
has minimal/no effect on quality on RTC set.

Change-Id: Id6d449aa2351bb6886d72aafb2d406e967ed2789
2018-06-25 14:16:32 -07:00
Scott LaVarnway a5d499e165 Merge "Add vpx_highbd_avg_8x8, vpx_highbd_avg_4x4" 2018-06-25 18:45:45 +00:00
Marco Paniconi 583859d739 Merge "vp9: Fixes for lossless mode for real-time mode." 2018-06-25 18:30:18 +00:00
Marco Paniconi 74c890b98b Merge "vp9-svc: Fix to frame dropping when layer is skipped." 2018-06-25 18:19:41 +00:00
Hui Su 4fc3dadb32 Merge "Add a partition search breakout model" 2018-06-25 17:58:23 +00:00
Marco Paniconi 60f9cf2920 vp9: Fixes for lossless mode for real-time mode.
Fixes to nonrd coding mode for lossless mode: keep
skip_txfm to 0 (no skip) and disable the encoder breakout.
This makes the encoding lossless when that mode is selected
for real-time (nonrd pickmode).

Also disable the cyclic refresh for lossless mode.

Change-Id: I20a11ef6df08accec472d26fabebd14d51f4d337
2018-06-25 09:56:10 -07:00
Marco Paniconi 3edef86c5f vp9-svc: Fix to frame dropping when layer is skipped.
Fix condition in frame dropper for SVC to handle the case
where a spatial layer's encoding is skipped (due to 0 bitrate).

Change-Id: I24185178774d73e8bb1c406acc0292422dfbe174
2018-06-25 09:11:23 -07:00
Hui Su bb226f61dd Add a partition search breakout model
for q-index between 100 and 150.

This only affects speed 1 and 2, resolution under 720p, q-index between
100 and 150, low bit-depth.

Compression performance change is neutral.
Encoding speed gain is up to 16% for speed 1;
                       up to  6% for speed 2.

Results from encoding city_4cif_30fps:
speed 1, QP=36
before:  37.964 dB, 45581b/f, 2.73 fps
after:   37.958 dB, 45510b/f, 3.16 fps

speed 1, QP=28
before:  39.297 dB, 82452b/f, 2.14 fps
after:   39.297 dB, 82310b/f, 2.25 fps

speed 2, QP=36
before:  37.903 dB, 45586b/f, 4.08 fps
after:   37.895 dB, 45492b/f, 4.34 fps

speed 2, QP=28
before:  39.224 dB, 82272b/f, 3.03 fps
after:   39.223 dB, 82152b/f, 3.17 fps

Change-Id: Ieaefedad902df80aa9699545fa06294601955803
2018-06-24 16:04:50 -07:00
Jerome Jiang bd6b274dc0 Merge "VP9 SVC: Add tests for layer sync on base layer." 2018-06-24 02:25:26 +00:00
Jerome Jiang e3061c7e61 VP9 SVC: Add tests for layer sync on base layer.
Create tests for sync layer. The purpose of new tests is not to check
bitrate targeting, thus they're put in a new file.

Create a base class for svc tests, which is also inherited by svc datarate
tests, to reduce code redundancy.

Start decoding in the test from the frame of layer sync.

Change-Id: I7226d208279ad785873dffef51e0a8abef23b256
2018-06-23 15:04:42 -07:00
Zoe Liu 1f9b095183 Add reference frame update flags for hierarchical
Previous CLs have implemented the construction of the hierarchical
structure at the encoder side. This CL defines and configures the
corresponding flags that will guide the reference frame update according
to the constructed hierarchical structure.

Change-Id: Iae55f2400f7c7beff41feff9308f87bfc70c7b21
2018-06-22 19:31:09 -07:00
Zoe Liu 4a5f0c0899 Merge "Add extra altref option for hierarchical structure" 2018-06-22 21:47:47 +00:00
Zoe Liu e86670a2af Add extra altref option for hierarchical structure
This CL is to hook up the implemented hierarchical structure
construction as well as its corresponding bitrate allocation
functionality with the defining of a GF group.

Currently the hierarchical structure is off by default. Hence this CL
has no impact on coding performance.

Change-Id: I9e1ddfd877559e99072c23970f7fe103b64ed9ee
2018-06-22 11:28:58 -07:00
Scott LaVarnway a3c2774126 Add vpx_highbd_avg_8x8, vpx_highbd_avg_4x4
BUG=webm:1537

Change-Id: I5f216f35436189b67d9f350991f41ed31431d4fe
2018-06-22 17:55:48 +00:00
Zoe Liu a4b53b2e0e Single out ref frame update functionality
This CL prepares for the introduction of hierarchical structure based
reference frame update.

Change-Id: Id00a6b721c97d24fc7f5499483b31762b3839a3e
2018-06-22 09:35:18 -07:00
Luca Barbato a665b23a9b Merge changes I51e7ed32,I99a9535b,Id584d8f6
* changes:
  ppc: add vp9_iht16x16_256_add_vsx
  ppc: add vp9_iht8x8_64_add_vsx
  ppc: add vp9_iht4x4_16_add_vsx
2018-06-22 08:13:54 +00:00
Jerome Jiang 0858824050 Merge "Add capibility to configure decoder in encode tests." 2018-06-22 04:32:46 +00:00
Jerome Jiang 48e1b2b97d Add capibility to configure decoder in encode tests.
This will allow us to test SVC features like Decode up to certain layers.

Change-Id: Icfb6f9d107108054cd0917197552e09ae48cbc52
2018-06-21 14:08:45 -07:00
Hui Su 4a42d0918e Merge "Add a partition search breakout model" 2018-06-21 19:18:01 +00:00
Zoe Liu 2be4b4981f Merge "Add bit allocation for hierarchical layer" 2018-06-21 16:31:36 +00:00
Johann 42c7213960 third_party/libyuv: update to a37e7bfe
Fix mingw builds for x86_32 by updating past:
https://chromium.googlesource.com/libyuv/libyuv/+/8fa02df3c0591754958a50

Pick up upstream fixes for clang 5 builds with --disable-optimizations.

Disable libyuv by default when building for msa. We have not been able
to update libyuv because of build issues with mips. This can be
revisited when we update the mips compiler used in Jenkins.

BUG=webm:1509,libyuv:793,webm:1514,webm:1518

Change-Id: Id0b9947cb5e0aa74f2f74746524ab6ff2d48796f
2018-06-21 06:23:44 -07:00
Jingning Han 770c68da83 Refactor block partition level rate distortion cost computation
Compute the rate distortion cost directly at the coding block level.

Change-Id: Ib3f8e1ac6b6ec68db4f96c037f567b19da7fb114
2018-06-20 17:32:45 -07:00
Zoe Liu f31b3154bd Add bit allocation for hierarchical layer
This CL migrates the bit allocation scheme from libaom and combines the
scheme for hierarchical layer with the updated scheme in libvpx that
uses a modified scheme to calculate the target bitrate per frame.

Change-Id: I63593ed528abd4a6a1a8681abf6c9cf06c7a2ee0
2018-06-20 16:14:57 -07:00
Johann Koenig 862d6f48c5 Merge "libyuv: remove problematic functions" 2018-06-20 22:53:19 +00:00
Jingning Han bf833ff771 Merge "Disable tpl model in high bd route" 2018-06-20 21:00:27 +00:00
Johann bbf2160c0b libyuv: remove problematic functions
These fail to build with clang on 32 bit with
--disable-optimizations

Upstream libyuv has addressed these and we will get updated
versions on the next roll. At the moment, we don't use
libyuv for copying alpha data and so this is a quick fix.

BUG=webm:1514

Change-Id: I0040c3ae048f8d896c2082deeb2e32070a32c453
2018-06-20 13:11:40 -07:00
Jingning Han 561e01e710 Disable tpl model in high bd route
Temporarily disable tpl dep model in the high bit-depth route to
prevent encoding failure.

Change-Id: Iebb3168a60b38dcc1273e25542530c4359dc679d
2018-06-20 12:29:46 -07:00
Hui Su 960582af76 Add a partition search breakout model
for q-index between 150 and 200.

Previously the ML based breakout feature was only supported for q-index
larger than 200.

This only affects speed 1 and 2, resolution under 720p, q-index between
150 and 200, low bit-depth.

Compression performance change is neutral.
Encoding speed gain is up to 30% for speed 1;
                       up to 20% for speed 2.

Results from encoding city_4cif_30fps:
speed 1, QP=38
before:  37.689 dB, 41007b/f, 2.91 fps
after:   37.687 dB, 40998b/f, 3.46 fps

speed 1, QP=48
before:  35.959 dB, 22106b/f, 3.66 fps
after:   35.950 dB, 22118b/f, 4.83 fps

speed 2, QP=38
before:  37.630 dB, 40999b/f, 4.42 fps
after:   37.633 dB, 41063b/f, 4.63 fps

speed 2, QP=48
before:  35.905 dB, 22177b/f, 4.90 fps
after:   35.889 dB, 22145b/f, 5.92 fps

Change-Id: Ibd4a2f4d7093fb248ab94ddd388cbaa8de2c5ef7
2018-06-20 09:57:52 -07:00
Marco Paniconi 69a6506a8f vp9-svc: Add support for spatial layer sync frames.
Add encoder control to allow application to insert
spatial layer sync frame. The sync frame disables
temporal prediction for that spatial layer.

This is useful for RTC application to have receiver
start decoding a higher spatial layer, without inserting
a key frame on base spatial layer.

If the layer sync is requested on the base spatial layer
this then forces a key frame, otherwise it only disables
the temporal reference for that spatial layer, allowing
temporal prediction to continue for the other layers.

Although the temporal prediction is disabled and reset
on a layer sync frame, the inter-layer prediction for the
sync frame is enabled on INTER frames. So the meaning of
INTER_LAYER_PRED_OFF_NONKEY is modified to mean disable
inter-layer prediction on non-key and non-sync frames.

Added unittest for inserting layer sync frames.

Bump up ABI version.
Change-Id: Id458acc400a77c853551f125c4e7b6d001991f03
2018-06-20 09:53:05 -07:00
Jingning Han 729d7d6a2f Merge "Refactor partition mode cost calculation" 2018-06-20 04:10:38 +00:00
Zoe Liu e7294a6404 Merge "Add hierarchical structure in GF group" 2018-06-20 01:05:54 +00:00
Zoe Liu 022590427c Add hierarchical structure in GF group
Change-Id: I06fc4b0ad5a45c49e10a9601a2356fbc6e93d6da
2018-06-19 12:08:55 -07:00
Jingning Han 19d31c70a9 Merge "Build temporal prediction dependency propagation" 2018-06-19 03:36:32 +00:00
Jerome Jiang a3ccca3e0c Merge "vp9: Enable cyclic refresh for HBD in real-time." 2018-06-19 02:37:32 +00:00
Hui Su 649d372788 Merge "Improve the partition search breakout speed feature" 2018-06-19 00:04:02 +00:00
Jerome Jiang 59c41b7814 vp9: Enable cyclic refresh for HBD in real-time.
Keep denoiser and skin detection disabled since some key functions don't
work with >8 bits source.

Add test for HBD with denoiser and cyclic refresh enabled to make sure
nothing crashes.

BUG=webm:1534

Change-Id: Id61fe1e38ed1768f273870a6bdd5f163aa769fe4
2018-06-18 16:01:14 -07:00
Jingning Han 1251d43564 Build temporal prediction dependency propagation
This commit builds up the temporal prediction dependency propagation
within the group of pictures.

Change-Id: Id04cfc0323e6a5c4ac4a570d53e20d1229b3ee11
2018-06-18 12:48:59 -07:00
Jingning Han d99ba0399e Refactor partition mode cost calculation
Compute the coding block partition mode cost as additional rdcost
to the cumulative rate-distortion cost from each coding block. This
changes the coding performance slightly due to the rounding error.
The compression performance change is neutral.

Change-Id: Ibdccae0e79263a0e70af7592a8cb11458d795f8d
2018-06-18 12:01:13 -07:00
Hui Su 238cf66eb5 Improve the partition search breakout speed feature
Use a linear model to make partition search breakout decisions.
Currently the model is tuned for large quantizers and small resolutions.
So it is only used when q-index is larger than 200 and frame
width/height is smaller than 720. Also it's not yet supported for high
bit depth.

Tested speed 1 and 2 on lowres and midres. Compression performance is
neutral. At low bitrates, encoding speedup is up to 50% for speed 1;
up to 30% for speed 2.
Some sample numbers:

into_tree_480p, speed 1
QP=60 before:  35.228 dB, 3488b/f, 7.78 fps
      now:     35.217 dB, 3475b/f, 11.57 fps
QP=50 before:  37.492 dB, 7983b/f, 6.24 fps
      now:     37.491 dB, 7974b/f, 7.55 fps

PartyScene_832x480_50, speed 1
QP=60 before:  30.104 dB, 22426b/f, 3.28 fps
      now:     30.109 dB, 22410b/f, 4.43 fps
QP=50 before:  33.016 dB, 46984b/f, 2.78 fps
      now:     33.018 dB, 46998b/f, 3.35 fps

into_tree_480p, speed 2
QP=60 before:  35.175 dB, 3506b/f, 10.96 fps
      now:     35.185 dB, 3510b/f, 13.47 fps
QP=50 before:  37.448 dB, 8016b/f, 9.04 fps
      now:     37.459 dB, 8048b/f, 9.81 fps

PartyScene_832x480_50, speed 2
QP=60 before:  30.060 dB, 22537b/f, 4.42 fps
      now:     30.061 dB, 22541b/f, 5.38 fps
QP=50 before:  32.923 dB, 47134b/f, 3.85 fps
      now:     32.920 dB, 47073b/f, 4.31 fps

Change-Id: I674cba4f027c4c65f7837d5ec9179d6201e6ba86
2018-06-18 10:56:15 -07:00
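
The gating conditions and the linear decision described in the commit above might look roughly like the sketch below. It is not the libvpx implementation: both function names, the feature and weight arrays, the bias, and the threshold are hypothetical, and reading "width/height is smaller than 720" as min(width, height) < 720 is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Gating conditions taken from the commit message. Treating the
 * resolution test as min(width, height) < 720 is an assumption. */
int linear_breakout_enabled(int q_index, int width, int height,
                            int high_bitdepth) {
  const int min_dim = width < height ? width : height;
  return q_index > 200 && min_dim < 720 && !high_bitdepth;
}

/* Hypothetical linear model: score = bias + sum(weights[i] * features[i]).
 * Returns 1 when the score exceeds the threshold, meaning the partition
 * search may stop at the current block size. */
int linear_breakout(const float *features, const float *weights, size_t n,
                    float bias, float threshold) {
  float score = bias;
  size_t i;
  assert(features != NULL && weights != NULL);
  for (i = 0; i < n; ++i) score += weights[i] * features[i];
  return score > threshold;
}
```

A caller would check `linear_breakout_enabled` once per frame and evaluate `linear_breakout` per block only when it returns 1.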
Jingning Han 54dfdf28c7 Merge "Enable intra prediction search for tpl model" 2018-06-18 16:28:21 +00:00
Jingning Han 719168554e Merge "Enable motion compensated prediction for tpl dependency model" 2018-06-18 16:28:13 +00:00
Jingning Han dc9a7f1578 Merge "Remove unneeded buffer restore calls" 2018-06-18 16:27:48 +00:00
Luc Trudeau e769aeee80 include msvc.h for snprintf support in benchmarks
include vpx_ports/msvc.h to avoid snprintf issues with MSVC.

Change-Id: Ida09cff8ee3b84e09fd61de131f84b32c113fa1a
2018-06-18 15:18:43 +00:00
Zoe Liu 3c4bfc6f2c Add update types for hierarchical refs
Change-Id: I0cd91187e1efc1441086772e5683fbf72d9371cf
2018-06-17 05:42:31 -07:00
Jingning Han 4cc5f30a7d Enable intra prediction search for tpl model
Support intra prediction mode search to find the best intra mode
cost for temporal dependency model building.

Change-Id: Ie62d6af8d0c9f65dee742876f3af9cdd5e3f1d63
2018-06-16 08:16:54 -07:00
Jerome Jiang 8648a64c83 Merge "VP9 HBD: Fix integer overflow problem in variance calc." 2018-06-16 00:12:44 +00:00
Jingning Han 52416c3e17 Remove unneeded buffer restore calls
Change-Id: I89c8ad6544e0cee60b5daf49bc18c7e31f08faa2
2018-06-15 17:07:07 -07:00
Jingning Han 6e37645b50 Enable motion compensated prediction for tpl dependency model
Support the motion compensated prediction search to find the motion
trajectory and hence to build the temporal dependency model.

Change-Id: I861ea85a0d4cc2897cb0dfe2e95378bf7d36209f
2018-06-15 16:35:21 -07:00
Jerome Jiang 4ec3f4d83f Merge "vp9 svc: add tests for inter layer prediction." 2018-06-15 23:33:15 +00:00
Jerome Jiang e28bc78204 VP9 HBD: Fix integer overflow problem in variance calc.
BUG=webm:1534
Change-Id: I535ac48e3dd2454cc7088c4f9a1e08ea74107da6
2018-06-15 16:03:00 -07:00
Tom Finegan e93b32d7c6 Merge "Clean up avx512 compiler support test." 2018-06-15 22:25:25 +00:00
Luca Barbato 4e26b2ea09 [VSX] Drop the clang-4 workaround for vec_xxpermdi
clang-6 seems to support it out of box.

E.g. VP9SubtractBlockTest.DISABLED_Speed with the workaround:
[    BENCH ]      4x4  286.5 ms ( ±0.2 ms )
Without:
[    BENCH ]      4x4  215.2 ms ( ±0.9 ms )

Change-Id: I28b3a2cc93c0d72f52f5a48cc06d8ed4ef26913f
2018-06-15 21:54:47 +00:00
Jingning Han 7538e4cc88 Merge changes I3436302c,I8969f5c3
* changes:
  Prepare motion estimation process for temporal dependency model
  Construct temporal dependency building system
2018-06-15 18:05:58 +00:00
Tom Finegan ab71db65a5 Clean up avx512 compiler support test.
Moves the check into a function, check_gcc_avx512_compiles,
that behaves somewhat similarly to check_gcc_machine_options.

Change-Id: I2bef3ddd98e636eef12d9d5e548c43282fac7826
2018-06-15 10:58:11 -07:00
Jingning Han dda0611008 Prepare motion estimation process for temporal dependency model
Set up needed stack for the motion estimation process to build up
the temporal dependency model.

Change-Id: I3436302c916a686e8c82572ffc106bf8023404b6
2018-06-14 20:31:26 -07:00
Jingning Han d94f1c84cc Construct temporal dependency building system
Schedule the frame processing to construct temporal dependency
statistics within a group of pictures. Align the corresponding
reference frames.

Change-Id: I8969f5c335a4a5c2614f4530b636fe13a25a8a98
2018-06-14 20:30:33 -07:00
Jingning Han 0e6ab498a1 Merge "Allocate tpl_dep_stats frame buffer" 2018-06-15 01:33:41 +00:00
Zoe Liu 492a1935bf Merge "Separate GF structure defining from bit allocation" 2018-06-15 00:00:13 +00:00
Jingning Han 239ccadd5a Merge "Add temporal model control as a speed feature" 2018-06-14 23:06:14 +00:00
Jerome Jiang 0f2b71edfc vp9 svc: add tests for inter layer prediction.
Change-Id: Ic8e07b07790e067c014677cf33c3b016fcf4cb39
2018-06-14 15:40:28 -07:00
Zoe Liu 2459ce8881 Separate GF structure defining from bit allocation
This CL separates the defining of the GF group structure from the
handling of its bitrate allocation. The encoder performance should stay
unchanged.

Change-Id: Ib77967757702bb4b284034e429d4c41ae86d0838
2018-06-14 14:35:05 -07:00
Jingning Han 775706c453 Allocate tpl_dep_stats frame buffer
Allocate buffers to support gathering temporal dependency statistics
at the encoder.

Change-Id: I97d4594913a2423e8a916f20caf82ab0f5836961
2018-06-14 12:14:16 -07:00
Jingning Han 5ef063ed27 Add temporal model control as a speed feature
The model construction would incur 15% slowdown for speed 2. The
speed change on speed 0 is unnoticeable.

The current speed features set up would DISABLE temporal dependency
model for all speed settings.

Change-Id: Ic45dd962f3a54a8f5f0452502dc05e352dc09ca1
2018-06-14 11:30:28 -07:00
Jingning Han ea78306257 Add data structure for frame dependent mode decision
Add block and frame level data structures to support frame
dependent mode decision.

Change-Id: I996fc84155fcba8e2ec2a114bb0799d6aa5539dd
2018-06-14 09:50:56 -07:00
Alexandra Hájková 0652a3f76c ppc: add vp9_iht16x16_256_add_vsx
Change-Id: I51e7ed32d8d87c25ee126e8b4f8fc616d0327584
2018-06-14 16:39:10 +00:00
Zoe Liu 7a1ac4712b Merge "Unify frame_index in defining GF group structure" 2018-06-14 16:18:51 +00:00
Luc Trudeau 25e8514bde Merge changes I51776f0e,I843f3b34
* changes:
  [VSX] Optimize PROCESS16 macro
  VSX Version of SAD8xN
2018-06-14 16:13:02 +00:00
Jerome Jiang ed5b3db6c5 Merge "vp8: remove assertion in tree coder." 2018-06-14 06:56:40 +00:00
Luc Trudeau f9dc411d89 [VSX] Optimize PROCESS16 macro
The PROCESS16 macro now uses 8-bit lanes instead of 16-bit lanes.

SADTest Speed Test (POWER8 Model 2.1)
16x8  Old VSX time = 16.7 ms, new VSX time = 9.1 ms [1.8x]
16x16 Old VSX time = 15.7 ms, new VSX time = 7.9 ms [2.0x]
16x32 Old VSX time = 14.4 ms, new VSX time = 7.2 ms [2.0x]
32x16 Old VSX time = 14.0 ms, new VSX time = 7.4 ms [1.9x]
32x32 Old VSX time = 13.4 ms, new VSX time = 6.5 ms [2.0x]
32x64 Old VSX time = 12.7 ms, new VSX time = 6.3 ms [2.0x]
64x32 Old VSX time = 12.6 ms, new VSX time = 6.3 ms [2.0x]
64x64 Old VSX time = 12.7 ms, new VSX time = 6.2 ms [2.0x]

Change-Id: I51776f0e428162e78edde8eac47f30ffd2379873
2018-06-14 01:57:05 +00:00
Zoe Liu a20d8c3607 Unify frame_index in defining GF group structure
The following are completed in defining GF group structure in firstpass:
1. Remove redundant alt_frame_index;
2. Replace hard coded index value with the variable frame_index.

Change-Id: I7b56e454559bbf704afc7410ea9832b20ffcd57e
2018-06-13 17:49:09 -07:00
Jerome Jiang 343352b556 vp8: remove assertion in tree coder.
Cast the counter to uint64_t in case it overflows.

The assert was to prevent c[0] * Pfac from overflowing beyond unsigned int
since Pfac could be 2^8. Thus c[0] needs to be smaller than 2^24.

In VP9, the assert was removed and c[0] was cast to uint64_t.

Bug: 805277
Change-Id: Ic46a3c5b4af2f267de4e32c1518b64e8d6e9d856
2018-06-13 17:03:01 -07:00
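
The overflow discussed in the commit above can be illustrated with a small sketch. It is not the actual tree coder: the function name `branch_prob`, the rounding, the clamp to [1, 255], and the even split returned for empty counts are assumptions. The point it shows is the widening cast: with a 32-bit product, c0 = 2^24 and Pfac = 2^8 would wrap to 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: probability of branch 0, scaled by pfac and
 * clamped to [1, 255], given the two branch counts c0 and c1. */
unsigned int branch_prob(unsigned int c0, unsigned int c1,
                         unsigned int pfac) {
  const uint64_t tot = (uint64_t)c0 + c1;
  uint64_t p;
  assert(pfac > 0);
  if (tot == 0) return 128; /* no observations: assume an even split */
  /* Widen before multiplying so c0 * pfac cannot wrap in 32 bits. */
  p = ((uint64_t)c0 * pfac + (tot >> 1)) / tot;
  if (p < 1) return 1;
  if (p > 255) return 255;
  return (unsigned int)p;
}
```

With the cast in place no assertion on the size of c0 is needed, since the largest possible product, (2^32 - 1) * 2^8, fits comfortably in 64 bits.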
Luc Trudeau e3ce12cfc1 VSX Version of SAD8xN
VSX versions of the SAD functions of width 8.

SADTest Speed Test (POWER8 Model 2.1)
8x4  C time = 68.7 ms (±0.3 ms), VSX time = 31.8 ms (±0.1 ms) [2.2x]
8x8  C time = 55.6 ms (±0.3 ms), VSX time = 18.3 ms (±0.1 ms) [3.0x]
8x16 C time = 46.5 ms (±0.1 ms), VSX time = 15.6 ms (±0.1 ms) [3.0x]

Change-Id: I843f3b34e103b72deeade4a939193d8b53cee460
2018-06-13 19:21:06 +00:00
Luc Trudeau f950248b9b Add Speed Tests for the SADTest test suite.
Speed tests are added for the SADTest test suite. These tests use the
AbstractBench and print the median run time of SAD operations. Speed
tests are disabled by default.

Change-Id: I5d0957248f9b5b307ae2d757d5f8d4761a1dd712
2018-06-13 17:50:58 +00:00
Tom Finegan d998721f71 Merge "Fix avx512 related MSVC build failure." 2018-06-13 17:50:39 +00:00
Jingning Han 6585fc7536 Merge "Remove duplicate vp9_twopass_postencode_update def" 2018-06-13 17:32:12 +00:00
Tom Finegan 68a9b143d0 Fix avx512 related MSVC build failure.
Check GCC specific AVX512 flags only when GCC is enabled.

Change-Id: I15dc2a0dbf8bce37f4364fedfd34a0a34882104b
2018-06-13 09:39:28 -07:00
Jingning Han a7d1afee7e Remove duplicate vp9_twopass_postencode_update def
Change-Id: I370f37c85a02c032a8ba266b9b9445ee38eb0756
2018-06-12 13:46:43 -07:00
Marco Paniconi 37a0283b18 Merge "vp9 svc: Denoise golden when it's a temporal ref." 2018-06-12 05:40:32 +00:00
Jerome Jiang 88fa7efb1c vp9 svc: Denoise golden when it's a temporal ref.
When golden was the inter-layer reference, a block that selected the golden ref
would not be denoised.
But when golden is used as a second temporal reference then we should denoise
blocks that select the golden reference.
This change allows for that.

Change-Id: Ifdea2ac88f6a74f73520fedcd7fec2f32c559ec9
2018-06-11 15:26:44 -07:00
Luc Trudeau 74a0b04f57 VSX Version of vp9_quantize_fp_32x32
Low bit depth version only. Passes the VP9QuantizeTest test suite.

VP9QuantizeTest Speed Test (POWER8 Model 2.1)
32x32 C time = 93.1 ms (±0.4 ms), VSX time = 6.5 ms (±0.2 ms) [14.4x]

Change-Id: I7f1fd0fc987af86baf2b74147a25aee811289112
2018-06-11 19:18:22 +00:00
Luc Trudeau b1434f3125 VSX Version of vp9_quantize_fp
Low bit depth version only. Passes the VP9QuantizeTest test suite.

VP9QuantizeTest Speed Test (POWER8 Model 2.1)
 4x4  C time = 86.3 ms (±0.7 ms), VSX time = 18.2 ms (±0.0 ms) [ 4.7x]
 8x8  C time = 57.7 ms (±0.3 ms), VSX time =  7.6 ms (±0.0 ms) [ 7.6x]
16x16 C time = 50.7 ms (±0.1 ms), VSX time =  4.9 ms (±0.0 ms) [10.3x]

Change-Id: Ic09bc786c57cc89bba14624064216b52996075eb
2018-06-11 19:18:01 +00:00
Hui Su 55a2abfe31 Merge "Small speedup of ml_pruning_partition()" 2018-06-11 18:07:18 +00:00
Marco Paniconi 8fd4b15434 Merge "vp9-svc: Fix to frames_since_golden update for SVC." 2018-06-11 16:54:43 +00:00
Jerome Jiang 8d49d02a24 Merge "vp9 svc: clean up first_spatial_layer_to_encode." 2018-06-11 16:43:44 +00:00
Marco Paniconi 7d97790438 vp9-svc: Fix to frames_since_golden update for SVC.
When the second (gf) temporal reference is used in SVC:
the reference is refreshed on base TL superframes, and so
the rc->frames_since_golden counter was also only updated on
base TL frames. But this was disabling the golden reference
from being used as a temporal reference for TL > 0 frames
(since frames_since_golden was 0/not updated on TL > 0 frames).

Fix is to copy the update of rc->frames_since_golden to all
upper temporal layers. This allows TL > 0 frames to test the
golden inter mode.

Gain on RTC set: ~2%, ~8% on desktop_vga clip.
Encode time increase ~5-8% on linux, 3SL-3TL run with 1 thread.

For now keep this off for TL > 0 frames in speed features, so
this change does not change current behavior for speed >= 7.

Change-Id: I405708f3f80039ae47bd64ec53e66f92160acd9e
2018-06-11 08:57:20 -07:00
Jerome Jiang 5f8e24161e vp9 svc: clean up first_spatial_layer_to_encode.
Change-Id: I3c9aefd3ea5028797b9105d7e49b1cb2f762a9fc
2018-06-08 14:33:05 -07:00
Hui Su 693c9a70f0 Small speedup of ml_pruning_partition()
Terminate early and skip neural net model when linear score is already
high enough, which indicates that we should not skip split and
rectangular partitions.

No changes on compression; encoding speed improves slightly.

Change-Id: I4e0995090200eb4889344da905d2f7048673af5f
2018-06-08 13:42:57 -07:00
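
The early termination described in the commit above can be sketched as a two-stage check. Everything named here is hypothetical (`prune_partitions`, `nn_score_fn`, `toy_nn`, both thresholds, and the convention that a low neural net score means "prune"); the sketch only shows the control flow in which a high linear score skips the more expensive model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature for the more expensive neural net model. */
typedef float (*nn_score_fn)(const float *features, size_t n);

/* Counts how often the toy model below is evaluated. */
int nn_calls = 0;

/* Toy stand-in for the neural net: sums the features. */
float toy_nn(const float *features, size_t n) {
  float sum = 0.0f;
  size_t i;
  ++nn_calls;
  for (i = 0; i < n; ++i) sum += features[i];
  return sum;
}

/* Returns 1 if split and rectangular partitions may be skipped, else 0.
 * Thresholds and score conventions are illustrative assumptions. */
int prune_partitions(float linear_score, float linear_thresh,
                     nn_score_fn nn, const float *features, size_t n,
                     float nn_thresh) {
  assert(nn != NULL);
  /* Early exit: a high linear score already says "keep searching",
   * so the neural net is never evaluated. */
  if (linear_score >= linear_thresh) return 0;
  return nn(features, n) < nn_thresh;
}
```

Because the early exit only fires when the answer is already "do not prune", the sketch is consistent with the commit's note that compression is unchanged while encoding gets slightly faster.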
James Zern 3ac2b57015 Merge "vp9_subtract_test,cosmetics: fix class order, casts" 2018-06-08 17:44:43 +00:00
Jingning Han 18fa715c89 Merge "Localize variable scope in vp9_rc_get_second_pass_params()" 2018-06-08 17:18:33 +00:00
Tom Finegan 9a9ff2d804 Merge "Add avx512 compile test." 2018-06-08 15:11:13 +00:00
James Zern a75bb97402 vp9_subtract_test,cosmetics: fix class order, casts
+ remove obsolete FIXME

Change-Id: I97ceb94b0e7860167e9c8cc6900bec8d155f0e8f
2018-06-07 23:19:50 -07:00
James Zern 9e4dc99e4b Merge changes I89ce12b6,Id91b52d6,Icd7d4453
* changes:
  Implement subtract_block for VSX
  Cast bsize as int to print a meaninful debug info
  Speed test for subtract_block
2018-06-08 06:15:24 +00:00
Marco Paniconi 2626b1545e vp9-svc: Adjust some logic on gf temporal reference.
For the feature of using second temporal reference (when
inter-layer is off): move the buffer_idx assignment and
refresh flag settings further down to vp9_rc_get_svc_params(),
since is_key_frame is set there for every frame/layer.
Otherwise it was using the setting from the previous frame/layer.

This makes the refresh more consistent for both layers for
2 spatial layers case.

Small/negligible change in metrics.

Change-Id: I88279243bc27898448e8891dba38143d936cf6d5
2018-06-07 20:43:39 -07:00
Luca Barbato d468fd90e0 Implement subtract_block for VSX
~2x speedup or better.

[ RUN      ] C/VP9SubtractBlockTest.Speed/0
[    BENCH ]      4x4  365.1 ms ( ±2.2 ms )
[    BENCH ]      8x4  258.5 ms ( ±0.3 ms )
[    BENCH ]      4x8  202.7 ms ( ±0.2 ms )
[    BENCH ]      8x8  162.2 ms ( ±0.5 ms )
[    BENCH ]     16x8  138.8 ms ( ±0.3 ms )
[    BENCH ]     8x16  121.5 ms ( ±0.4 ms )
[    BENCH ]    16x16  110.2 ms ( ±0.5 ms )
[    BENCH ]    32x16  104.8 ms ( ±0.1 ms )
[    BENCH ]    16x32  32.7 ms ( ±0.1 ms )
[    BENCH ]    32x32  30.0 ms ( ±0.0 ms )
[    BENCH ]    64x32  28.7 ms ( ±0.0 ms )
[    BENCH ]    32x64  20.1 ms ( ±0.0 ms )
[    BENCH ]    64x64  19.3 ms ( ±0.0 ms )

[ RUN      ] VSX/VP9SubtractBlockTest.Speed/0
[    BENCH ]      4x4  155.3 ms ( ±0.9 ms )
[    BENCH ]      8x4  99.3 ms ( ±0.4 ms )
[    BENCH ]      4x8  77.2 ms ( ±0.1 ms )
[    BENCH ]      8x8  45.7 ms ( ±0.0 ms )
[    BENCH ]     16x8  34.1 ms ( ±0.0 ms )
[    BENCH ]     8x16  29.5 ms ( ±0.0 ms )
[    BENCH ]    16x16  19.9 ms ( ±0.0 ms )
[    BENCH ]    32x16  15.1 ms ( ±0.0 ms )
[    BENCH ]    16x32  16.7 ms ( ±0.0 ms )
[    BENCH ]    32x32  14.1 ms ( ±0.0 ms )
[    BENCH ]    64x32  12.6 ms ( ±0.0 ms )
[    BENCH ]    32x64  12.0 ms ( ±0.0 ms )
[    BENCH ]    64x64  11.2 ms ( ±0.0 ms )

Change-Id: I89ce12b6475871dc9e8fde84d0b6fe5c420c28c7
2018-06-08 05:26:05 +02:00
Luca Barbato 034f94c127 Cast bsize as int to print a meaninful debug info
cout helpfully decides to print the bsize value as non-printable char
otherwise.

Change-Id: Id91b52d6475ae9f869365468d1d56d94b2e10ecb
2018-06-08 05:25:49 +02:00
Luca Barbato 0284e38293 Speed test for subtract_block
Change-Id: Icd7d4453f0ee699635a2a1d484d24cba71d748de
2018-06-08 05:25:44 +02:00
Tom Finegan 393f785658 Add avx512 compile test.
Some compiler releases allow the -mavx512f arg without actually
implementing support. Test for this situation, and disable avx512
when it is detected by configure.

BUG=webm:1536

Change-Id: I63952153bb4b24aa9f25267ed47a0fe845d61f8b
2018-06-07 16:05:48 -07:00
Jerome Jiang e32f0bf239 vp9 svc: add control to set using second temporal ref.
Bump up ABI version.

Change-Id: I4498d7ea4ed72994c5f847aa98e75b0150dd7f82
2018-06-07 14:09:50 -07:00
Marco Paniconi 76cc69f884 vp9-svc: Allow second temporal reference for next highest layer.
When inter-layer prediction is disabled on INTER frames, allow
for next highest resolution to have second temporal reference.
Current code allowed for only top/highest spatial layer.

Change-Id: I102137273e3e4d57512a13d95e8ccb9c5b0a7b4b
2018-06-07 12:26:42 -07:00
Marco Paniconi 7bb10a9720 Merge "VP9 SVC: Write out svc src for all spatial layers." 2018-06-07 15:57:42 +00:00
Marco Paniconi 1ff7438463 vp9-svc: Modify choose_partitioning for second temporal ref
For mode where second temporal reference is used in SVC: allow
for using/testing this reference (golden ref) in the variance
partition scheme (choose_partitioning).

Small positive gain (~0.25%) on metrics for 3 layer SVC,
negligible change in speed.

Change-Id: I29b8315da530e60db3d6c90faa8fb178d9f2de26
2018-06-06 22:47:58 -07:00
Jerome Jiang 7724525085 Merge "VP9: fix unsigned integer overflow in decoder." 2018-06-07 00:02:48 +00:00
Jerome Jiang 22dcecd3d8 VP9 SVC: Write out svc src for all spatial layers.
Change-Id: Ie78676e4df75f3f870ee2de0c87a8167b7ec68e0
2018-06-06 16:14:31 -07:00
Marco Paniconi 4640c2f583 vp9-svc: Enable use of second temporal reference for SVC.
When inter-layer is disabled on INTER frames, this will allow
use of a second (longer term) temporal reference for SVC.

Only enabled on highest resolution spatial layer.

Average gains of ~4% on RTC set, speed decrease of about ~2%.

Change-Id: I3c2d415653c448eb7269c828e120fe8bb2ef3f97
2018-06-06 15:19:49 -07:00
Marco Paniconi 49785d3d02 vp9-svc: Add a buffer_idx is_used parameter for SVC.
For the case where a second (long term) temporal reference is
used in the SVC: this additional parameter is to make sure the
buffer slot selected for this reference is available for usage,
i.e., it is never used for any of the 3 references set for the
fixed SVC patterns.

And some code cleanup (replace cpi->svc).

No change in behavior.

Change-Id: Icba46edfbbefb94d5ea8e2d5c24cccd85a406ee6
2018-06-06 13:17:29 -07:00
Jingning Han 19ef822683 Localize variable scope in vp9_rc_get_second_pass_params()
Remove unnecessary definitions.

Change-Id: Ie540aaed5f3ed3768eff4e6563455666aef9c9e8
2018-06-06 12:52:45 -07:00
Jerome Jiang 84a9e8eb9a VP9: fix unsigned integer overflow in decoder.
The difference of two size_t variables.
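
For illustration only: subtracting a larger size_t from a smaller one wraps around instead of going negative. The sketch below shows one common guard. The function names are hypothetical and this is not necessarily the change made in the decoder.

```cpp
#include <cassert>
#include <cstddef>

// Naive computation: wraps to a huge value when used > total, because
// size_t arithmetic is performed modulo 2^N.
std::size_t RemainingUnchecked(std::size_t total, std::size_t used) {
  return total - used;
}

// Checked variant: reports failure instead of wrapping.
bool RemainingChecked(std::size_t total, std::size_t used, std::size_t* out) {
  assert(out != nullptr);
  if (used > total) return false;  // the subtraction would wrap
  *out = total - used;
  return true;
}
```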

Change-Id: I73f35cdafc2ba64a9ddaf855cc6a410cfb63b8da
2018-06-06 10:46:42 -07:00
Jerome Jiang 87386826a9 vp9: Move up reset of cyclic refresh under dynamic resize.
When resize happens and cyclic refresh is not applied on the
current (resized) frame, the sb_index is not reset and then
might be out of boundary on future frames when the
cyclic refresh is applied.
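
A simplified model of the problem, under stated assumptions: the refresh cursor is a plain superblock index, superblocks are 64x64, and the reset is done on every resize whether or not refresh runs on that frame. CyclicRefreshState, NumSuperblocks, AdvanceRefresh and OnResize are hypothetical names, not the encoder's.

```cpp
#include <cassert>

// Hypothetical stand-in for the encoder's cyclic refresh state.
struct CyclicRefreshState {
  int sb_index = 0;  // next superblock to refresh
};

// Number of 64x64 superblocks needed to cover a frame.
int NumSuperblocks(int width, int height) {
  const int cols = (width + 63) / 64;
  const int rows = (height + 63) / 64;
  return cols * rows;
}

// Advances the cursor by `count` superblocks, wrapping at the frame end.
// The cursor must already be inside the current frame.
void AdvanceRefresh(CyclicRefreshState* cr, int num_sbs, int count) {
  assert(cr->sb_index >= 0 && cr->sb_index < num_sbs);
  cr->sb_index = (cr->sb_index + count) % num_sbs;
}

// Called on every resize, even when refresh is skipped on the resized frame,
// so a cursor left over from the larger frame cannot exceed the new frame.
void OnResize(CyclicRefreshState* cr) { cr->sb_index = 0; }
```

Going from 1280x720 (240 superblocks) to 640x360 (60 superblocks) with the cursor at 200 would leave it out of range unless OnResize is called first.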

Change-Id: I05282fc4bc2323522d60e019ed0790d69221a2f7
2018-06-05 22:31:16 -07:00
James Zern d5bd2809f1 Merge changes I3ba75c45,I97d26285
* changes:
  force-inline the convolve functions
  Unbreak the force inline directive for gcc
2018-06-06 04:54:41 +00:00
Luca Barbato a405bc2ec9 force-inline the convolve functions
Change-Id: I3ba75c459ed7c9591b7892e9f8f108146c04507d
2018-06-05 20:21:07 -07:00
Marco Paniconi 0826a475b0 Merge "vp9-svc: Allow usage of second (long term) temporal reference." 2018-06-05 03:24:52 +00:00
James Zern 3a0dc0e4b7 test,cosmetics: fix func/member naming, decl order
functions: upper camelcase
members: lowercase with trailing '_'
decl order: functions (overrides marked virtual), members

after:
656e8ac61 VSX version of vpx_post_proc_down_and_across_mb_row
766d875b9 VSX version of vpx_mbpost_proc_ip
35e98a70b VSX version of vpx_mbpost_proc_down
b2898a9ad Bench Class For More Robust Speed Tests

Change-Id: Ib257bd607c5c1248d30e619ec9e8a47cc629825b
2018-06-04 16:14:33 -07:00
Jerome Jiang f058688eaa vp9-svc: Allow usage of second (long term) temporal reference.
Allow for second temporal reference for top spatial layer in SVC,
when inter-layer prediction is disabled on INTER frames.
The second temporal reference is labelled as the golden reference
and the update/refresh of this reference buffer is only on base
temporal layer superframes. For now the period of refresh is
fixed at every 20 TL0 superframes.

Average gain is ~4% on RTC set, several clips up
by ~8-12%. Speed loss is about ~2% on mac.

Feature is disabled as default for now.

Change-Id: I2e5db5052c62dbe958a3b14be97d043823b7a529
2018-06-04 13:54:04 -07:00
Luca Barbato 3dffc634c0 Unbreak the force inline directive for gcc
Change-Id: I97d26285ec146628cbafd3573ca812c630c6687d
2018-06-03 17:00:51 +02:00
Jerome Jiang 3f7e6cc020 Merge "VP9: exclude speed 9 from VBR datarate tests." 2018-06-01 23:04:35 +00:00
Jerome Jiang 76342f7a68 VP9: exclude speed 9 from VBR datarate tests.
Change-Id: I4c4d31d013cb45e20918f4ef83ce32811d76e02b
2018-06-01 15:13:28 -07:00
Jerome Jiang 19222548a1 Merge "VP9: Allow for bilinear subpel interp at speed 9 for high motion." 2018-05-31 23:00:32 +00:00
Marco Paniconi 71e13f7661 Merge "vp9-svc: Fix to some frame metrics for real-time mode." 2018-05-31 22:22:57 +00:00
Jerome Jiang 3b4425c2ca VP9: Allow for bilinear subpel interp at speed 9 for high motion.
Fixed some settings in nonrd pick mode to allow for frame-level bilinear
to be set.
On Galaxy S8+ it has 4% speed up on high motion clips. Almost the same
for low motion.

0.17% quality loss on RTC.

Change-Id: I044a7de020183754ba08bb6c96c5a78ba5c7fea2
2018-05-31 15:13:34 -07:00
Hui Su 0a72b06665 Merge "Improve the ML based partition pruning" 2018-05-31 21:04:40 +00:00
Marco Paniconi bfe85191dd vp9-svc: Fix to some frame metrics for real-time mode.
Add condition of LAST frame to the consec_zeromv and
avg_frame_low_motion metrics. This is needed for SVC as
the golden reference is a spatial reference and should
not be included in the metric computation.

Small/negligible change in metrics on RTC set.

Change-Id: I6ea16298fae566bb288c34cf50d120b509146eee
2018-05-31 11:34:51 -07:00
Luc Trudeau 656e8ac61e VSX version of vpx_post_proc_down_and_across_mb_row
Low bit depth version only. Passes the VpxPostProcDownAndAcrossMbRowTest

VpxMbPostProcAcrossIpTest Speed Test (POWER8 Model 2.1)
C time = 121.3 ms (±4.0 ms), VSX time = 9.4 ms (±0.3 ms) [12.9x]

Change-Id: I28300779e197ea3855cf30867d17a2805388b447
2018-05-31 13:13:06 -04:00
Hui Su efc195cbb9 Improve the ML based partition pruning
Add a neural net model that uses the same features as the existing
linear model. Make the pruning decision based on both the linear
and the neural net model. It provides more accurate predictions,
and may improve compression and/or encoding speed.

This only affects speed 0.

Coding gain:
0.37% on midres
0.34% on hdres
0.50% on jvet8b720p

Encoding speed impact(average over locally tested 20 clips from midres
and hdres):
QP=20: down by 2.5%.
QP=30: down by 3.9%.
QP=40: down by 4.5%.
QP=50:   up by 5.2%.

Change-Id: I402ec799745ad3b74abf0789fa5e124fe64e704d
2018-05-31 16:03:00 +00:00
Marco Paniconi 4caabd362d Merge "vp9-svc: Fix to compute some metrics on top spatial layer." 2018-05-31 05:44:05 +00:00
Alexandra Hájková 4997a29c86 ppc: add vp9_iht8x8_64_add_vsx
Change-Id: I99a9535bf1ae58c494113fc88d9616bda202716a
2018-05-31 05:09:42 +00:00
Alexandra Hájková a6a57507fb ppc: add vp9_iht4x4_16_add_vsx
Change-Id: Id584d8f65fdda51b8680f41424074b4b0c979622
2018-05-31 04:57:06 +00:00
James Zern 57de90d24b Merge "libs.mk: expose libvpx.{ver,syms} in all configs" 2018-05-31 04:01:02 +00:00
Marco Paniconi e2726cd02d vp9-svc: Fix to compute some metrics on top spatial layer.
The avg_frame_low_motion and consec_zeromv are frame-level
metrics that are updated on every frame. For SVC these should be
updated on top spatial layer (full resolution).

Small/negligible change in metrics.

Change-Id: Ibe14f05be3b82daa9dd60378097ff11a27f1b95e
2018-05-30 20:15:03 -07:00
Marco Paniconi d7a80012a0 vp9: Refactor code for q adjustment in CBR mode.
Move the adjustment code to separate function.
Change-Id: I876b246a5c26095f262bb9a19f03d1f17077225d
2018-05-30 14:58:48 -07:00
James Zern 7f7782bebd Merge "Revert 3 slide show coding changes" 2018-05-30 21:25:55 +00:00
James Zern 39fa0777e3 Revert 3 slide show coding changes
This is a combination of the following 3 reverts. The changes cause
issues on certain hardware devices. We'll pull them for now to allow for
further investigation.

Revert "Experiment regarding playback problems on Bravia TVs."

This reverts commit 624f8105f5.

Revert "Improved slide show coding"

This reverts commit f4091bc30e.

Revert "Improved coding on slide show content."

This reverts commit 2fa333c2ae.

BUG=b/77492144

Change-Id: Ifba937792d644a9286307262f050216408e8ecf4
2018-05-30 19:50:17 +00:00
Jim Bankoski eb012d74f8 tiny_ssim: fix for odd image sizes.
Change-Id: I7dd1e37c5de3efccc07fcdc877653d4873a88266
2018-05-30 19:00:36 +00:00
Marco Paniconi 5850cf8817 Merge "vp9-svc: Add frame dropper control to sample encoder." 2018-05-30 03:24:57 +00:00
Marco Paniconi e1921294f9 vp9-svc: Add frame dropper control to sample encoder.
Disabled as default as enc_cfg.rc_dropframe_thresh is
set to 0 as default.

Change-Id: Ia888aa16b1a86a716ec33ea041e8b16b19bf93be
2018-05-29 19:31:23 -07:00
Marco Paniconi 3c9a10261f Merge "vp9: Adjust cyclic refresh and limit frame-level q." 2018-05-30 02:25:17 +00:00
James Zern 121a312bdc libs.mk: expose libvpx.{ver,syms} in all configs
this allows the targets to be used explicitly in builds configured with
--enable-external-build

Change-Id: Id7db309a39a73cfd8f15f74430b17b317c0a847f
2018-05-29 18:21:01 -07:00
Marco Paniconi 1d29325361 vp9: Adjust cyclic refresh and limit frame-level q.
For CBR mode with aq-mode=3: reduce delta-q for second
segment and limit how much the frame-level q can decrease
from one frame to the next.

Reduces bitrate spikes in slide/screen content.

Change-Id: Id9ac4b7270f07e09690380755cfbef4aec5c26dc
2018-05-29 15:19:49 -07:00
Luc Trudeau 766d875b9d VSX version of vpx_mbpost_proc_ip
Low bit depth version only. Passes the VpxMbPostProcAcrossIpTest.

VpxMbPostProcAcrossIpTest Speed Test (POWER8 Model 2.1)
C time = 188.5ms (±0.2ms), VSX time = 65.2ms (±0.1ms) [2.9x]

Change-Id: I1cf72365d94a9d7f1e9323925a87a30e3bd5cfe2
2018-05-29 13:32:52 +00:00
Luc Trudeau 35e98a70ba VSX version of vpx_mbpost_proc_down
Low bit depth version only. Passes the VpxMbPostProcDownTest.

VpxMbPostProcDownTest Speed Test (POWER8 Model 2.1)
Full calculations:
C time = 195.4 ms, VSX time = 33.7 ms (5.8x)

Change-Id: If1aca7c135de036a1ab7923c0d1e6733bfe27ef7
2018-05-29 13:29:00 +00:00
Luc Trudeau b2898a9ade Bench Class For More Robust Speed Tests
To make speed testing more robust, the AbstractBench runs the
desired code multiple times and reports the median run time with
mean absolute deviation around the median.

To use the AbstractBench, simply add it as a parent to your test
class, and implement the run() method (with the code you want to
benchmark).
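
A rough sketch of the statistics described above, not the AbstractBench code itself. BenchStats, Median, Summarize and RunBench are hypothetical names, and averaging the two middle samples for an even run count is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

struct BenchStats {
  double median_ms;  // median run time
  double mad_ms;     // mean absolute deviation around the median
};

// Median of the samples; for an even count the two middle values are
// averaged (an assumption, the real class may choose differently).
double Median(std::vector<double> v) {
  assert(!v.empty());
  std::sort(v.begin(), v.end());
  const std::size_t n = v.size();
  return (n % 2 == 1) ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

// Computes the median and the mean of |sample - median| over all samples.
BenchStats Summarize(const std::vector<double>& samples) {
  const double med = Median(samples);
  double sum = 0.0;
  for (double s : samples) sum += std::fabs(s - med);
  return {med, sum / static_cast<double>(samples.size())};
}

// Times `body` once per run and summarizes the wall-clock times in ms.
BenchStats RunBench(const std::function<void()>& body, int runs) {
  assert(runs > 0);
  std::vector<double> ms;
  for (int i = 0; i < runs; ++i) {
    const auto t0 = std::chrono::steady_clock::now();
    body();
    const auto t1 = std::chrono::steady_clock::now();
    ms.push_back(std::chrono::duration<double, std::milli>(t1 - t0).count());
  }
  return Summarize(ms);
}
```

The median is used rather than the mean so that a single slow outlier run does not shift the reported time; for samples {1, 2, 3, 4, 100} the median stays at 3.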

Sample output for VP9QuantizeTest

  [    BENCH ]      Bypass calculations       4x4  165.8 ms ( ±1.0 ms )
  [    BENCH ]        Full calculations       4x4  165.8 ms ( ±0.9 ms )
  [    BENCH ]      Bypass calculations       8x8  129.7 ms ( ±0.9 ms )
  [    BENCH ]        Full calculations       8x8  130.3 ms ( ±1.4 ms )
  [    BENCH ]      Bypass calculations     16x16  110.3 ms ( ±1.4 ms )
  [    BENCH ]        Full calculations     16x16  110.1 ms ( ±0.9 ms )

Change-Id: I1dd649754cb8c4c621eee2728198ea6a555f38b3
2018-05-29 13:04:47 +00:00
Marco Paniconi 2b08f89076 vp9-realtime: Move frame dropper to after scene detection.
Move frame dropper to after scene detection and noise estimation.
Scene detection and noise estimation operate on source data and
update metrics along sequence, so they should be moved before
the frame dropper.

Also we don't want to drop on scene change, as the scene detection
and (possible) re-encode step will be missed.

Change-Id: I3d9e16d785bd5ace6707db2abce77ddc110bfef4
2018-05-28 22:21:40 -07:00
Marco Paniconi 6b4ac9b253 vp9-svc: Fix to allowed value of max_consec_drop.
For the max_consec_drop parameter in svc frame drop:
since passing value 0 in the control would completely
disable the dropper, only allow for values >= 1 to be set.
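
One way to express the restriction above, assuming out-of-range values are clamped rather than rejected. SanitizeMaxConsecDrop is a hypothetical helper, not the control handler itself.

```cpp
#include <algorithm>

// A value of 0 would disable the dropper entirely, so the smallest value
// that can be stored is 1. Clamping (instead of rejecting) is an assumption.
int SanitizeMaxConsecDrop(int requested) {
  return std::max(1, requested);
}
```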

Change-Id: I6b74ec9cc08a638fa571d6246a021dab9c811d14
2018-05-28 21:13:10 -07:00
Jerome Jiang ed217fe372 Merge "VP8: Fix use-after-free in postproc." 2018-05-25 17:48:28 +00:00
Jerome Jiang 52add58966 VP8: Fix use-after-free in postproc.
The pointer in vp8 postproc refers to show_frame_mi which is only
updated on show frame. However, when there is a no-show frame which also
changes the size (thus new frame buffers allocated), show_frame_mi is
not updated with new frame buffer memory.

Change the pointer in postproc to mi which is always updated.

Bug: 842265
Change-Id: I33874f2112b39f74562cba528432b5f239e6a7bd
2018-05-25 10:46:26 -07:00
Marco Paniconi 36825590ba Merge "VP9: Fix issues with high bitdepth in real-time." 2018-05-25 15:50:19 +00:00
Jerome Jiang 8502a95a0d VP9: Fix issues with high bitdepth in real-time.
Disable denoiser, skin detection and aq-mode for high bitdepth for now.

BUG=webm:1534

Change-Id: I361a4e20b2319041148af497bf7043bfd5c5f589
2018-05-24 23:36:20 -07:00
Marco Paniconi fc1c5d1c9c vp9-svc: Add max_consec_drop to SVC frame drop.
For any spatial layer, limits the number of consecutive frame drops.

Change-Id: I692d90363f329f571f2b59e12cc680ad2e76065d
2018-05-24 15:19:47 -07:00
Marco Paniconi 8446af7e9a vp9: Rate control adjustments for screen content.
For screen content mode: changes to reduce occurrence of
significant QP decrease (from one frame to next),
which can cause large frames (overshoot/delay).
-cap the buffer increase to optimal level for frame drop
mode where full superframe can drop
-reduce the max_adjustment_down due to buffer overflow
-reduce qp threshold to trigger re-encode on large frame

Change-Id: I3e30e4814192b5f728abff3f7359eb64f561b8f0
2018-05-23 19:35:16 -07:00
Paul Wilkins 276cafb7a7 Merge "Experiment regarding playback problems on Bravia TVs." 2018-05-23 11:03:12 +00:00
Marco Paniconi b2004fdda6 vp9-svc: Add full superframe drop mode.
This will check for dropping full superframe if any
spatial layer is overshooting.

Change-Id: Ic656807028ebef5552301b6d10399fbe3a6c890c
2018-05-22 21:36:08 -07:00
Marco Paniconi fe8c07172f Merge "vp9-svc: Small code cleanup in nonrd-pickmode." 2018-05-22 02:19:06 +00:00
Marco Paniconi 71efeffe62 vp9-svc: Small code cleanup in nonrd-pickmode.
Rename a flag to indicate it is for the inter_layer reference.

Change-Id: Ib198d3df95fb912259efde854613592c724b7c49
2018-05-21 15:40:13 -07:00
James Zern b2f12c6359 remove unused vpx_ports/config.h
references were earlier removed in:
1a7d25a48 Replace vpx_ports/config.h with vpx_config.h

Change-Id: I1824cd71e970f5c7550c3978e0c63ce36a9644e4
2018-05-21 14:26:08 -07:00
Niveditha Rau 11a868d104 Add Solaris to supported platforms
Change-Id: Ib49e1d79ba4c1c5d5147ab437f744a31429a059c
2018-05-21 16:25:23 +00:00
Johann Koenig c26c07328f Merge "configure,ios: add missing c++11 checks" 2018-05-21 16:24:13 +00:00
Marco Paniconi e27a331778 Merge "vp9-svc: Fix on disabling inter_layer prediction." 2018-05-21 01:10:57 +00:00
Marco Paniconi e447a1a334 vp9-svc: Fix on disabling inter_layer prediction.
In vp9_svc_constrain_inter_layer_pred() we disable the
inter_layer prediction if anything but only the previous
spatial layer (from same superframe) is used for inter_layer
prediction. This check and disabling was only allowed when
the control VP9E_SET_SVC_INTER_LAYER_PRED is set to
INTER_LAYER_PRED_ON_CONSTRAINED.

But the control VP9E_SET_SVC_INTER_LAYER_PRED is needed for setting:
INTER_LAYER_PRED_ON/INTER_LAYER_PRED_OFF/INTER_LAYER_PRED_OFF_NONKEY.
So there is a conflict with setting INTER_LAYER_PRED_ON_CONSTRAINED.

Fix for now is to always allow for this disabling check
(disable inter_layer reference if it's not previous spatial layer) as
long as inter_layer prediction is used (i.e., not set to _OFF).

A separate fix if needed may be to invoke another control for setting
INTER_LAYER_PRED_ON_CONSTRAINED.

This was causing an issue with enabling spatial layers on the fly
(say spatial layer 2), where since INTER_LAYER_PRED_ON_CONSTRAINED was
not set (default), the inter_layer prediction was then using a reference
from 2 spatial layers below (spatial layer 0).

Change-Id: Ic6434000665f63aab27c509b5eb7b8fc965827bc
2018-05-19 19:11:42 -07:00
Marco Paniconi a2d5a4a956 vp9-svc: Fix issue with resetting lst_fb_idx.
When encoding a given spatial layer and the same spatial layer
on previous superframe was dropped (or disabled due to 0 bitrate),
the lst_fb_idx for current layer is set to the buffer index that
was last updated on TL0 frame (for the same spatial layer).

This condition was to maintain proper temporal prediction pattern
under frame drops, and it should only apply to INTER frames.

But the condition was causing an assert to be triggered on spatial
layers whose base are key frames. Fix is to condition this reset of
lst_fb_idx on the "is_key_frame" flag. Also initialize the
fb_idx_upd_tl0 to -1 and only use it for a given spatial layer
if it's been set.

These issues can happen when superframe drop happens just before
a key frame, or when stream starts with lower layers and dynamically
enabled higher spatial layers.

Added datarate unittest that inserts key frame after superframe drop,
and verified that this fix is needed for test to pass.
Also modified the existing DisableEnable spatial layer test to trigger
the issue of using fb_idx_upd_tl0 when it hasn't been set for a
spatial layer.

Change-Id: I059d1135736aca17e1326b9b4a2b16371eb4634e
2018-05-19 18:38:37 -07:00
James Zern 48a3df87c0 configure,ios: add missing c++11 checks
+ bump ios minimum to 7.0; 6.0 does not have full c++11 support

Change-Id: If838b036e7327fda514cd2e8156eeda122cf6c73
2018-05-19 00:31:40 -07:00
Johann e5b858740a Revert "Revert "update libwebm""
This reverts commit d32a55ffc4.

Use the correct 'check_add_cxxflags' invocation.

Change-Id: I97d8062c9218b81a24268ec5998e847b1a0efeda
2018-05-18 13:08:02 -07:00
Johann 1a0047994f iosbuild.sh: portable sed usage
There is no convenient way to have both gnu and bsd
sed do in-place processing.

Change-Id: I95f2a378d5c1bd95debb446317cc18ad79835e49
2018-05-18 11:32:59 -07:00
Paul Wilkins 624f8105f5 Experiment regarding playback problems on Bravia TVs.
This patch experimentally reduces the maximum GF interval for
static content such as slide shows.

It does not fully revert the previous slide show patches as this
still allows the codec to code static sections only using GFs
groups rather than ARF groups or a mix of ARF and GF groups.
However, the maximum group length is reduced.

Change-Id: Ia968b608efb9a67d2402b12e979695d58ddc1ad7
2018-05-18 17:36:30 +01:00
Marco Paniconi d7b1404878 vp9-svc: Skip find_predictors based on ref_frame_flags.
Has some effect for SVC on base spatial layers (which only
reference LAST) or on upper spatial layers when inter_layer
prediction is disabled.

Small speedup on Mac of ~1%, for 3 layer SVC with inter-layer
prediction disabled.

Change-Id: I05be5da8843e0d32e9d85f6eb951cf1894e781d8
2018-05-17 09:10:01 -07:00
Marco Paniconi 35908928c5 vp9-svc: Small code cleanup in nonrd-pickmode.
Change-Id: I0bc9a555064f053a00c1ab9a4dd2557ccf5537d8
2018-05-16 12:16:20 -07:00
Marco Paniconi bf7ee0524c vp9-svc: Enable scene detection and re-encode for SVC.
Keep a lower rate threshold for video case.
Also lower the existing threshold somewhat for screen-content mode.

Change-Id: I79649a36678d802fd4d4080754fd366e78904214
2018-05-16 09:49:48 -07:00
Jingning Han 27a6f53979 Merge "Use the updated best rd cost for transform block search" 2018-05-16 14:51:09 +00:00
Jingning Han bbca74d412 Use the updated best rd cost for transform block search
The compression performance change is +/-0.01% for both speed 0/1.
Locally tested the encoding speed:

ped_1080p 150 frames speed 0
79544 b/f 41.339 dB 503072 ms ->
79566 b/f 41.338 dB 493009 ms.

speed 1
79789 b/f 41.152 dB 104583 ms ->
79770 b/f 41.153 dB 102607 ms

Change-Id: Ief200b613608643e5708cebe979982eb4a84831b
2018-05-15 14:44:04 -07:00
Marco Paniconi d99abe9a9a Merge "vp9: Some speed feature settings for speed 9." 2018-05-15 17:45:29 +00:00
Luca Barbato 689ac41331 Merge "Add vpx_varianceNxM_vsx and vpx_mseNxM_vsx" 2018-05-15 17:01:54 +00:00
Marco Paniconi 25b7b6e20f vp9: Some speed feature settings for speed 9.
Disable 8x8 blocks for higher resolutions,
reduce mv_thresh for 1/2 subpel motion, and
disable golden reference at superblock level
based on source sad and motion content.

~6% loss in RTC metrics over current speed 9.
Speedup about ~10% for high motion clip on linux.

Change-Id: I7ff8f81ac93ee8a90d5a1f4837c955d000bd75e7
2018-05-15 09:58:53 -07:00
Luca Barbato d8c36c9480 Add vpx_varianceNxM_vsx and vpx_mseNxM_vsx
Speedups:

64x64   5.9
64x32   6.2
32x64   5.8
32x32   6.2
32x16   5.1
16x32   3.3
16x16   2.6
16x8    2.6
8x16    2.4
8x8     2.3
8x4     2.1
4x8     1.6
4x4     1.6

Change-Id: Idfaab96c03d3d1f487301cf398da0dd47a34e887
2018-05-15 18:04:10 +02:00
Johann Koenig 8266c45b01 Merge "Revert "update libwebm"" 2018-05-15 14:35:49 +00:00
Johann Koenig d32a55ffc4 Revert "update libwebm"
This reverts commit 595edb9669.

Incorrect check_add_cxxflags invocation prevented libwebm from
building. Correcting it causes build failures on jenkins and mac.

Original change's description:
> update libwebm
> 
> Clears "auto_ptr deprecated" warnings when building with
> clang v6.0.0
> 
> Requires C++11 support.
> 
> Change-Id: I5ea2744e73deeaa4e7b2599bacf0b6c9cf355a54

TBR=jzern@google.com,johannkoenig@google.com,builds@webmproject.org

Change-Id: I7340d912a121de035997cbd8ad77a150ee38189a
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2018-05-15 14:35:28 +00:00
Yaowu Xu d665731ed0 Merge "Make a config time flag" 2018-05-15 01:14:46 +00:00
Marco Paniconi c73907a09b Merge "vp9-realtime: Enable alt_ref at speed 5, for live." 2018-05-14 22:44:52 +00:00
Luc Trudeau e6859b04ac Merge "VSX version of vpx_quantize_b_32x32_vsx" 2018-05-14 21:52:30 +00:00
Yaowu Xu fd554ec714 Make a config time flag
This commit replaces a hard coded macro with a macro defined by
a configure command.

Change-Id: Ib31354d61865314ed43e2c429c72b4ef2c8fa2a7
2018-05-14 14:32:34 -07:00
Yaowu Xu e51c9e39bc Merge "Fixes for consistent encoding across recodes of a frame" 2018-05-14 20:56:34 +00:00
Johann Koenig 5222a928b0 Merge "update libwebm" 2018-05-14 20:22:12 +00:00
Luc Trudeau d1aede92ec VSX version of vpx_quantize_b_32x32_vsx
Low bit depth version only. Passes the VP9QuantizeTest.

VP9QuantizeTest Speed Test (POWER8 Model 2.1)
Full calculations:
C time = 1456 ms, VSX time = 80 ms (18x)

Change-Id: I1b1d6d03b1aeff63640efbdeb222cab857ddd95e
2018-05-14 19:50:11 +00:00
Ranjit Kumar Tulabandu b6f7d69464 Fixes for consistent encoding across recodes of a frame
Change-Id: I094bca857f0fc2c067a4d08d1b36370fe61c25aa
2018-05-14 12:13:53 -07:00
Marco Paniconi 9ebc8605db vp9-realtime: Enable alt_ref at speed 5, for live.
Enable alt_ref and compound prediction at speed 5.

For 1 pass VBR mode, when lag > 0.
Gain for Live set: ~3% gain on average, several
clips have gains ~5-15%.

Encoder fps decrease ~5-10%, on desktop with 4 threads.

For now enable it only for resolutions <= 1280x720.

Change-Id: I25e3d61a2244a3a01962624052c5adf4837965c7
2018-05-14 11:21:21 -07:00
Marco Paniconi dc262df7b5 Merge "vp9-svc: Add condition to asserts on prediction pattern." 2018-05-14 18:13:59 +00:00
Marco Paniconi 5086716a17 vp9-svc: Add condition to asserts on prediction pattern.
Add condition that inter-layer prediction is on.

Change-Id: I84d8c73be4296e7b6b79abb7e5e5e6dbaa6e0600
2018-05-14 09:57:00 -07:00
Jerome Jiang 66aca163f5 VP9: Add speed 9 for subpel search.
Set subpel search stop to 2 when motion vector is non zero.

10% speedup on 1 and 2 threads on Samsung Galaxy S8+.

Change-Id: I7323bb913000229cf60a37495bf88bcc51d0ac96
2018-05-14 08:47:18 -07:00
Johann 595edb9669 update libwebm
Clears "auto_ptr deprecated" warnings when building with
clang v6.0.0

Requires C++11 support.

Change-Id: I5ea2744e73deeaa4e7b2599bacf0b6c9cf355a54
2018-05-14 15:33:14 +00:00
Marco Paniconi c85c5337bf vp9-svc: Update layer_id of frame buffer idx last refreshed.
Remove some unused code and add parameter to keep track
of the layer_id of the frame buffer indices last refreshed.

This is useful for verifying constraints on spatial-temporal pattern,
for fixed/non-flexible mode.

Change-Id: I6957bb43157eb31df49dac1b8245facc043e4a49
2018-05-13 22:46:48 -07:00
Jerome Jiang a19638ac40 Merge "Fix valgrind failure on uninitialized values." 2018-05-12 06:08:16 +00:00
Jerome Jiang d32c38afbf Fix valgrind failure on uninitialized values.
Change-Id: I917d884c9fab9b15bb092de5675f92225f1cdebd
2018-05-11 21:46:55 -07:00
Marco Paniconi 36b998f2eb vp9-svc: Fix pattern update for skip enhancement layers.
Use the same logic as for dropped frames to be consistent.

Change-Id: I16fd317e70514fe8516d9eb350c275d1813e943e
2018-05-11 15:51:19 -07:00
Marco Paniconi 7b3ee0dfa7 Merge "vp9-svc: Fix when whole superframe is dropped." 2018-05-11 17:45:11 +00:00
Marco Paniconi f362bf981c vp9-svc: Fix when whole superframe is dropped.
When the whole superframe is dropped (due to rate control),
don't increment the temporal layer counter.

This is a temporary fix to prevent an issue where temporal
prediction pattern is possibly broken.

Updated svc_datarate tests to handle this case.

Change-Id: Icac44fdc9d0f08a957776c937584db4b2c7927c7
2018-05-11 09:39:29 -07:00
Luc Trudeau 8355eed28a Merge "Faster VSX vpx_quantize_b" 2018-05-11 13:29:17 +00:00
James Zern 69660a4b77 vpx_subtract_block_neon: add explicit cast
quiets ptrdiff_t -> int conversion warning

Change-Id: If6b545a736fc19e48e290961736b1618df97db3e
2018-05-10 23:08:54 -07:00
Jerome Jiang 4cfde1caf0 Merge "Fix vpxdec fuzz failure." 2018-05-11 05:59:24 +00:00
Marco Paniconi 7b13f16fff Merge "vp9: Adjust some early exits in nonrd-pickmode." 2018-05-11 04:45:13 +00:00
Luc Trudeau 31f674d3e4 Remove ppc64 (big endian) from configure
Remove big endian PowerPC 64 from configure, as this build is problematic and
not supported. PowerPC 64 will be limited to little endian (ppc64le).

BUG=webm:1525
BUG=webm:1508

Change-Id: Id6a86d5913192549e03ac8f77879ba7526b752c8
2018-05-11 03:19:38 +00:00
James Zern c9bb592aff Merge "ppc: Add vpx_iwht4x4_16_add_vsx" 2018-05-11 03:17:55 +00:00
James Zern 3d8ea90212 Merge "configure: check for arm_neon.h w/neon builds" 2018-05-11 03:15:31 +00:00
James Zern 2b6b1471c0 Merge "Update vpx_subtract_block_neon()" 2018-05-11 01:47:08 +00:00
Luc Trudeau 81a98509dc Faster VSX vpx_quantize_b
Process 16 coefficients on the first iteration (a full 4x4) and 24 coefficients
on subsequent iterations.

VSX/VP9QuantizeTest.DISABLED_Speed
Before:
4x4   176 ms
8x8    91 ms
16x16  72 ms
After:
4x4   152 ms
8x8    82 ms
16x16  64 ms

Change-Id: I07cb130833504206ccdc5bc12ae5af369364999a
2018-05-10 21:23:39 -04:00
Marco Paniconi 6512048ada vp9: Adjust some early exits in nonrd-pickmode.
Condition some early exits in nonrd-pickmode on the
motion vector, to make sure we always test (0, 0) for
inter-layer prediction.

Change-Id: Id0e790ecc75ccfb7031d3e8786ccdd13781d81fe
2018-05-10 18:00:04 -07:00
Linfeng Zhang 38f6b22498 Update vpx_subtract_block_neon()
Change-Id: Ie2ac06c090c8f92268e9a799e96aa5192a1bdcd2
2018-05-10 16:26:40 -07:00
James Zern 47ef72f1e9 Merge "Update vpx_comp_avg_pred_neon()" 2018-05-10 23:23:47 +00:00
Jerome Jiang f1a5d42c65 Merge "Remove extra line in warnings in ivfdec.c" 2018-05-10 23:19:26 +00:00
James Zern 007971470b configure: check for arm_neon.h w/neon builds
fails at configure time rather than compile time unless using
--enable-external-build

Change-Id: I966ee1000e28fdcc3f4a29759789b056faee0010
2018-05-10 16:01:38 -07:00
Jerome Jiang 67b5054035 Merge "Make upper limit of frame size in ivf reader consistent." 2018-05-10 22:58:30 +00:00
Jerome Jiang 76d588902a Fix vpxdec fuzz failure.
BUG=webm:1495

Change-Id: Ibaee35aa5e8e00847c61e707f2c9b4c0cff23673
2018-05-10 15:06:43 -07:00
Jerome Jiang 401c896531 Remove extra line in warnings in ivfdec.c
The warnings give an extra line which is confusing sometimes.

E.g.

Warning: Read invalid frame size (308164564) // This is for frame 5

Warning: Failed to decode frame 5: Invalid parameter
Warning: Read invalid frame size (1936229463) // This is for frame 6

Warning: Failed to decode frame 6: Invalid parameter
Warning: Read invalid frame size (2282536257)

Change-Id: I1753fa32079deca5c8b534c6ca9a527cc9e491e9
2018-05-10 14:43:00 -07:00
Jerome Jiang f11630d9f7 Make upper limit of frame size in ivf reader consistent.
Change the limit of frame size in ivf reader used by test to make it
consistent with ivf reader used in vpxdec.

Change-Id: I19ab05adf51eca65322e609efdf4d83ad66af847
2018-05-10 14:30:54 -07:00
Luc Trudeau af355dacd5 Merge "VSX version of vpx_quantize_b_vsx" 2018-05-10 13:48:07 +00:00
Marco Paniconi e93571601a vp9-svc: Fix inter-layer early exit threshold.
If the scale factors are 1 (no scaling), set the threshold
for skipping the inter-layer prediction to 0, so we will
more often test this mode.

Improves quality for upper layers for quality layers
in svc mode.

Change-Id: Iaf848d44f6cc153780db861b76517a4cf9672c45
2018-05-09 20:55:33 -07:00
Hui Su 5f3e99166c Merge "Don't use transform domain distortion when eob is 0" 2018-05-09 22:09:11 +00:00
Luc Trudeau 1251bf2a63 VSX version of vpx_quantize_b_vsx
Low bit depth version only. Passes the VP9QuantizeTest.

Change-Id: I6546f872864bd404a7e353348b0554aab1de5bf0
2018-05-09 17:54:27 +00:00
Marco Paniconi 6e3d8000df Merge "vp9-svc: Fix to SVC for frame dropping." 2018-05-09 17:41:33 +00:00
Paul Wilkins ecda0015de Merge "Improved slide show coding" 2018-05-09 11:14:25 +00:00
Linfeng Zhang 7edb5e8a16 Update vpx_comp_avg_pred_neon()
Separate width 4 and 8 cases to reduce jumps in loop in clang.

Change-Id: I6ffc6f1555f2ad08b72a8dba35a78b9fd5f95a73
2018-05-08 17:37:18 -07:00
Marco Paniconi bb06546d56 vp9-svc: Fix to SVC for frame dropping.
When the previous frame is dropped, for the current
spatial layer make sure the lst_fb_idx corresponds
to the buffer index last updated on the (last) encoded
TL0 frame(for same spatial layer).

This is needed to preserve the temporal prediction pattern
for fixed/non-flexible mode under frame dropping.

Change-Id: Ifc8e257beb025654a81580c4da0a181235724508
2018-05-08 16:01:00 -07:00
Linfeng Zhang 2d3e333882 Update SadMxNx4 NEON functions
Change-Id: Ia313a6da00a05837fcd4de6ece31fa1c0016438c
2018-05-08 14:47:21 -07:00
Martin Storsjö 9b06ec4625 Merge "configure: Disable pthread_h if linking failed" 2018-05-08 19:04:12 +00:00
Linfeng Zhang 7d7577ec54 Merge "Add vpx_sum_squares_2d_i16_neon()" 2018-05-08 16:50:33 +00:00
paulwilkins f4091bc30e Improved slide show coding
This patch improves coding of slide shows with fade or other
complex transitions.

Previously, fades and other complex transitions between static "slides"
were sometimes being incorrectly marked such that they were coded
as a single static slide rather than two slides with a transition.

As the initial key frame for the first slide is not necessarily a good
predictor of the second slide and ARFs were turned off, this led to a
poor visual and metrics outcome in some such cases.

This patch allows for long GF groups in static sections before and after
a complex transition (instead of just with simple slide transitions) with
one or more normal ARF groups during the transition. It also enforces a
single "normal" length GF group after the transition before any extended
group is allowed. The reason for this is that the ARF that spans the
transition may not have a very high quality and hence may not act as a
good GF for the long static section that follows.

Change-Id: Ica1f979e27d8a0625f3cebf7b7cf6d69edccaba9
2018-05-08 12:59:48 +01:00
Linfeng Zhang 2d36522991 Update vpx_sum_squares_2d_i16_sse2()
Change-Id: I5a2ca2ed246277cf6b1ef2ffac34ce5c40aa0158
2018-05-07 14:33:42 -07:00
Linfeng Zhang 3a0f25ea3a Add vpx_sum_squares_2d_i16_neon()
Perf shows CPU time of this function dropped from 0.81% to 0.15%.

Change-Id: I8a7649ca5c15af2fc65cfb848f5befa0cc5e64f2
2018-05-07 13:17:34 -07:00
Hui Su 11880f6a3d Don't use transform domain distortion when eob is 0
When eob is 0, pixel domain distortion is more accurate and efficient.

This mainly affects speed >= 2. Speed 0 always uses pixel domain
distortion; speed 1 uses it most of the time.

Compression impact (negative means gain):
        speed 2   speed 3   speed 4
lowres   -0.04%   -0.06%    -0.06%
midres   -0.10%   -0.10%    -0.20%
hdres    -0.01%   -0.03%    -0.06%

Encoding speed is about neutral.

Change-Id: I77b957658deeaad57381fd13afc11bacdec8c08f
2018-05-04 13:01:06 -07:00
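Editorial sketch for 11880f6a3d: when eob is 0 no coefficients are coded, so the reconstructed block equals the prediction and the pixel-domain distortion reduces to the sum of squared errors between source and prediction. The helper name and signature below are hypothetical and are not taken from the libvpx sources.

```c
#include <stdint.h>

/* Sum of squared errors between a source block and its prediction.
 * With eob == 0 the reconstruction equals the prediction, so this is
 * the exact pixel-domain distortion of the block. */
static int64_t pixel_domain_sse(const uint8_t *src, int src_stride,
                                const uint8_t *pred, int pred_stride,
                                int width, int height) {
  int64_t sse = 0;
  int r, c;
  for (r = 0; r < height; ++r) {
    for (c = 0; c < width; ++c) {
      const int diff = src[r * src_stride + c] - pred[r * pred_stride + c];
      sse += diff * diff;
    }
  }
  return sse;
}
```

No forward or inverse transform is involved in this path, which is consistent with the commit's claim that it is both more accurate and more efficient than a transform-domain estimate for such blocks.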
Marco Paniconi 28801f91c4 Merge "vp9-svc: Reset fb_idx for unused references." 2018-05-04 17:12:59 +00:00
Martin Storsjo e63c29c760 configure: Disable pthread_h if linking failed
When doing both check_header and check_lib, the check_header call
will already enable pthread_h if the header was found. This was
overlooked when the pthread linking check was amended into a header
check and a separate linking check in 9b7d4cce63.

This brings back the same result as the original check in 38dc27cc6.

Change-Id: I0efb38f5780f7c79e2eb2b14290d6094096ea222
2018-05-04 09:37:51 +03:00
Marco Paniconi 04d8700862 vp9-svc: Add memset on the svc fb_idx.
The memset is added to better handle frame drops
with the GET_SVC_REF_FRAME_CONFIG control.

There is an issue with some tests in bypass mode,
so condition it on that for now.

Change-Id: I2635037143f14ff62a36be7c22b2b604a0c1efc2
2018-05-03 19:44:56 -07:00
Marco Paniconi 0a334921f7 vp9-svc: Reset fb_idx for unused references.
For fixed (non-flexible) SVC mode.
No change in behavior.

Needed for future change to make Intra-only frame work.

Change-Id: I91e18776e7ef27c9c6fcbc8d5f64764d9cc3d9a9
2018-05-03 19:33:22 -07:00
Marco Paniconi 1d93f58b1f vp9-svc: On key frame update all reference slots for SVC.
Key frame updates the slots corresponding to the 3 references
last/golden/altref, but for SVC, where more reference buffers
may be in use, especially when dynamically switching up/down in layers,
make sure we update all 8 slots on key frame.

Change-Id: Ifcca12608f420d5bae32b92794a3afe9b6369f77
2018-05-03 14:38:08 -07:00
Linfeng Zhang dd1411624d Merge "Clean switch cases in vp9 encoder" 2018-05-02 20:01:59 +00:00
xiwei gu 767d23f361 Merge "vp9: [loongson] optimize vpx_convolve8 with mmi" 2018-05-02 08:54:24 +00:00
Linfeng Zhang 28c563a11c Clean switch cases in vp9 encoder
To save a branch.

Change-Id: Ifa2be7583e95c6991784731c654bbd4cce31e993
2018-05-01 17:57:51 -07:00
James Zern e4408a07be Merge "Update variance_test.cc" 2018-04-28 05:52:45 +00:00
Linfeng Zhang be5fde6c4b Update variance_test.cc
Change-Id: I1301781f0f2528a61ad2b5c2828404b2b3e3e8b9
2018-04-27 17:56:07 -07:00
Marco Paniconi e4643a9904 vp9-svc: Remove the memset on the svc fb_idx.
This fixes failures on the datarate tests for
temporal layers with frame dropping.

The memset was only added to better handle frame drops
with the GET_SVC_REF_FRAME_CONFIG control from 43c58df3.
So ok to remove it for now.

Change-Id: I256d9ac4278b93fe6f39b94cce2e458a1a5eff69
2018-04-27 15:47:44 -07:00
Marco Paniconi f444e5743d Merge "VP9 SVC: Add new level to constrain inter-layer pred." 2018-04-27 04:55:22 +00:00
Hui Su 60e1a1befc Merge "Respect MV limit in vp9_int_pro_motion_estimation()" 2018-04-27 04:41:52 +00:00
Jerome Jiang 9cf725a4f4 VP9 SVC: Add new level to constrain inter-layer pred.
Add another level (INTER_LAYER_PRED_ON_CONSTRAINED) to the
inter-layer prediction control. This new level enforces the
condition that a given spatial layer S can only do inter-layer
prediction from the previous spatial layer (S - 1) from the same
time/superframe.

BUG=webm:1526

Change-Id: I0a1ec95b2c220c7b13a9a425d5fb0a8814c23c70
2018-04-26 20:51:58 -07:00
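Editorial sketch for 9cf725a4f4: the condition enforced by INTER_LAYER_PRED_ON_CONSTRAINED can be written as a small predicate. The function name and parameters are hypothetical; only the rule itself (layer S may predict only from layer S - 1 of the same superframe) comes from the commit message.

```c
/* Hypothetical predicate: may the current spatial layer use the given
 * reference for inter-layer prediction under the constrained level?
 * Allowed only if the reference is the previous spatial layer (S - 1)
 * and belongs to the same superframe as the current frame. */
static int inter_layer_pred_allowed(int cur_spatial_layer,
                                    int ref_spatial_layer,
                                    int cur_superframe,
                                    int ref_superframe) {
  return cur_spatial_layer > 0 &&
         ref_spatial_layer == cur_spatial_layer - 1 &&
         ref_superframe == cur_superframe;
}
```

The base layer (S = 0) has no lower layer, so the predicate is always false for it.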
Marco Paniconi ede1efa55f vp9-svc: Remove unneeded call and init some parameters.
Remove the unneeded vp9_copy_flags_ref_update_idx(cpi),
and initialize the struct parameters needed for the
GET_SVC_REF_FRAME_CONFIG. This init is useful for the case
for spatial layer frame drops.

Change-Id: If89e8349f6246c33720ecbb758d41a932d21e496
2018-04-26 18:03:15 -07:00
Hui Su a47376c52d Merge "Do one level less of transform search for large blocks" 2018-04-26 23:58:33 +00:00
Hui Su 34353da674 Respect MV limit in vp9_int_pro_motion_estimation()
Change-Id: I08cb072a32e06c6452eca068b2f7ef7287f221e6
2018-04-26 13:59:42 -07:00
Hui Su e4dfacbf6c Do one level less of transform search for large blocks
If block size is larger than 32x32, search transform size for one level
less than the other blocks.

This mainly affects speed 0 and 1, as speed >= 2 uses largest transform
size (except for keyframes and alt-ref frames).

Compression (negative means gain):
        speed 0     speed 1
lowres  -0.007%       0.00%
midres   0.023%     -0.011%
hdres    0.002%     -0.016%

Encoder speed:
Tested on crowd_run_1080p 30 frames
Fixed QP = 30, speed 0: 582.5s -> 564.6s
               speed 1:  75.0s ->  73.3s

Change-Id: I46622efafe0e88d502efa1480a5324ead1d1e8d0
2018-04-25 12:31:48 -07:00
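Editorial sketch for e4dfacbf6c: one way to express "one level less of transform search for blocks larger than 32x32". The function and its parameters are hypothetical; the actual encoder code is not shown in this log.

```c
/* Hypothetical: number of transform-size levels to search for a block.
 * base_depth is the depth used for blocks up to 32x32; blocks larger
 * than 32x32 search one level less, never going below zero. */
static int tx_search_depth(int block_width, int block_height,
                           int base_depth) {
  const int larger_than_32x32 = block_width > 32 || block_height > 32;
  const int depth = larger_than_32x32 ? base_depth - 1 : base_depth;
  return depth < 0 ? 0 : depth;
}
```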
Hui Su 0145b5ee67 Merge "Calculate transform size cost once per frame" 2018-04-25 16:25:58 +00:00
Hui Su 6aca1d2d94 Merge "Add speed feature to control tx size search depth" 2018-04-25 16:25:40 +00:00
Jerome Jiang 57755aad5b Merge "vp9-svc: Add GET control to get SVC pattern info." 2018-04-25 04:59:10 +00:00
Marco Paniconi 43c58df34e vp9-svc: Add GET control to get SVC pattern info.
Copy ref frame index in SVC struct after set in encoder.
Rename ext_{lst,gld,alt}_fb_idx to {lst,gld,alt}_fb_idx.

Bump up ABI version.

BUG=webm:1527

Change-Id: I06209040cb83d374030f40b79f0b36b0efe9f97d
2018-04-24 21:08:10 -07:00
guxiwei-hf@loongson.cn 15dad6bcbc vp9: [loongson] optimize vpx_convolve8 with mmi
1. vpx_convolve_avg_mmi
2. vpx_convolve8_avg_horiz_mmi

Change-Id: Ie544aac45b4b1c0a0e51b44b650189ae5e88aee1
2018-04-25 09:55:05 +08:00
Jerome Jiang 170013f19b Fix vp8_multi_resolution_encoder test failure.
BUG=webm:1528

Change-Id: I8eb8278c2e61577159308dd5329be0577b82d1a6
2018-04-24 17:28:44 -07:00
Hui Su 5c0a118d86 Calculate transform size cost once per frame
Instead of doing it in every transform search loop.

Change-Id: I12dc402a6633d1a27d32cb6b58710b8c0ebf0fd4
2018-04-24 12:02:23 -07:00
Hui Su df1e06ed0b Remove get_tx_probs2()
This function is redundant.

Change-Id: I7651fc34787c09e59cb1366495f6b525dec8510d
2018-04-23 17:22:19 -07:00
Hui Su 2eac6df788 Add speed feature to control tx size search depth
Set the max depth as 2 for speed 0.

Compression (negative means gain):
        speed 0     speed 1
lowres   -0.01%      0.00%
midres    0.05%     -0.01%
hdres    -0.01%      0.01%

Encoding speed gain:
Tested on crowd_run_1080p 30 frames
Fixed QP = 20, speed 0: 669.7s -> 656.1s
               speed 1: 104.5s -> 101.5s
Fixed QP = 40, speed 0: 440.7s -> 435.8s
               speed 1:  47.7s ->  45.1s

Change-Id: I61bc13818c72317b9f1d596727d54a906b20c012
2018-04-23 17:01:04 -07:00
James Zern 3b460db214 Merge changes I0202a556,Iebb98f3b
* changes:
  Update variance avx2 functions
  Update variance sse2 functions
2018-04-19 21:29:42 +00:00
Martin Storsjo 9b7d4cce63 configure: Use both check_header and check_lib for pthreads
check_lib can be a stub that always returns true - make sure to
still use check_headers as before 38dc27cc6.

Change-Id: I5d471de56b16c015a0b686fa6c6caefa35bb89b4
2018-04-19 09:32:40 +03:00
Marco Paniconi d547aced6c vp9:aq-mode=3: Keep perc_refresh fixed for screen content mode.
Don't allow for changing the perc_refresh with screen-content
mode, as this helps reduce some overshoot for static content.

Change-Id: Idbe1849e7a14ef18fda20bee6dced809f134b7f7
2018-04-18 20:03:54 -07:00
Marco Paniconi 5464395948 Merge "vp9: Rate control fix for CBR mode." 2018-04-19 02:56:26 +00:00
Marco Paniconi 2b369bf63c Merge "vp9: Remove limit on QP on key frame for CBR." 2018-04-19 01:39:52 +00:00
Marco Paniconi 581b6b826e vp9: Add condition of real-time mode to scene detection.
This was removed by error from the change: ce11afb, and
made some datarate tests fail.

Change-Id: I0c29e1f5aede8f56ce835b25fed0528722350241
2018-04-18 17:48:54 -07:00
Marco Paniconi 740782f8cd vp9: Rate control fix for CBR mode.
For CBR mode: modify the qp clamping to allow q to respond
faster to overshoot. Can reduce some spurious overshoot events
observed in screen content coding.

Change-Id: I0b3f54b0d1b4086182f834e557a4121950b176d4
2018-04-18 17:21:51 -07:00
Jerome Jiang d30966609b Merge "SVC: Fix duplicated run of svc datarate tests." 2018-04-18 23:15:57 +00:00
Marco Paniconi 8a22a21cc0 vp9: Remove limit on QP on key frame for CBR.
This piece was carried over from the VBR routine; for CBR
mode we don't want to apply this limit.

Change-Id: Ib9e9937eabeff8cfd30e11c9bd17444cc2b591aa
2018-04-18 15:57:42 -07:00
Jerome Jiang 2c6736506e SVC: Fix duplicated run of svc datarate tests.
Change-Id: I3f4e45b398009852f1183943461625d621c4eb80
2018-04-18 14:37:21 -07:00
Marco Paniconi ce11afb0e0 vp9: Changes for scene detection overshoot and SVC.
Refactor the scene detection for 1 pass cbr to allow the
scene detection to be checked once per superframe (on the base layer),
using the full resolution sources.

If scene change is detected: check for re-encoding due to
large overshoot for all spatial layers within the superframe.

Add speed feature to control the re-encode step.
Keep the re-encode step on for now.

Small change in nonrd_pickmode to remove the possible skip of golden
reference for SVC, when the high_source_sad is set for the superframe.

Change only affects SVC encoding with screen-content mode enabled.

Change-Id: If4cfb52cb0dd0f0fce1c4214fa8b413f8f803d56
2018-04-18 12:39:56 -07:00
Martin Storsjö 03fa701873 Merge "configure: Test linking pthreads before using it" 2018-04-18 06:23:09 +00:00
Jerome Jiang 449e84ff06 Merge "Clean up svc comment in vp9_bitstream.c" 2018-04-17 23:30:23 +00:00
Linfeng Zhang 78ba83bb91 Update variance avx2 functions
Old vs New
Variance 64x64 time:  1145 ms           797 ms
Variance 64x32 time:  1200 ms           831 ms
Variance 32x32 time:  1228 ms          1135 ms
Variance 32x16 time:  1374 ms          1491 ms
Variance 16x16 time:  1688 ms          1571 ms

sse2 vs avx2
Variance 32x64 time:  1645 ms           957 ms
Variance 16x32 time:  2031 ms          1243 ms
Variance 16x8  time:  3071 ms          2275 ms

Change-Id: I0202a556e4629977d647e219c2e897e1ab6accb2
2018-04-17 13:43:16 -07:00
Marco Paniconi aaaf9215e2 vp9: Remove this_key_frame_forced setting for CBR.
The setting this_key_frame_forced can lead to large key frame sizes,
not suitable for CBR rate control used for RTC.

Change-Id: Idf6d2bf385d5b1494f4bf783f623b7c202f34e55
2018-04-17 10:37:47 -07:00
Linfeng Zhang 55ca875e6b Update variance sse2 functions
Old vs New
Variance 64x64 time:   197 ms   143 ms
Variance 64x32 time:   200 ms   146 ms
Variance 32x64 time:   203 ms   140 ms
Variance 32x32 time:   214 ms   152 ms
Variance 32x16 time:   243 ms   153 ms
Variance 16x32 time:   234 ms   197 ms
Variance 16x16 time:   205 ms   205 ms
Variance 16x8  time:   228 ms   222 ms
Variance 8x16  time:   228 ms   232 ms
Variance 8x8   time:   282 ms   240 ms
Variance 8x4   time:   506 ms   341 ms
Variance 4x8   time:   518 ms   415 ms
Variance 4x4   time:   604 ms   628 ms

Observed vp9 encoder speed up when encoding a 720p video.

Change-Id: Iebb98f3b3d8adbc11a733a529d8427ce3d2a5314
2018-04-17 10:27:34 -07:00
Jerome Jiang 0ae1628f7e vp9: fix multiple run for vp9 datarate tests using one bitrate.
Change-Id: I1458bf25fadc23e8be5a9532a153d7129a53accf
2018-04-16 14:31:49 -07:00
Jerome Jiang d46f614ca9 Merge "vp9: refactor vp9 datarate test for better sharding." 2018-04-15 07:03:16 +00:00
Martin Storsjo 38dc27cc6d configure: Test linking pthreads before using it
This avoids enabling pthreads if only pthreads-w32 is available.
pthreads-w32 provides pthread.h but has a link library with a
different name (libpthreadGC2.a).

Generally, always using win32 threads when on windows would be
sensible.

However, libstdc++ can be configured to use pthreads (winpthreads), and
in these cases, standard C++ headers can pollute the namespace with
pthreads declarations, which break the win32 threads headers that
declare similar symbols - leading us to prefer pthreads on windows
whenever available (see d167a1ae and bug 1132).

Change-Id: Icd668ccdaf3aeabb7fa4e713e040ef3d67546f00
2018-04-14 23:42:21 +03:00
Jerome Jiang a6ff396576 vp9: refactor vp9 datarate test for better sharding.
Change-Id: Icfaf29e1ca847ba9e3748700c9e09383ce8d1f65
2018-04-13 22:39:30 -07:00
Jerome Jiang 2acb7fc1d3 Clean up is_two_pass_svc.
Change-Id: I9e92616471be380d3ba4e2b85399d7eb9f687d2f
2018-04-13 15:36:39 -07:00
Jerome Jiang 266d495f22 Clean up svc comment in vp9_bitstream.c
SVC related conditions were removed a long time ago and comments should be
removed too.

Change-Id: Iff3f3b6815d85ae5a69994932a4893cd1f831ce3
2018-04-13 13:56:45 -07:00
Jerome Jiang 3cc90a6d18 Merge "vp9 svc: Refactor svc datarate test for better sharding." 2018-04-13 03:40:50 +00:00
Jerome Jiang baefd61752 Merge "Silence warning when built with --enable-internal-stats." 2018-04-13 03:40:20 +00:00
Jerome Jiang 546a210259 Silence warning when built with --enable-internal-stats.
Change-Id: I3a600a9baf2b8e46c109f4ec2b5bd6bafda4bf58
2018-04-12 15:07:03 -07:00
Jerome Jiang a08f27150d vp9 svc: Refactor svc datarate test for better sharding.
Wrap denoiser tests under config flags.

Change-Id: I6175c3c9d8b5b079ad35a55553383145db58a10f
2018-04-12 14:40:45 -07:00
Paul Wilkins 768b018d18 Merge "Add extra case to wq_err_divisor()" 2018-04-12 10:42:50 +00:00
James Zern c9a459216d vpx_image: remove unused image formats
libvpx only emits:
VPX_IMG_FMT_{I420,I422,I440,I444,I42016,I42216,I44016,I44416}
and additionally supports YV12 as input.

interleaved yuv, rgb and alpha formats are unused.

Change-Id: Ie2ab1099e950c6e696f475d46882f5c47a174042
2018-04-09 16:25:50 -07:00
Marco Paniconi be5df60801 Merge "vp9-svc: Make constrained_layer_drop default for svc." 2018-04-09 18:05:51 +00:00
Jerome Jiang 30a5cd80e2 Merge "Fix settings for num of tiles in samples & tests." 2018-04-09 17:03:59 +00:00
Marco Paniconi 0ea4e229a7 vp9-svc: Make constrained_layer_drop default for svc.
Switch the order of constrained and layer drop mode,
and keep constrained_layer_drop as the default.
Update the svc datarate tests.

Change-Id: I764270f7b4964b87b0cd3da6c2f96a628f212a30
2018-04-09 09:53:28 -07:00
Jerome Jiang be4561248d Fix settings for num of tiles in samples & tests.
The control is set by log2 of the number of threads (such that the number of
tiles is the same as the number of threads).

Thus it should be log2(num_threads) instead of (num_threads >> 1).

Change-Id: I2ccec5557e660048dad3e561534e1c74fc8eec1f
2018-04-06 12:52:53 -07:00
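Editorial sketch for be4561248d: a minimal illustration, assuming the control in question takes the base-2 logarithm of the tile-column count (as VP9E_SET_TILE_COLUMNS does). The helper names are hypothetical and do not come from the libvpx samples or tests.

```c
/* Hypothetical helper: floor(log2(n)) for n >= 1, computed by shifting. */
static int floor_log2(unsigned int n) {
  int result = 0;
  while (n > 1) {
    n >>= 1;
    ++result;
  }
  return result;
}

/* Value the old code passed to the control: num_threads >> 1. */
static int tile_cols_log2_old(unsigned int num_threads) {
  return (int)(num_threads >> 1);
}

/* Value the fix passes: log2(num_threads). */
static int tile_cols_log2_new(unsigned int num_threads) {
  return floor_log2(num_threads);
}
```

With 8 threads the old expression yields 4, which requests 2^4 = 16 tile columns, while the corrected one yields 3 (8 tile columns). The two expressions happen to agree for small thread counts such as 2 and 4, which may be why the mistake went unnoticed.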
Marco Paniconi 7255ff9b85 vp9-svc: Hybrid search on spatial layers whose base is key.
For spatial layers whose base is a key frame, i.e., when
svc.layer_context[cpi->svc.temporal_layer_id].is_key_frame = 1,
allow for hybrid search, similar to what we do on key frames.

For small blocks (<= 8x8) rd-based intra search will be used,
otherwise non-rd pick mode is used.

Feature is controlled by nonrd_keyframe, which is set to 1
for now on non-base spatial layers, so this change has
currently no effect.

Small change only when inter-layer prediction is off, as we now
call vp9_pick_intra_mode instead of vp9_pick_inter_mode on key frame.
But this change is very small/insignificant.

Change-Id: I5372470f720812926ebbe6c4ce68c04336ce0bdd
2018-04-06 11:36:48 -07:00
Johann Koenig cfc6dc8db3 Merge "remove support for yuv 411" 2018-04-06 15:26:59 +00:00
Marco Paniconi 4934e52e43 Revert "vp9-svc: Fix to first superframe when inter_layer is off."
This reverts commit 5cc8df5bcf.

Reason for revert:
We need to do this on all key frames in the stream (not just the first one). Will make another cleaner change for this.

Original change's description:
> vp9-svc: Fix to first superframe when inter_layer is off.
> 
> When the application selects the setting INTER_LAYER_PRED_OFF
> each spatial stream should be decodeable separately.
> For this we need to force key frames on all spatial layers
> on the first superframe.
> 
> In order to maintain the quality at the beginning of the stream
> the active_worst for spatial layer of the second superframe is set
> to the last_QP of the corresponding spatial layer of the first superframe.
> Also make sure nonrd_keyframe is set for non-base spatial layers.
> 
> Change only affects SVC mode with number_spatial_layers > 1 and
> svc->disable_inter_layer_pred == INTER_LAYER_PRED_OFF.
> And only affects first and second frame of sequence.
> 
> Change-Id: I8ee9a0873ab1d3a02515774571f719617771ad41

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com

Change-Id: If73d9f3932224fc6751e773763adf7e8ee67d17f
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2018-04-06 05:41:56 +00:00
Marco Paniconi 5cc8df5bcf vp9-svc: Fix to first superframe when inter_layer is off.
When the application selects the setting INTER_LAYER_PRED_OFF
each spatial stream should be decodeable separately.
For this we need to force key frames on all spatial layers
on the first superframe.

In order to maintain the quality at the beginning of the stream
the active_worst for spatial layer of the second superframe is set
to the last_QP of the corresponding spatial layer of the first superframe.
Also make sure nonrd_keyframe is set for non-base spatial layers.

Change only affects SVC mode with number_spatial_layers > 1 and
svc->disable_inter_layer_pred == INTER_LAYER_PRED_OFF.
And only affects first and second frame of sequence.

Change-Id: I8ee9a0873ab1d3a02515774571f719617771ad41
2018-04-05 21:25:55 -07:00
Marco Paniconi a4e453f668 vp9-svc: Fix to disable cyclic refresh on key superframes.
Cyclic refresh is disabled on key frames, but we did not
disable it for spatial layers whose base is a key frame
(i.e., on a key-superframe).

This fix means a generally somewhat lower frame-level QP will be
used for those spatial layers whose base is a key frame,
which will generally mean slightly better quality for the
key-superframes.

Change-Id: Idf090651aa2f5856fb6696c89198a9f6d5d50280
2018-04-05 15:58:36 -07:00
Johann ac4dc51027 update codereview.settings
Using pdfium as a reference:
https://pdfium.googlesource.com/pdfium/+/master/codereview.settings

Change-Id: I30874cf9f1d575325c32342146137a1952db91ba
2018-04-05 12:18:38 -07:00
Johann 000c276ffa ios configure: quiet shell warning
Generating file lists on a non-mac with:
--target=x86-iphonesimulator-gcc --enable-external-build
the lack of xcrun would cause a warning to print:
libvpx/build/make/configure.sh: line 1397: [: : integer expression expected

Change-Id: I4623b6c5b65296bc71986cd042823f4be9427b42
2018-04-04 11:53:11 -07:00
Marco Paniconi 9336197663 Merge "vp9-svc: Fix in choose_partitioning for different scaling." 2018-04-04 05:59:00 +00:00
Marco Paniconi 742e93f030 Merge "Fix to svc sample encoder for visual studio build." 2018-04-04 02:15:56 +00:00
Marco Paniconi c9b6c5d5ad vp9-svc: Fix in choose_partitioning for different scaling.
In the SVC encoder LAST ref frame should be the last temporal
reference at the same resolution. This is the case for the default/fixed
patterns, but may not be the case for arbitrary pattern in flexible mode.

Add check that the LAST reference frame has same resolution as the current frame.
If the reference scale for LAST is different from current treat the current
frame as key frame just for the purpose of superblock partitioning.
This avoids potential segfault in vp9_int_pro_motion_estimation() for different
scaled reference.

Change-Id: I4276ff616de46cd4e12c73316f85ae313f170242
2018-04-03 16:21:44 -07:00
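Editorial sketch for c9b6c5d5ad: the added check can be thought of as a predicate on the LAST reference's dimensions. The function name and parameters are hypothetical; the real check lives in choose_partitioning and may be expressed differently (for example via scale factors).

```c
/* Hypothetical predicate: should superblock partitioning treat the
 * current frame like a key frame? True for real key frames, and also
 * when the LAST reference has a different resolution than the current
 * frame, so that no motion search runs against a scaled reference. */
static int treat_as_key_for_partitioning(int is_key_frame,
                                         int cur_width, int cur_height,
                                         int last_width, int last_height) {
  return is_key_frame || last_width != cur_width ||
         last_height != cur_height;
}
```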
Linfeng Zhang 47f22fc5fb rm CONVERT_TO_SHORTPTR in vpx_highbd_comp_avg_pred
BUG=webm:1388

Change-Id: I1d0dd9af52a1461e3e2b2d60e8c4b6b74c3b90b0
2018-04-03 15:36:41 -07:00
Marco Paniconi cb63aefa45 Fix to svc sample encoder for visual studio build.
Fix to sample encoder, for visual studio build failure:
conversion from 'uint64_t' to 'int'.

Change-Id: I385ab8482e1ee97da9872437f8286d9071e38e0e
2018-04-03 15:09:09 -07:00
Johann b2c6e1410f remove support for yuv 411
Previously we attempted to convert 411 input. Remove support
because malformed 411 input can cause the conversion to crash.

BUG=webm:1386

Change-Id: I3d41465a94867ee7f8eaa43fb76beb41f8fa644b
2018-04-02 15:37:26 -07:00
Marco Paniconi ee37046f1b vp9-svc: Fix to svc sample encoder for write_out.
When writing out stream for spatial layer N,
make sure to include all spatial layers up to N.

Fixes an issue with the streams when frame dropping occurs.

Change-Id: I1e20b7dac6b94dcda751043541dd8a12f7df6d8c
2018-04-02 12:57:35 -07:00
Johann Koenig 99fa889f91 Merge "helper script for sanitizer testing" 2018-04-02 18:30:28 +00:00
Linfeng Zhang c3ba5c521e Merge changes I5704bd66,I4d548e97
* changes:
  Shrink size of mode_map in struct TileDataEnc
  Update sad4d x86 functions
2018-04-02 16:05:05 +00:00
James Zern eae638db15 Merge "vp9_datarate_test: relax over shoot constraints" 2018-03-31 00:49:11 +00:00
Jerome Jiang c706c02291 Merge "VP9 SVC: Write bitstream for each spatial layer in sample." 2018-03-30 23:57:21 +00:00
James Zern 2b9c017cf0 vp9_datarate_test: relax over shoot constraints
in BasicRateTargetingVBRLagZero and
BasicRateTargetingVBRLagNonZeroFrameParDecOff after:
e0b28ad69 Add extra case to wq_err_divisor()

BUG=webm:1512

Change-Id: Id181613cc191ff2a2281deffe141efb982501edf
2018-03-30 16:44:41 -07:00
Jerome Jiang 1d6f930517 VP9 SVC: Write bitstream for each spatial layer in sample.
Added control for denoiser in the sample SVC encoder.

Change-Id: I8e62aa2fc13a943eb110cb33e419e912a898bbc7
2018-03-30 15:31:35 -07:00
Jerome Jiang 01b19a1c3e Code cleanup for datarate tests.
Add/Remove static to functions. Name change.

Change-Id: I5de3efc23cd151fe8e70fe67a7a11acfcfa707dc
2018-03-30 10:47:41 -07:00
Jerome Jiang 4127799a91 Split datarate_test.cc to vp8, vp9, svc ones.
As we add more tests to datarate_test.cc, it is growing larger and it is
hard to find a specific test.

Split it to vp8, vp9 and svc ones.

Change-Id: Ie8c302010cf304a95554bee19d87ddc90498d0fb
2018-03-29 16:40:20 -07:00
James Zern d636fe53af Merge "test: use testing::*tuple instead of std::tr1" 2018-03-29 19:01:34 +00:00
Marco Paniconi 81e78a1d5f Merge "vp9-svc: Fix in pickmode for key frames." 2018-03-29 18:49:26 +00:00
Jerome Jiang d467bbe11e Merge "VP9 SVC: Add enum type for framedrop_mode." 2018-03-29 18:44:35 +00:00
Jerome Jiang 70b86af22e VP9 SVC: Add enum type for framedrop_mode.
Change-Id: I3d4697b00729553e0860762b9264e29b8a89b9d4
2018-03-29 10:49:06 -07:00
Johann 20521c394c helper script for sanitizer testing
source tools/set_analyzer_env.sh <sanitizer>
will set the compiler, flag, and sanitizer variables necessary to build
and run a variety of sanitizers.

Change-Id: I5dd2ae947cb337d5ccf2a11e9fe87991bc8ba0c8
2018-03-29 06:58:30 -07:00
paulwilkins e0b28ad696 Add extra case to wq_err_divisor()
Add extra case for 360P and smaller.
This hurts a little in psnr for the derf cif set but helps a little
in terms of average rate accuracy. Most clips come in a little
smaller with this patch.

No impact on larger formats.

Change-Id: I5056246cb53b90f961ff9ea5813937f33778aa4c
2018-03-29 12:52:15 +01:00
xiwei gu 5476ab095f Merge "vp9: [loongson] optimize vpx_convolve8 with mmi." 2018-03-29 01:05:33 +00:00
Marco Paniconi fe72ba15ac vp9-svc: Fix in pickmode for key frames.
For the fixed/default SVC patterns, GOLDEN is the
spatial reference, except on key frames, where LAST
is labeled as the spatial reference.

The current code was assuming GOLDEN is always the
spatial reference for the purpose of selecting the
subpel motion (due to the downsampling filter).

Fix is to make sure flag_svc_subpel is set and used
with spatial_ref, which is labeled as the proper
spatial reference before entering mode check.

Some quality improvement on key frames.
Change-Id: Id236bcd47055b035731cc910ed84449d7e29f50c
2018-03-28 16:57:08 -07:00
Marco Paniconi 382afcab98 Merge "vp9-svc: Modify logic for frame dropping with spatial layers." 2018-03-28 21:29:35 +00:00
Marco Paniconi e358cf0a43 vp9-svc: Modify logic for frame dropping with spatial layers.
In the constrained framedrop mode for svc: modify the buffer check
condition relative to (non-zero) dropmark to include upper spatial layers,
in addition to the current spatial layer.

But keep the single layer check if the buffer goes below zero, since
in this case (buffer underflow) we should force drop of that layer
regardless of upper layers.

Change-Id: Id277f0b4a3ae6275effdd5f5f0c80e3229c17424
2018-03-28 13:00:00 -07:00
Linfeng Zhang cd83802885 Shrink size of mode_map in struct TileDataEnc
To reduce the memcpy() cycles in vp9_rd_pick_inter_mode_sb().
The maximum value of mode_map is (MAX_MODES - 1) = 29.

Change-Id: I5704bd66838ea0b075f0afb001f5cbebfd3f1602
2018-03-28 12:50:31 -07:00
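Editorial sketch for cd83802885: since the largest stored value is MAX_MODES - 1 = 29, a one-byte element type is sufficient. The sketch assumes the table went from int to uint8_t elements and that its first dimension is 13; both are assumptions for illustration, not facts stated in the commit message.

```c
#include <stdint.h>

#define MAX_MODES 30   /* from the commit message: MAX_MODES - 1 = 29 */
#define BLOCK_SIZES 13 /* assumption: first dimension of the table */

/* Compile-time check that the largest stored value fits in one byte;
 * the array size becomes -1 (a compile error) if it does not. */
typedef char mode_index_fits_in_uint8[(MAX_MODES - 1) <= UINT8_MAX ? 1 : -1];

typedef int mode_map_wide[BLOCK_SIZES][MAX_MODES];       /* assumed old layout */
typedef uint8_t mode_map_narrow[BLOCK_SIZES][MAX_MODES]; /* assumed new layout */
```

Under these assumptions the narrow table occupies 390 bytes instead of 390 * sizeof(int), so any memcpy() of the enclosing struct moves correspondingly fewer bytes.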
Linfeng Zhang 39de45d3cc Update sad4d x86 functions
Speed change is marginal.

Change-Id: I4d548e9763ce43bd546f19132202f7a8509a32bf
2018-03-28 12:49:12 -07:00
James Zern db49a22cfa test: use testing::*tuple instead of std::tr1
googletest imports tuple into testing to allow for compatibility across
c++ versions where tuple may be in std::tr1 or std. fixes deprecation
warnings under visual studio 2017

Change-Id: Id78b372d5478b12d8c8f63fd3f2166fec25aa8be
2018-03-28 12:45:35 -07:00
Linfeng Zhang debd86ec82 Merge "Add speed test in SADx4Test" 2018-03-28 19:39:37 +00:00
Linfeng Zhang 8edd5051aa Add speed test in SADx4Test
Change-Id: I42dd3df8c13c0a6d08ce28e27e8917b5d831fc1a
2018-03-28 11:13:28 -07:00
gxw 25d9adb74b vp9: [loongson] optimize vpx_convolve8 with mmi.
1. vpx_convolve8_vert_mmi
2. vpx_convolve8_horiz_mmi
3. vpx_convolve8_mmi
4. vpx_convolve8_avg_mmi
5. vpx_convolve8_avg_vert_mmi

Change-Id: I41a6b3b4f327d6b67d282e0163cfa0aee8648abe
2018-03-28 18:11:16 +00:00
Marco Paniconi 872e34ae8d vp9-svc: Add check in datarate unittests for frame-dropping.
Add verification for constrained svc framedrop mode: check that
if a given spatial layer is dropped, all upper layers must be dropped.

Change-Id: I9b4821b23c95d1d9d0c031a41af19984647ec5dc
2018-03-28 10:30:59 -07:00
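Editorial sketch for 872e34ae8d: the invariant the test verifies can be stated as a small function over per-layer drop flags. The name and the array representation are hypothetical; the actual unit test is written against the encoder's output and is not reproduced here.

```c
/* Hypothetical check of the constrained-drop invariant.
 * dropped[i] is non-zero if spatial layer i was dropped in a superframe.
 * Returns 1 if every layer above the first dropped layer is also
 * dropped, 0 otherwise. */
static int constrained_drop_holds(const int dropped[], int num_layers) {
  int seen_drop = 0;
  int i;
  for (i = 0; i < num_layers; ++i) {
    if (seen_drop && !dropped[i]) return 0;
    if (dropped[i]) seen_drop = 1;
  }
  return 1;
}
```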
Marco Paniconi 7b9984b386 vp9-svc: Add logic to enable for constrained framedrop.
Add the logic for the constrained framedrop mode for SVC.

Add test case in datarate unittests.
Also lower target bitrates in the tests to better test
frame dropper.

Change-Id: I8ee1b8cb56d835c233ad1fbe0fc1456cb2e7291f
2018-03-27 15:02:12 -07:00
Johann Koenig 239511fad8 Merge "third_part/googletest: update to release-1.8.0-742-g7857975" 2018-03-27 01:08:12 +00:00
Marco Paniconi 223f9e3671 vp9-svc: Allow for setting frame drop thresholds per layer.
Add encoder control to set the frame drop thresholds per
spatial layer, and add a frame drop mode: 0 = per-layer drop,
and 1 = constrained drop mode (a drop on a given layer forces
drops to all upper layers).

Default is mode 0 (per-layer dropping).
Implementation for mode 1 will come in subsequent change.

If the control is not used, then the spatial layer frame
drop thresholds (water mark) are all equal and set to the value
given by the encoder config (oxcf->drop_frames_water_mark).

Bump up the ABI version.

Change-Id: Id038d4181b86fa98b3d44d026f96d5f344d81629
2018-03-26 13:56:55 -07:00
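Editorial sketch for 223f9e3671: the difference between the two frame drop modes described above, expressed over per-layer drop decisions. The enum and function names are hypothetical; only the mode numbering (0 = per-layer, 1 = constrained) and the rule that a drop forces drops on all upper layers come from the commit message.

```c
/* Hypothetical names for the two modes described in the commit. */
enum { DROP_PER_LAYER = 0, DROP_CONSTRAINED = 1 };

/* drop[i] holds the per-layer decision for spatial layer i (non-zero
 * means drop). In constrained mode, the first dropped layer forces all
 * upper layers to be dropped as well; per-layer mode leaves the
 * decisions untouched. */
static void apply_drop_mode(int mode, int num_layers, int drop[]) {
  int forced = 0;
  int i;
  if (mode != DROP_CONSTRAINED) return;
  for (i = 0; i < num_layers; ++i) {
    if (forced) {
      drop[i] = 1;
    } else if (drop[i]) {
      forced = 1;
    }
  }
}
```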
Johann aa5e9c6a7c third_part/googletest: update to release-1.8.0-742-g7857975
Address std::tr1::tuple warnings:
https://github.com/google/googletest/issues/1111

The unsigned overflow fix has been superseded by:
https://github.com/google/googletest/pull/1180

Change-Id: I92dc0ba08a4d0d63f5e5b2da7b64f4a4642ed9ab
2018-03-26 11:29:11 -07:00
Johann 9037a05041 msvs build: only fix_file_list when it is broken
Clears a warning when generating VS project files with older versions of
bash:
declare: -n: invalid option

Change-Id: Id0c0bc17dc5a1599f7d2d73e3cc9259a45540f3f
2018-03-26 09:39:24 -07:00
Alexandra Hájková 233aef07f4 ppc: Add vpx_iwht4x4_16_add_vsx
Change-Id: I47336ff6dcb9d8cd54c055ec201c87f62e269eb6
2018-03-24 16:52:25 +00:00
Paul Wilkins 2b800d9394 Merge "Adjustment to initial q range estimate and kf boost." 2018-03-24 11:27:12 +00:00
Martin Storsjö f4b1eca53e Merge "Restore emms usage on x86_64 after 726b021a12c1b" 2018-03-23 22:04:43 +00:00
Martin Storsjo a6fdfda44c Restore emms usage on x86_64 after 726b021a12
Even on x86_64, emms has to be called if the x87 state has
been clobbered - the calling code (either within libvpx or
in a caller outside of libvpx) may be using the x87 instructions,
even though use of them isn't all that common on x86_64.

This fixes builds with clang for mingw/x86_64.

Change-Id: I1f6072835590b862bad156f17331ba65c813ddd9
2018-03-23 19:57:46 +00:00
Martin Storsjö 343ef23db0 Merge "thumb: Remove a brittle, ugly and unused arm->thumb conversion" 2018-03-23 19:54:01 +00:00
Martin Storsjö 9c9de8a8ce Merge changes from topic "llvm-mingw"
* changes:
  configure: Add an arm64-win64-gcc target
  test: Check for ARCH_X86_64 in addition to _WIN64
  configure: Add an armv7-win32-gcc target
  ads2gas: Add a -noelf option
2018-03-23 19:49:40 +00:00
Johann Koenig 99e1784525 Revert "remove fldcw/fstcw from Win64 builds"
This reverts commit 60a3cb9ad8.

Reason for revert: x87 instruction usage might not be as
clear cut as I would like. At the very least, llvm mingw
builds appear to be having issues with emms.

Original change's description:
> remove fldcw/fstcw from Win64 builds
>
> _MCW_PC (Precision control) is not supported on x64:
> https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/control87-controlfp-control87-2
>
> The x87 FPU is not used on Win64 or ARM so setting the x87 control word
> is not necessary. The SSE/SSE2 and ARM FPUs don't have a precision
> control - the precision is embedded in each instruction - so the need to
> set the control word is also gone.

BUG=webm:1500

Change-Id: I25bcfa96bc9c860f6c7e03315d75fa6fd1d88ec5
2018-03-23 11:09:15 -07:00
Johann Koenig 1000e07609 Merge "remove fldcw/fstcw from Win64 builds" 2018-03-23 13:23:58 +00:00
Johann 60a3cb9ad8 remove fldcw/fstcw from Win64 builds
_MCW_PC (Precision control) is not supported on x64:
https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/control87-controlfp-control87-2

The x87 FPU is not used on Win64 or ARM so setting the x87 control word
is not necessary. The SSE/SSE2 and ARM FPUs don't have a precision
control - the precision is embedded in each instruction - so the need to
set the control word is also gone.

BUG=webm:1500

Change-Id: I014513282a7dc320d1cdeaec48249d98a66bf09f
2018-03-23 06:22:39 -07:00
Martin Storsjo 8af243cfba configure: Add an arm64-win64-gcc target
This configuration doesn't require any extra custom settings, since
it only uses neon intrinsics that are handled automatically by the
compiler (no external assembly).

Change-Id: I35415c68f483a430c0672e060a7bbd09a3469512
2018-03-23 13:42:01 +02:00
Martin Storsjo e9ea629b13 test: Check for ARCH_X86_64 in addition to _WIN64
_WIN64 is also defined when targeting windows on aarch64.

Change-Id: I42b84e14079c19d0ba9362a06d8c6e7287644373
2018-03-23 13:41:35 +02:00
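Editorial sketch for e9ea629b13: because _WIN64 is defined for every 64-bit Windows target, including aarch64, an x86_64-specific code path needs both conditions. ARCH_X86_64 is normally supplied by the generated vpx_config.h; the fallback definition and the IS_WIN_X86_64 macro below are assumptions made so the snippet compiles on its own.

```c
/* Assumption: ARCH_X86_64 normally comes from vpx_config.h as 0 or 1.
 * Default it to 0 here so the sketch is self-contained. */
#ifndef ARCH_X86_64
#define ARCH_X86_64 0
#endif

/* Hypothetical macro: true only for 64-bit Windows on x86_64, not for
 * Windows on aarch64, where _WIN64 is also defined. */
#if defined(_WIN64) && ARCH_X86_64
#define IS_WIN_X86_64 1
#else
#define IS_WIN_X86_64 0
#endif

static int is_win_x86_64(void) { return IS_WIN_X86_64; }
```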
Martin Storsjo b08a0b07d4 configure: Add an armv7-win32-gcc target
This builds for windows on arm, with llvm-mingw. The target triplet
is named -gcc since that's how similar existing targets are named,
even though it technically runs clang (via frontends named
"$CROSS-gcc").

Assemble using $CC -c since there's no standalone assembler
available (except perhaps llvm-mc).

Change-Id: I2c9a319730afef73f811bad79f488dcdc244ab0d
2018-03-23 13:25:35 +02:00
Martin Storsjo 2cc5c8f97c ads2gas: Add a -noelf option
This allows skipping elf specific features from the output.

Change-Id: I739299ba41286ca10415e056b4ffd561be5e0350
2018-03-23 13:23:16 +02:00
Martin Storsjo 3abdb1ca65 thumb: Remove a brittle, ugly and unused arm->thumb conversion
The relevant code that this conversion handled was removed in
c26a9ecaa2.

Change-Id: Iee40f95134e609c291c7c4e06bc50dcb895bc5e3
2018-03-23 13:16:57 +02:00
James Zern c13aaf7a3e Merge changes Ied91c7ef,If2dcc6e2,Ib7397e71,Ib6392c79
* changes:
  Fix implicit-fallthrough warnings
  Fix dangling-else warnings
  Fix a strict-overflow warning
  Rename several static NEON iht functions
2018-03-23 03:56:50 +00:00
Jerome Jiang 4bff5bca92 Merge "vp9 svc frame drop: enable adaptive rd for row mt." 2018-03-23 01:41:21 +00:00
Jerome Jiang 1ae97b4a4d vp9 svc frame drop: enable adaptive rd for row mt.
adaptive_rd_threshold_mt is set to 1 when speed >= 7 for SVC.
QVGA in SVC uses speed 5, which sets adaptive_rd_threshold_mt to 0.
If VGA or HD is dropped for the last super frame, the flag is still 0
when the encoder is destroyed. Thus memory won't be released.

Change the bitrate threshold in datarate test.

Change-Id: I55352cc0b030568d38eb735d99c2fa29058d3690
2018-03-22 17:11:05 -07:00
Linfeng Zhang 920f4ab8f8 Fix implicit-fallthrough warnings
Compiler -- gcc (Debian 7.3.0-5) 7.3.0

Change-Id: Ied91c7ef3d25c3ef44a1f667656176e2709b4f44
2018-03-22 11:44:05 -07:00
Linfeng Zhang c101a5f5c4 Fix dangling-else warnings
Compiler -- gcc (Debian 7.3.0-5) 7.3.0

Change-Id: If2dcc6e215a2990cde575f0e744ce0c7a44a15f1
2018-03-22 11:44:01 -07:00
Linfeng Zhang 5192ce92b8 Fix a strict-overflow warning
Compiler -- gcc (Debian 7.3.0-5) 7.3.0

./libvpx/vp9/encoder/vp9_denoiser.c:374:9: assuming signed overflow
does not occur when assuming that (X + c) < X is always false
[-Wstrict-overflow]
         for (j = 0; j < xmis; j++) {

Change-Id: Ib7397e718ff717bdabc088fc4c6e1771381fb522
2018-03-22 11:42:25 -07:00
Linfeng Zhang a09acf7e19 Rename several static NEON iht functions
Change-Id: Ib6392c79d0269a43dbe180a89f2571482d98844d
2018-03-22 11:13:12 -07:00
Marco Paniconi 3cb9c5ffe9 vp9-svc: Fix to sample encoder
Get the correct computation of number of input
layers to account for frame drops.

Change-Id: I39637381e1981b53c930da67a5c525191de6907d
2018-03-21 16:47:59 -07:00
James Zern 62c4747532 Merge "vp9_highbd_iht8x8_add_neon: rm unused functions" 2018-03-21 17:52:47 +00:00
Jerome Jiang 1f82e06122 VP9 SVC: Add control to disable inter layer prediction.
Add VP9E_SET_SVC_INTER_LAYER_PRED to disable inter layer (spatial)
prediction.
0: prediction on
1: prediction off for all frames
2: prediction off for non key frames

Bump up ABI version.

Change-Id: I5ab2a96b47e6bef202290fe726bed5f99bd4951f
2018-03-20 19:28:12 -07:00
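The three values of VP9E_SET_SVC_INTER_LAYER_PRED listed above can be read as a small decision table. The following is a minimal sketch of that table only; the enum and function names are hypothetical and are not taken from the libvpx sources. An application would presumably pass the chosen value through vpx_codec_control().

```c
#include <assert.h>

/* Hypothetical names for the three values described in the commit message. */
enum {
  SKETCH_PRED_ON = 0,         /* 0: prediction on */
  SKETCH_PRED_OFF = 1,        /* 1: prediction off for all frames */
  SKETCH_PRED_OFF_NONKEY = 2  /* 2: prediction off for non key frames */
};

/* Returns 1 when inter layer (spatial) prediction is allowed for a frame. */
int sketch_use_inter_layer_pred(int mode, int is_key_frame) {
  assert(mode >= SKETCH_PRED_ON && mode <= SKETCH_PRED_OFF_NONKEY);
  switch (mode) {
    case SKETCH_PRED_OFF: return 0;
    case SKETCH_PRED_OFF_NONKEY: return is_key_frame != 0;
    default: return 1;
  }
}
```

With mode 2, only key frames keep inter layer prediction; every other frame is coded without it.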
Marco Paniconi 4a20caef78 Merge "vp9-svc: Improve frame dropper for spatial layers." 2018-03-20 21:46:36 +00:00
Marco Paniconi 126a3718fc vp9-svc: Improve frame dropper for spatial layers.
SVC frame dropper: modify the logic to allow for individual
spatial layers to drop. This removes the constraint that all
upper spatial layers must drop when a given spatial layer drops.

Add a flag to the pkt to indicate whether a spatial layer is
encoded or dropped. This is needed for applications that enable
this feature (frame dropping for SVC).

For a current spatial layer, if its previous spatial layer is
dropped, then disable certain features for that layer:
inter-layer prediction, base_mv, partition_reuse, copy partition.

Also add the constraint to never drop a spatial layer if its
base layer is a key frame.

Updates to sample encoder (vp9_spatial_svc_encoder) and the
SVC datarate unittests to properly handle frame dropping.

Bump up ABI version.
Change-Id: I7d14ccf67b8d014a7abfce5ba3989fc623e94067
2018-03-20 10:34:45 -07:00
Johann 7a9a46fb53 visual studio: add yasm instructions
Change-Id: Ied2a1fc0e1d1a2245d63f267669add8889dd0cec
2018-03-20 10:08:45 -07:00
Johann Koenig 8b263798d0 Merge "x86 android: default on realtime-only" 2018-03-20 17:06:18 +00:00
Johann Koenig c5630b9e6c Merge "build: remove stale .git files" 2018-03-20 17:06:05 +00:00
Johann Koenig 52da0428be Merge "reland "use intrinsics for 'emms'"" 2018-03-20 17:05:17 +00:00
Johann 907e966adf build: remove stale .git files
These were used for an older style of Visual Studio configurations.

Change-Id: I51f07b30ad51c4da0c5caf1ede36cdb69b2d2b19
2018-03-19 17:36:38 -07:00
Johann 726b021a12 reland "use intrinsics for 'emms'"
Only target 32bit builds. Visual Studio does not define _mm_empty for
64bit configurations.

Rename emms.asm and remove from 32 bit builds to avoid empty file
warnings.

Don't check register state on 64bit builds.

BUG=webm:1500

This reverts commit 60beb781c1.

Change-Id: I5ac4cf6c67249ff24f7da19792144de20527bfce
2018-03-19 17:11:55 -07:00
Jerome Jiang 4038632fab Merge "VP8: Fix out of range index for mvcost." 2018-03-19 19:00:55 +00:00
James Zern 0dad4e2997 vp9_highbd_iht8x8_add_neon: rm unused functions
their use was removed in:
d8424d289 Fix a bug in vp9_highbd_iht8x8_64_add_neon

Change-Id: I041800f3fb34ffbb7cfa7401370c5a5ceeab01c6
2018-03-18 18:24:41 -07:00
James Zern 0f9521f0a8 CopyFrameTest: reduce max size for 32-bit targets
avoids potential OOM when allocating 3 buffers for 16383x16383; 3840 is
used as a replacement
this test was missed in:
215bddf32 vpx_scale_test: reduce max size for 32-bit targets

Change-Id: I515adf5999c6ef1724394ccd62d677134bd35e6d
2018-03-18 15:15:07 -07:00
Jerome Jiang ca28740570 VP8: Fix out of range index for mvcost.
Clamp index between 0 and MVvals.

Bit exact for speed -8, -6 and -4 on RTC set.

BUG=b/72510002

Change-Id: I61bdb02a0924e157b3c1980f74fbbfe5ce51bc44
2018-03-17 04:04:10 +00:00
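The clamp described above can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: SKETCH_MVVALS is a hypothetical stand-in for the encoder's MVvals, and the helper names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table bound; the real MVvals is defined by the VP8 encoder. */
#define SKETCH_MVVALS 2047

/* Clamp an index into the inclusive range [0, SKETCH_MVVALS]. */
int sketch_clamp_mv_index(int index) {
  if (index < 0) return 0;
  if (index > SKETCH_MVVALS) return SKETCH_MVVALS;
  return index;
}

/* Look up a cost with the clamped index so the read stays in bounds.
 * The table is assumed to hold SKETCH_MVVALS + 1 entries. */
int sketch_mv_cost(const int *cost_table, int index) {
  assert(cost_table != NULL);
  return cost_table[sketch_clamp_mv_index(index)];
}
```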
James Zern 215bddf324 vpx_scale_test: reduce max size for 32-bit targets
avoids potential OOM when allocating 3 buffers for 16383x16383; 3840 is
used as a replacement

Change-Id: I92116ab69b10db6820fc651d3626bd9699700208
2018-03-16 17:47:56 -07:00
paulwilkins da7a7089c6 Adjustment to initial q range estimate and kf boost.
Adjustment to initial active based on image size.
Add extra breakout case for kf boost loop.
Small adjustment to q delta calculation for key frames.

Net % improvements for all standard tests sets (-ve values) measured
using c-vbr mode.

(Overall PSNR, SSIM, PSNR-HVS)
Low Res:   -0.223  -0.229  -0.107
Mid Res:   -0.175   0.008  -0.180
High Res:  -0.131   0.106  -0.206
NFlix 2K:  -0.390  -0.271  -0.489
NFlix 4K:  -1.370  -0.825  -1.590

Change-Id: I06a39de43594e1a99bb0cb281af15cdb8058a8ed
2018-03-16 12:24:54 +00:00
Marco Paniconi 2640f25072 vp9-svc: Frame dropper for SVC.
If a given spatial layer decides to drop, due to the
buffer/overshoot conditions for that layer, then drop
that current spatial layer and all spatial layers above.

In the current implementation the svc frame counter
(and hence the pattern for the non-flexible SVC case)
are updated on frame drops.

Also add last spatial layer encoded to the pkt.
This is useful for RTC applications that enable
frame dropping for SVC.

Update to the SVC datarate tests:
enabled frame dropper on all SVC datarate tests, and
made a fix to properly set the temporal_layer_id, which
works now even on frame drops.

Change-Id: If828c193f3cb6b1839803fd52fe9fbbda5b5a039
2018-03-15 21:04:59 -07:00
James Zern d07a5bfbf8 Merge "libs.mk,vcxproj generation: split srcs in invocation" 2018-03-16 03:52:02 +00:00
James Zern b986ba84b3 Merge "Revert "vp9_loopfilter.c: zero lfl_uv"" 2018-03-16 02:11:52 +00:00
James Zern e97804c67d Revert "vp9_loopfilter.c: zero lfl_uv"
This reverts commit 13d0955b25.

Reason for revert:
this should be investigated further to ensure the memset is really
necessary outside of the static analysis pass.

Original change's description:
> vp9_loopfilter.c: zero lfl_uv
> 
> The initialization depends on cm and mi_row which static
> analysis does not approve of.
> 
> Clears a static analysis warning:
> warning: The right operand of '+' is a garbage value
> const loop_filter_thresh *lfi = lfthr + *lfl;
> 
> Change-Id: I8c863ced2b1e9a7e10103b7281098f20941a6ca2

TBR=johannkoenig@google.com,marpan@google.com,builds@webmproject.org,jianj@google.com

Change-Id: Icadb6438fbcddba747622f06f2eadebdb333edf6
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2018-03-15 21:43:29 +00:00
Johann Koenig b24e377008 Merge changes I5501d0d6,I8c863ced,I19895d06,Ifa39353d,I09bd209b
* changes:
  vp9_resize.c: assert vp9_highbd_resize_plane conditions
  vp9_loopfilter.c: zero lfl_uv
  vp8 rdopt.c: zero rd.[rate_uv|distortion_uv]
  vp8 mfqe: zero map[]
  temporal svc: zero layer_target_bitrate
2018-03-15 21:16:43 +00:00
Johann Koenig f3adb45914 Merge "remove spatial svc experiment" 2018-03-15 19:26:16 +00:00
Linfeng Zhang 5dfb01bb30 Merge "Add vp9_highbd_iht16x16_256_add_neon()" 2018-03-15 19:21:36 +00:00
James Zern 944e83ad32 libs.mk,vcxproj generation: split srcs in invocation
this avoids truncation under mingw which would result in link failures

BUG=webm:1434

Change-Id: I6eb45d94f02966532b3cdf02860a5bf2e5d3efef
2018-03-15 11:58:54 -07:00
Linfeng Zhang ddb3d7a8a1 Merge changes I9e0bf2c7,I695b4090
* changes:
  Fix a bug in vp9_highbd_iht8x8_64_add_neon
  Fix a bug in vp9_highbd_iht4x4_16_add_neon()
2018-03-15 18:05:08 +00:00
Marco Paniconi 881c8ec816 vp9-svc: Bugfix to dynamic enabling/disabling of layers.
Fix a bug when middle and top spatial layer are skip encoded
(disabled) and then re-enabled again, during the sequence.

Issue is that pending_frame_count in the packing may
be incremented on middle layer, even though that layer is skipped
(not encoded and hence zero size). Fix is to add size check.

Modified existing unittest to reproduce the issue.

Change-Id: I86d806a112d468e06b04fbf7c46ae07db9e0ad93
2018-03-14 17:31:59 -07:00
Johann a3f09c03b8 vp9_resize.c: assert vp9_highbd_resize_plane conditions
Clears static analysis warnings similar to the low bitdepth version:
commit c4367b9b51
Author: James Zern <jzern@google.com>
Date:   Wed Mar 18 14:34:30 2015 -0700

    vp9_resize_plane: quiet some static analysis warnings

Change-Id: I5501d0d6ad7c7720d746d53ec07078cb9051d0d7
2018-03-14 16:42:14 -07:00
Johann 0e97e70496 remove spatial svc experiment
Change-Id: Ifda11caaf992d10f2d93d6cd1d07b79b6047be05
2018-03-14 22:00:28 +00:00
Johann Koenig 5c5dc73320 Merge "spatial svc: set window_size to 15" 2018-03-14 22:00:08 +00:00
Johann 13d0955b25 vp9_loopfilter.c: zero lfl_uv
The initialization depends on cm and mi_row which static
analysis does not approve of.

Clears a static analysis warning:
warning: The right operand of '+' is a garbage value
const loop_filter_thresh *lfi = lfthr + *lfl;

Change-Id: I8c863ced2b1e9a7e10103b7281098f20941a6ca2
2018-03-14 14:22:08 -07:00
Johann f9a13a1786 vp8 rdopt.c: zero rd.[rate_uv|distortion_uv]
These values are not consistently set before calling update_best_mode.

In vp9_rdopt.c they are individual values instead of a struct and are
zero'd at declaration.

Clears a static analysis warning:
warning: The right operand of '-' is a garbage value
RDCOST(x->rdmult, x->rddiv, (rd->rate2 - rd->rate_uv - other_cost),

warning: The right operand of '-' is a garbage value
(rd->distortion2 - rd->distortion_uv));

Change-Id: I19895d062e7c0ac67937126ebc5dcb0afd3a2931
2018-03-14 13:56:39 -07:00
Johann 1f41c0d37f vp8 mfqe: zero map[]
The loop appears to set map[i] with the intention of running
the 'j' loop up to that point. However, without zero'ing map[]
first the behavior is unpredictable.

Fixes a static analysis warning:
warning: Branch condition evaluates to a garbage value
for (j = 0; j < 4 && map[j]; ++j) {

Change-Id: Ifa39353d8aa5cc47b467a7d3d8cdd3b5319fd997
2018-03-14 13:56:39 -07:00
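The warning above comes from a loop whose exit condition reads map[j] before every entry has been written. A minimal sketch of the zero-first pattern follows; the function names are hypothetical and the fill logic is simplified for illustration, so it is not the mfqe code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Counts leading nonzero entries, using the loop shape quoted in the warning. */
int sketch_count_leading_set(const int map[4]) {
  int j;
  assert(map != NULL);
  for (j = 0; j < 4 && map[j]; ++j) {
  }
  return j;
}

/* Zero the whole array first, then mark only the first n entries.
 * Without the initializer, entries n..3 would hold garbage values. */
int sketch_fill_and_count(int n) {
  int map[4] = { 0 };
  int i;
  assert(n >= 0 && n <= 4);
  for (i = 0; i < n; ++i) map[i] = 1;
  return sketch_count_leading_set(map);
}
```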
Johann 8276005eba temporal svc: zero layer_target_bitrate
These values are set in main() from user input. Ensure
they are cleared out first.

Clears a static analysis warning:
warning: The right operand of '*' is a garbage value
1000.0 * rc->layer_target_bitrate[0] / rc->layer_framerate[0];

Change-Id: I09bd209be5aff31b87597a24d37a9673fa99381b
2018-03-14 13:56:39 -07:00
Johann Koenig 60beb781c1 Revert "use intrinsics for 'emms'"
This reverts commit 118a57045b.

Reason for revert: Fails on Visual Studio builds:

vpxmdd.lib(vpx_ports_emms_mmx.obj) : error LNK2019: unresolved
external symbol _m_empty referenced in function
vpx_clear_system_state

Original change's description:
> use intrinsics for 'emms'
> 
> BUG=webm:1500
> 
> Change-Id: I3235d8c2abc01dd3a35e14c5cbcfe20283ff8fb2

Change-Id: Ia9c40bc103c57cced83353249c55218eaf2f0b0c
2018-03-14 20:26:44 +00:00
Marco Paniconi ae856e4012 Merge "vp9-svc: Fix to update layer counters when layer is skipped." 2018-03-14 20:13:34 +00:00
Johann 6b2cc75622 spatial svc: set window_size to 15
Static analysis does not recognize that output_rc_stat guards
the usage of window_size. Clears this warning:
The right operand of '>' is a garbage value
if (frame_cnt > (unsigned int)rc.window_size) {

set_rate_control_stats sets window_size to 15. Zeroing it
just introduces another static analysis warning.

Change-Id: Ieee7b81a385f986e42189101cfa39279e519b368
2018-03-14 19:30:38 +00:00
Marco Paniconi f50ad31ec1 vp9-svc: Fix to update layer counters when layer is skipped.
Update layer counters when layer is skipped,
for any spatial layer.

Change-Id: Ie37c4a16ccafdef3390b651dec473beb5d926896
2018-03-14 11:51:01 -07:00
Johann f0a3979063 spatial svc: zero sizes
This should be taken care of by parse_superframe_index but
the static analysis is not recognizing it because it depends
on 'marker' which is read from the bitstream.

Clears a static analysis warning:
The right operand of '*' is a garbage value
rc.layer_encoding_bitrate[layer] += 8.0 * sizes[sl];

Change-Id: I8ee48a98f907bc7b46869fd27a351f33e2e7de71
2018-03-13 18:22:50 -07:00
Johann b0d57f682d spatial svc: remove vpx_svc_get_message
Print error messages as they are encountered. This was the default
behavior.

Removes a static analysis warning regarding the use of strncat:
Null pointer argument in call to string length function

As this is the only use of strncat in the library, remove it and the
associated public function.

Change-Id: Id55305c5a4d65f11da88c3a2203ff824200f526f
2018-03-13 17:58:24 -07:00
Linfeng Zhang 9351f96069 Add vp9_highbd_iht16x16_256_add_neon()
BUG=webm:1403

Change-Id: I2293c11666786be276909d48ee78dacb40a89e25
2018-03-13 17:39:23 -07:00
Linfeng Zhang d8424d2890 Fix a bug in vp9_highbd_iht8x8_64_add_neon
This bug was introduced in 29b6a30c.

BUG=webm:1403

Change-Id: I9e0bf2c7a01d8ff1c714c12236f7985b772b0540
2018-03-13 17:38:29 -07:00
Johann 6f9163db95 spatial svc: remove unused locals
Clears static analysis warning:
Value stored to 'tl' is never read

Change-Id: If047a74f508288c63d5b83ed0f3ad34f791f9312
2018-03-13 17:12:35 -07:00
Johann Koenig c257d608fc Merge "use intrinsics for 'emms'" 2018-03-13 23:51:08 +00:00
Linfeng Zhang 88dc0d6062 Fix a bug in vp9_highbd_iht4x4_16_add_neon()
This bug was introduced in 36363304.

BUG=webm:1403

Change-Id: I695b409047e41ab7e0460981524310d78753751a
2018-03-13 16:10:00 -07:00
Johann 7b278e3072 spatial svc: rescope sl
sl was passed to set_frame_flags_bypass_mode, triggering
an uninitialized variable warning. Inside the function it
is only used as a local variable.

Change-Id: If743626e9e10fd41d135e3b4ad6196dc4dc90172
2018-03-13 14:38:05 -07:00
Johann 118a57045b use intrinsics for 'emms'
BUG=webm:1500

Change-Id: I3235d8c2abc01dd3a35e14c5cbcfe20283ff8fb2
2018-03-13 13:17:28 -07:00
Johann fc1302cd8b x86 android: default on realtime-only
Like the arm-based target, set realtime-only on by default.

BUG=webm:873

Change-Id: I2e04cfc43390953435e985716a25f32b8d4fadda
2018-03-12 15:20:28 -07:00
Johann 95a71057f0 autodetect macOS High Sierra
Add darwin17 target

Change-Id: I349a2f6a0396c59269f567a03ae813e3e59ccefa
2018-03-12 14:54:48 -07:00
Johann Koenig 7b5a57449b Merge "vp8 temporal_filter: ignore return value" 2018-03-12 20:56:30 +00:00
Johann 025b138679 vp8 temporal_filter: ignore return value
Clears up static clang analysis warning regarding a dead store.

Change-Id: I6a90e6fd5f2775d933c46c7553811635bd2def21
2018-03-12 19:52:11 +00:00
Marco Paniconi 312745cac4 vp9-svc: Update layer frame counters when layer is skipped.
When an enhancement spatial layer is skipped, we should check
for updating the layer frame counters.

Change-Id: Ib79d0955c62fb465f59ef2f9ac45240ae2614d7b
2018-03-12 12:21:10 -07:00
Marco Paniconi 3223a3f892 vp9-SVC: Fix to choose_partition when LAST ref is NULL.
This causes assert to trigger in choose_partitioning().
This can happen in some cases when enhancement layers
are enabled midway during the stream.

Change-Id: I69c3c8b4b1e3f1c7d8d7294d633ca5ddca148e8b
2018-03-12 09:13:25 -07:00
James Zern c12398fe81 Merge "vpx_scale_test: add w/h output to alloc failure" 2018-03-08 21:13:09 +00:00
Paul Wilkins 7987f35391 Merge "Change to KF frame boost calculation." 2018-03-08 13:34:15 +00:00
James Zern 6ed4c253a9 vpx_scale_test: add w/h output to alloc failure
Change-Id: Ib5df91d9fcd7fe973a2f7d8e73a204259beddc07
2018-03-07 23:07:51 -08:00
James Zern 0bee6de332 Merge changes If96fd6f1,I84b27337
* changes:
  Fix a bug in vp9_iht16x16_256_add_neon()
  Fix a bug in vp9_iht8x8_64_add_neon()
2018-03-08 03:41:27 +00:00
Johann 7195a2eaa4 add worst-case frame size cap
The largest frame is currently in choose_partitioning:
warning: stack frame size of 44156 bytes in function 'choose_partitioning'

but adding HBD amplifies other things:
warning: stack frame size of 51480 bytes in function 'dec_build_inter_predictors'

Add some padding for sanitizer and variances between compilers.

BUG=webm:1498

Change-Id: I0d94d4f94d25dafafca9d7484881c2ce5f8de371
2018-03-05 20:09:43 -08:00
Linfeng Zhang d18477b449 Fix a bug in vp9_iht16x16_256_add_neon()
This bug was introduced in 88c23864.

BUG=webm:1403

Change-Id: If96fd6f102be6b9bda866e55e574257287746f4a
2018-03-05 16:14:35 -08:00
Linfeng Zhang c244a86234 Fix a bug in vp9_iht8x8_64_add_neon()
This bug was introduced in b14b616d.

BUG=webm:1403

Change-Id: I84b2733734982e52b66548850d61758c772b5494
2018-03-05 15:33:37 -08:00
Johann e0b88b5c00 move vp8 encodeopt to block_error_sse2
The file contains sse2 implementations related to various block error
functions. Update the .mk file to include it only when sse2 is
requested.

BUG=webm:1500

Change-Id: I67b766faed425fd7a96db8541b13c69670b65fec
2018-03-04 12:38:07 -08:00
James Zern c6fcb9bb94 disable vp9_highbd_iht{4x4_16,8x8_64}_add_neon
these cause test vector failures

BUG=webm:1403

Change-Id: I08218f0bf26651eb367ece4feec6d704e0189bd8
2018-03-03 14:14:30 -08:00
James Zern 0685ec767c disable vp9_iht8x8_64_add_neon
this causes test vector failures

BUG=webm:1403

Change-Id: I7d37a05fbf4641ea352c947053aa4eaeb7f5c318
2018-03-03 14:14:12 -08:00
James Zern ac07cc89f1 disable vp9_iht16x16_256_add_neon
this causes test vector failures

BUG=webm:1403

Change-Id: Ifdb5b270c5cc70be5689e4fbda2ada3724cc65c3
2018-03-03 12:58:24 -08:00
Marco Paniconi 5ac63d15dc vp9-svc: Disable partition_reuse unless 2x2 scale.
For SVC, if any of the layer scale ratios are not
2x2, then disable the partition_reuse, which assumes
2x2 scaling between layers.

Change-Id: I8b3163de0826052bbb1bfe03554a074c89510558
2018-03-02 10:56:16 -08:00
Marco Paniconi 4d8958d8dd vp9-svc: Fix to downsampling filter phase_shift.
Set phase_shift = 0 if the scale factors are
above 3/4. Removes artifact for scale factors
close to 1.

phase_shift = 8 is to get an averaging filter
(decimated pixel aligns to 8/16, midway between source pixels),
and only makes sense for scale factors multiples of
2 (1/2, 1/4,...).

Removes artifact for high scaling ratios.

Change-Id: Id0a85869d6c6156dda0032c697ded2de78fad6bd
2018-03-02 08:55:18 -08:00
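The phase_shift rule above can be sketched with integer arithmetic. This is only an illustration: the function name is hypothetical, and falling back to 8 for every ratio at or below 3/4 is an assumption, since the message only states the behavior above 3/4.

```c
#include <assert.h>

/* Returns 0 when scaled_size / src_size is above 3/4, otherwise 8.
 * A phase of 8 (in 1/16 pixel units) places the decimated pixel midway
 * between two source pixels, which gives an averaging filter. */
int sketch_choose_phase_shift(int src_size, int scaled_size) {
  assert(src_size > 0 && scaled_size > 0);
  return (4 * scaled_size > 3 * src_size) ? 0 : 8;
}
```

Comparing 4 * scaled_size against 3 * src_size avoids floating point; a ratio of exactly 3/4 is not "above" and so keeps the phase of 8 in this sketch.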
James Zern 6cc33c1626 iadst16x16_256_add_half1d: fix array size
t[] is indexed from 0..11

Change-Id: I7d0021f1795c6608354c8770843ea9dfdea66f97
2018-02-28 23:49:39 -08:00
paulwilkins 29fbddec83 Change to KF frame boost calculation.
This change is targeted mainly at higher resolutions where typically
the average error per MB is much smaller. Hence this patch replaces
a fixed error per MB factor with a tiered value.

It also adds in a fixed offset value that acts as a minimum return score.

Note also minor fix to debug stats output.

The results are overall beneficial (-ve) on our test sets, most notably for
higher definition formats (see below - overall psnr, ssim, psnr hvs)

low res:   0.184  -0.262  -0.166
mid res:   0.094   0.075   0.049
hd res:   -0.752  -0.300  -0.800
NF 2K:    -0.353   1.095  -0.302
NF 4K:    -1.245  -0.578  -1.205

The most notable negative case is pierseaside 2K which appears to be worse by
8-10% (which has a big impact on the overall gain for the NF 2K set). Closer
inspection reveals that the drop does not relate to the key frame boost
per se as in both cases the key frame substantially undershoots its target. Rather
this is a side effect relating to the initial Q range allowed for the key frame and
a poor initial complexity estimate. This will hopefully be improved in a later
patch.

Change-Id: I4773ebe554782f4024c047c3c392c763a3fe843b
2018-02-28 20:56:53 +00:00
Linfeng Zhang 932835677f Merge "Add vp9_iht16x16_256_add_neon()" 2018-02-28 18:26:39 +00:00
James Zern 7aa588debd Merge "datarate_test: correct last_pts_ref_ type" 2018-02-27 22:15:05 +00:00
Linfeng Zhang 88c2386447 Add vp9_iht16x16_256_add_neon()
BUG=webm:1403

Change-Id: I1413cc3dfcb62143ba04fe9b0f8d8b010fdf69b6
2018-02-27 10:13:20 -08:00
James Zern 09ce3177bb datarate_test: correct last_pts_ref_ type
use vpx_codec_pts_t to match last_pts_; this quiets a conversion warning
under visual studio

Change-Id: I3f1c146fc13f2edfb515d76730a9ef063846bf69
2018-02-26 23:03:02 -08:00
Linfeng Zhang 3c6dc743aa Fix a bug in create_s16x4_neon()
This bug is exposed when the 2nd argument is negative, in which case
the upper 32 bits would be all 1s.

Change-Id: I189ee8cd3753fde00a34847e7a37cde2caa4ba72
2018-02-26 17:49:24 -08:00
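A sign-extension bug of this kind can be reproduced without NEON by packing four 16-bit lanes into a 64-bit word. The sketch below shows one plausible shape of the problem and a fix; it is not the original create_s16x4_neon() code, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy shape: the low half is built as a signed 32-bit value, so a
 * negative c1 makes it negative and widening copies the sign bit into
 * the upper 32 bits. */
uint64_t sketch_pack_s16x4_buggy(int16_t c0, int16_t c1, int16_t c2,
                                 int16_t c3) {
  const int32_t lo = (int32_t)(uint16_t)c0 | ((int32_t)c1 * 65536);
  const uint64_t hi =
      ((uint64_t)(uint16_t)c2 << 32) | ((uint64_t)(uint16_t)c3 << 48);
  return (uint64_t)(int64_t)lo | hi;
}

/* Fixed shape: every lane is reduced to 16 unsigned bits before widening. */
uint64_t sketch_pack_s16x4(int16_t c0, int16_t c1, int16_t c2, int16_t c3) {
  return (uint64_t)(uint16_t)c0 | ((uint64_t)(uint16_t)c1 << 16) |
         ((uint64_t)(uint16_t)c2 << 32) | ((uint64_t)(uint16_t)c3 << 48);
}

/* The two versions agree for a non-negative c1 and differ for a negative one. */
void sketch_pack_self_check(void) {
  assert(sketch_pack_s16x4(1, 2, 3, 4) == sketch_pack_s16x4_buggy(1, 2, 3, 4));
  assert(sketch_pack_s16x4(1, -1, 2, 3) !=
         sketch_pack_s16x4_buggy(1, -1, 2, 3));
}
```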
Linfeng Zhang 8de0404ed9 Merge "Clean test/dct_test.cc with testing::Combine" 2018-02-24 01:24:32 +00:00
Linfeng Zhang 90d54a15fb Clean test/dct_test.cc with testing::Combine
Change-Id: I910fd34e4a06a73568b597ccb194c8395c2e6d08
2018-02-23 15:54:47 -08:00
Linfeng Zhang 167594414f Merge "Add vp9_highbd_iht8x8_16_add_neon()" 2018-02-23 01:42:59 +00:00
Jerome Jiang acac262663 Merge "VP9 SVC: Datarate test for dynamic bitrate change." 2018-02-23 00:21:56 +00:00
Jerome Jiang e3c6d30294 VP9 SVC: Datarate test for dynamic bitrate change.
Change-Id: Ie1cd990dcb19a4cc18de4a2e487791f399c4b3cb
2018-02-22 15:05:03 -08:00
Kyle Siefring dccb8b45bb Merge "Fold adds in 16->32-bit converts in SSE2/AVX2 fDCT" 2018-02-21 23:12:07 +00:00
Linfeng Zhang 29b6a30cd9 Add vp9_highbd_iht8x8_16_add_neon()
BUG=webm:1403

Change-Id: I11efb652f1aee371c71eee2d29e33793e4736832
2018-02-20 17:21:31 -08:00
Johann c1435e321c remove deprecated 'register' keyword
Will be removed in C++17:
http://en.cppreference.com/w/cpp/language/storage_duration

Change-Id: Iadce5e2b974c707799fa939f3ff1c420fb79a871
2018-02-20 14:49:02 -08:00
Jerome Jiang 93da1ba2dc Merge "VP9 ROI test clean up regarding bool type flag." 2018-02-12 23:26:07 +00:00
Jerome Jiang 1a7b256f06 vp9_cx_iface: Remove else when returning from the other branch.
Change-Id: I2fc15ec25cc5587cafc6621176d0a6d7c376fc7c
2018-02-12 11:01:29 -08:00
Jerome Jiang 03e043e06c VP9 ROI test clean up regarding bool type flag.
Clean up code to make use_roi_ flag a bool.

Change-Id: I5b606ca19f8543840259d1cc79fe3301a2a70d30
2018-02-12 10:36:14 -08:00
Kyle Siefring 811b2e412e Fold adds in 16->32-bit converts in SSE2/AVX2 fDCT
Changes in the function size in bytes (in lieu of performance metrics)
                   Before    After    Diff
vpx_fdct32x32_avx2  29564 -> 28334   -1230
vpx_fdct32x32_sse2  38053 -> 36309   -1744

Change-Id: Ie0b3e6ed7c3f2e9ea45f9d6a1ce1e27d068cee6b
2018-02-10 14:25:24 -05:00
Jerome Jiang edc9a46876 VP9 ROI: reset use_roi_ in datarate test.
Change-Id: I51765ce6c3c8e8646852c4da47b12a0198892c52
2018-02-10 08:39:43 -08:00
Jerome Jiang 4410d729d1 VP9 ROI: Fix errors in example encoder.
Fix some errors in the vpx_temporal_svc_encoder.

Change-Id: Id93f449364dcf72c826ca931df3c8c3d3b80100f
2018-02-09 14:47:00 -08:00
Jerome Jiang 11b55a0614 Merge "Reland "Add ROI support for VP9."" 2018-02-09 19:01:52 +00:00
Jerome Jiang 46adbc4af8 Reland "Add ROI support for VP9."
Extended ROI struct suitable for VP9.
ROI input from user is passed into internal struct and applied on every frame
(except key frame).

Enabled usage of all 4 VP9 segment features (delta_qp, delta_lf, skip,
ref_frame) via the ROI map input.
Made changes to nonrd_pickmode for the ref_frame feature.

Only works for realtime speed >= 5.
AQ_MODE needs to be turned off for ROI to take effect.

Change example in the sample encoder: vpx_temporal_svc_encoder.c to be suitable
for VP9.
Add datarate test.

Bump up ABI version.

BUG=webm:1470

Change-Id: I663b8c89862328646f4cc6119752b66efc5dc9ac
2018-02-09 10:55:46 -08:00
Jerome Jiang efaaf387fc Merge "Revert "Add ROI support for VP9."" 2018-02-09 18:54:55 +00:00
Jerome Jiang 62b013abe8 Revert "Add ROI support for VP9."
This reverts commit 4e5b4b5848.

Reason for revert: Commit message inaccurate.

Original change's description:
> Add ROI support for VP9.
> 
> Extended ROI struct suitable for VP9.
> ROI input from user is passed into internal struct and applied on every frame
> (except key frame).
> 
> Enabled usage of all 4 VP9 segment features (delta_qp, delta_lf, skip,
> ref_frame) via the ROI map input.
> Made changes to nonrd_pickmode for the ref_frame feature.
> 
> Only works for realtime speed >= 5.
> AQ_MODE needs to be turned off for ROI to take effect.
> 
> Change example in the sample encoder: vpx_temporal_svc_encoder.c to be suitable
> for VP9.
> Add datarate test.
> 
> Bump up ABI version.
> 
> BUG=webm:1470
> 
> Change-Id: I7e0cf6890649adb98a5fda2efb6ae1fa511c7fc9

TBR=yaowu@google.com,jzern@google.com,marpan@google.com,builds@webmproject.org,jianj@google.com

Change-Id: I000dbd81e0c67cb8a0dcde4013ee9bf7afb038f0
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: webm:1470
2018-02-09 18:53:54 +00:00
Jerome Jiang c930ea7dcd Merge "Add ROI support for VP9." 2018-02-09 16:58:55 +00:00
paulwilkins 2fa333c2ae Improved coding on slide show content.
This patch adds in detection of slide show key frame groups.
The detection assumes extremely low or 0 motion for all frames
in the key frame group.

If this case is detected the boost level is set to a very high value
and the min Q to a lower value for the key frame itself.
Alt refs and golden frames are disabled to save bits (up to a limiting
maximum interval currently set to 240 frames).

In test samples that I created, this patch gave rise to a substantial
improvement in overall psnr and a drop in data rate. In some cases the
average psnr fell, however, with the boost and minQ values set as they are.

This is to be expected because previously a relatively poor key frame
could be followed by progressively better alt refs. For example a key
frame at q7.5 but subsequent alt refs improving it to lossless. Given that
average psnr tends to be dominated by the best frames, a ramp like this
from q7.5 to lossless may give a better average psnr than, for example,
coding the entire sequence at q2.5. Overall psnr, however, will be much
better in the latter case.  The option exists to boost the key frame further
which would ensure much better results for all metrics, but at the expense
of smaller bitrate savings. Given that these samples tend to have very
good quality anyway this seems like a bad trade off.

For slides displayed for several seconds, bitrate savings of >= 20% are likely
and much larger gains are possible in some cases.

Change-Id: Ib4b61e153c55d3f2f561153da13fdb56f397a52b
2018-02-09 15:13:25 +00:00
Marco 4e5b4b5848 Add ROI support for VP9.
Extended ROI struct suitable for VP9.
ROI input from user is passed into internal struct and applied on every frame
(except key frame).

Enabled usage of all 4 VP9 segment features (delta_qp, delta_lf, skip,
ref_frame) via the ROI map input.
Made changes to nonrd_pickmode for the ref_frame feature.

Only works for realtime speed >= 5.
AQ_MODE needs to be turned off for ROI to take effect.

Change example in the sample encoder: vpx_temporal_svc_encoder.c to be suitable
for VP9.
Add datarate test.

Bump up ABI version.

BUG=webm:1470

Change-Id: I7e0cf6890649adb98a5fda2efb6ae1fa511c7fc9
2018-02-08 16:30:56 -08:00
paulwilkins b78dad3ffa Adjust MAXRATE_1080P.
This value was originally set in response to requests from the hardware
team before levels were properly defined for VP9.

Even if a level is not specified for an encode, it imposes a maximum
frame size for videos of dimensions <= 1080P.  For larger formats the
limit was set at 250 bits per MB.

This patch modifies the limit to be more in line with the requirements
specified for level 4 (max rate for a 4 frame group of 16 Mbits).  If a lower
level is specified at encode time and this mandates a smaller maximum frame
size then the level requirement will still take precedence.

Increasing this value allows for some slide shows or very low motion clips
to code a better quality key frame.

Change-Id: Ic08e0e09c8a918077152190c59732b9a1c049787
2018-02-08 12:31:09 +00:00
Paul Wilkins 1acc25f11b Merge "Fix file input pointer bug in allocate_gf_group_bits()." 2018-02-08 10:57:44 +00:00
Linfeng Zhang 0f3edc6625 Update iadst NEON functions
Use scalar multiply. No impact on clang, but improves gcc compiling.

BUG=webm:1403

Change-Id: I4922e7e033d9e93282c754754100850e232e1529
2018-02-08 07:23:55 +00:00
Linfeng Zhang d8497e1fcd Clean vp9_highbd_iht4x4_16_add_neon()
Extract common code.

Change-Id: I422150ada1c6915f0ce39b912149994eb3bb3f12
2018-02-07 10:39:52 -08:00
paulwilkins c104f4cbdc Fix file input pointer bug in allocate_gf_group_bits().
The stats input pointer, when passed in, already points to the
frame after the golden frame so should not be advanced here.

This fix has a small mostly positive effect on results in our test sets
(tested using corpus vbr settings) and gives a gain of almost 0.5%
in overall psnr (plus slightly smaller gains on other metrics) for the
4K set.

The bug also caused a crash in calculate_group_score() in another
patch which allows coding of slides in a slide show as a single
long KF group without ARFs or GFs.

Change-Id: I57a3a24baf442ce55dbc91fba05e056697c63a6f
2018-02-06 14:02:33 +00:00
Linfeng Zhang 82e9c30334 Update tx_type switch code in idct
Change-Id: Ia244bfd4b4eb9d703653792bc4f64c6f5358ae19
2018-02-05 13:42:26 -08:00
Linfeng Zhang 3636330490 Add vp9_highbd_iht4x4_16_add_neon()
BUG=webm:1403

Change-Id: Id9833e985fb70958cf4bde38f8e6303ed83c12f9
2018-02-05 13:42:16 -08:00
James Zern 0fe4371cc0 Merge "inv_txfm_vsx.c: make code c90 compatible" 2018-02-02 18:41:46 +00:00
Jerome Jiang ac54d233b6 Merge "Fix issue for 0 target bitrate in multi-res build." 2018-02-02 05:32:55 +00:00
Jerome Jiang 519fed01c2 Fix issue for 0 target bitrate in multi-res build.
When encoding with --enable-multi-res-encoding and 1 layer, a null
pointer is dereferenced if the target bitrate is set to 0. The fix is
to check cpi->oxcf.mr_total_resolutions > 1. Also added a NULL pointer
check. This issue causes a crash in the asan build in chromium
clusterfuzz.

BUG=805863

Change-Id: I9cd25af631395bc9fede3a12fb68af4021eb15f8
2018-02-01 20:17:54 -08:00
James Zern 73d1236384 inv_txfm_vsx.c: make code c90 compatible
move for loop declarations to function scope

Change-Id: I84d92a1a6ca6c5ac30aacb0f55d87ca3aef4c98f
2018-02-01 19:40:28 -08:00
James Zern 534e9af53b Merge "vp9_scale_test: parameterize filter type" 2018-02-01 20:44:48 +00:00
Paul Wilkins 79c14b83e9 Merge "Further change to code detecting slide transitions." 2018-02-01 10:21:38 +00:00
James Zern 14b21b84e3 vp9_scale_test: parameterize filter type
this allows the test to be sharded more efficiently and speeds up the
run when built with slower configs, e.g., asan.

Change-Id: If6d863b76871e3934704a1079bbf17f4886932c7
2018-01-31 23:38:47 -08:00
Marco cb16652598 vp9-svc: Add condition on allocation for scaled_temp.
scaled_temp frame is used as an intermediate buffer for
2 stage down-sampling: two stages of 1/2 down-sampling
for a target of 1/4x1/4. This is used in 3 layer SVC
to avoid duplicate frame downsampling (on middle layer).

As this allocation is only needed/used when the
number_spatial_layers > 2, add this condition to avoid
unneeded allocation for 1 and 2 spatial SVC.

Change-Id: If342466644f685c1ea3ca5344b581793e5136c09
2018-01-31 15:19:27 -08:00
Marco 2c950e131c vp9-svc: Fix to initialize downsampling filters.
For 3 spatial layers with 1/2 downsampling, the
downsampling filter for the middle layer was not
set for the very first frame, so it was defaulting
to the subsample filter (no averaging/phase = 0).

It's not set due to the two stage scaling that is
done for 1/4 on base layer, during which the intermediate
1/2 result is saved for the middle layer.

Fix for now is to set the default downsampling filter
to Bilinear (averaging/non-zero phase) for all layers on
init (vp9_init_layer_context).

Change-Id: Ic7407810b34c621e7e7420682508d45478bdffcf
2018-01-31 13:49:16 -08:00
paulwilkins 41d3331d42 Further change to code detecting slide transitions.
Eliminate false positives in previous patch.

The previous patch did a good job of detecting slide transitions but
in discussions a number of situations were identified that might trigger
harmful false positives. This risk seems to be borne out by some testing
on a wider YT set done by yclin@.

This patch adds an additional clause that requires that the best case
inter and intra error for the frame are very similar, meaning it is almost
as easy to code a key frame as an inter frame. This will certainly prevent
the false positive conditions that Jim and I discussed and even if one
does occur it should not be very damaging.

The down side is that this clause may mean that we still miss some
real slide transitions, especially if the images are small and similar. If this
proves to be the case then some further adjustment of the threshold may be
required. However, in the specific problem sample provided we do trap every
transition correctly.

Change-Id: I7e5e79e52dc09bc47917565bf00cc44e5cddd44c
2018-01-31 17:44:46 +00:00
Marco Paniconi efa786d464 vp9 svc: Make top layer non-ref: for 2 TL case
Only affects 2 temporal layer case.
Modified the flags for 2 temporal layers to make
top layer (top spatial, top temporal) a non-reference
frame, consistent with the 3 TL case.

Add mismatch check to the datarate test of changing
svc pattern on the fly, which is a test for 2 temporal
layers.

This re-applies the change: 254e2f5501,
that was reverted in: 658eb1d675.

Change-Id: Ib5fd4a7a0312c0c05329ae75baac480af34b4694
2018-01-31 09:17:43 -08:00
Marco Paniconi 658eb1d675 Merge "Revert "vp9 svc: fix to make top layer frame non-ref"" 2018-01-31 16:49:42 +00:00
Marco Paniconi 7edd1a6cea Revert "vp9 svc: fix to make top layer frame non-ref"
This reverts commit 254e2f5501.

Reason for revert: <INSERT REASONING HERE>

Original change's description:
> vp9 svc: fix to make top layer frame non-ref
> 
> Add mismatch check to the datarate test of changing svc pattern on the
> fly.
> 
> Change-Id: I6a878736de44e6a40c077ed6430aabd7fadabdd9

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com

Change-Id: Ibcb600438098f8dc380fe7e1de90cb81fc367468
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2018-01-31 16:49:16 +00:00
Johann Koenig 848d6004a4 Merge "Fix warning about bitwise 'not' on boolean" 2018-01-31 14:25:13 +00:00
Johann Koenig fdb64ec289 Merge "vp8 bool: verify buffer size" 2018-01-31 14:23:49 +00:00
Johann 8001c5c7a8 Fix warning about bitwise 'not' on boolean
cherry-picked from libaom:
commit cf26ee5ad2b9da79fa68c33b7d22ff53c66d6509
Author: Sebastien Alaiwan <sebastien.alaiwan@allegrodvt.com>
Date: Wed, 4 Oct 2017 10:09:13 +0200

BUG=webm:1491

Change-Id: I36c6e83ed716649f3d9ee10ce3aa9bb847cac2d9
2018-01-30 14:47:38 -08:00
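Illustration for 8001c5c7a8 (a sketch; the expression actually changed in libvpx is not shown in this log): applying bitwise NOT to a 0/1 flag does not negate it, which is what the compiler warning points at. The helper names below are hypothetical.

```c
/* Bitwise NOT flips every bit. On two's complement targets ~1 is -2,
 * which is still nonzero and therefore still "true" in a condition. */
int bitwise_not(int flag) { return ~flag; }

/* Logical NOT yields exactly 0 or 1, which is the intended negation
 * of a boolean value. */
int logical_not(int flag) { return !flag; }
```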
Johann c59c84fc74 vp8 bool: verify buffer size
In the process of fixing a ubsan warning:
  commit 738b829b8c
  Fix incorrect size reading
the inferred check of start < end was removed. This causes fuzzed files
to get a little further and segfault in vp8dx_start_decode.

Change-Id: I316e23058753ba42dbcc46d27eb575f51c8a9e9a
2018-01-30 12:20:06 -08:00
Marco Paniconi de3a7e2630 Merge "vp9 svc: fix to make top layer frame non-ref" 2018-01-30 16:56:42 +00:00
Johann Koenig 8f952bada7 Merge "Fix doc comment mismatch in vpx_frame_buffer.h" 2018-01-30 15:32:22 +00:00
Jerome Jiang 254e2f5501 vp9 svc: fix to make top layer frame non-ref
Add mismatch check to the datarate test of changing svc pattern on the
fly.

Change-Id: I6a878736de44e6a40c077ed6430aabd7fadabdd9
2018-01-29 18:38:46 -08:00
Linfeng Zhang 5eca3c23c3 Merge "Update vp9_iht8x8_64_add_neon()" 2018-01-30 01:20:41 +00:00
Jerome Jiang e14e9c9964 Merge "Datarate test for usage of SVC_SET_REF_FRAME_CONFIG" 2018-01-30 00:01:46 +00:00
Linfeng Zhang b14b616d96 Update vp9_iht8x8_64_add_neon()
Change-Id: Ie70ed8b9273df5e1fd06bc93cb469e80630941d2
2018-01-29 15:17:08 -08:00
Brion Vibber ddf40ec156 Fix doc comment mismatch in vpx_frame_buffer.h
When compiling an app using libvpx in Xcode 9.2, a warning is
thrown in vpx_frame_buffer.h:

  "Parameter 'new_size' not found in the function declaration"

Switching it to 'min_size' to match the comment text and the
callback type definition prototype resolves it.

Change-Id: I7a3e4a857c2007c2d0d390e22054d7bc85068aa1
2018-01-29 15:13:09 -08:00
Jerome Jiang 9a96b18f03 Datarate test for usage of SVC_SET_REF_FRAME_CONFIG
Change-Id: Iea7fc1b6cea84826eb45b1f01bd923323c2c9a6f
2018-01-29 14:41:41 -08:00
Linfeng Zhang 77108f5001 Merge changes Ica8dbe5f,I8f4e0fc6
* changes:
  Update vp9_iht4x4_16_add_neon()
  Clean dct_const_round_shift() related neon code
2018-01-29 20:16:47 +00:00
Johann 7e75e8a622 Merge remote-tracking branch 'origin/mandarinduck' into HEAD
The following changes were not carried back from the release branch:
  commit f87a4594fb
  Revert "Add frame width & height to frame pkt. Add test."

  commit c5dc3373db
  work around pic issue with gcc 6

BUG=webm:1490

Change-Id: Id3e15983d5565680c05a0c454544003a615a4d7f
2018-01-29 11:57:00 -08:00
Linfeng Zhang 903bc150da Update vp9_iht4x4_16_add_neon()
Change-Id: Ica8dbe5f8167e5d370d89d233c598b70bba123b7
2018-01-29 10:25:24 -08:00
Linfeng Zhang 884d1681f8 Clean dct_const_round_shift() related neon code
Change-Id: I8f4e0fc6ecb77b623519f2dd3cd2886f89218ddd
2018-01-29 10:23:24 -08:00
Linfeng Zhang 2654afc16c Merge "cosmetic: clean idct neon functions" 2018-01-29 17:34:11 +00:00
Matt Oliver c01b6dc845 project: Update for 1.7.0 merge. 2018-01-28 22:39:58 +11:00
Matt Oliver b55e658705 Merge commit 'f80be22a1099b2a431c2796f529bb261064ec6b4'
# Conflicts:
#	vp8/encoder/x86/quantize_mmx.asm
#	vp9/encoder/x86/vp9_highbd_error_avx.asm
#	vp9/encoder/x86/vp9_temporal_filter_apply_sse2.asm
#	vpx_dsp/x86/quantize_avx_x86_64.asm
2018-01-28 20:55:48 +11:00
Johann Koenig ea1d0a6b53 Merge "Fix incorrect size reading" 2018-01-27 01:42:58 +00:00
Johann 738b829b8c Fix incorrect size reading
Cherry pick from vp9:

commit 85770264ac
Guard against incorrect size values moving *data past data_end.

Check read length against the difference of the buffers.

Change-Id: I5e8679ddd447c4d73deb80be5ec94841a92c5fcd
2018-01-26 15:51:50 -08:00
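Illustration for 738b829b8c (a minimal sketch, assuming the check has roughly this shape; the name can_read is hypothetical and this is not the libvpx code): the requested read length is compared against the distance between the current position and the end of the buffer, rather than advancing the pointer first and comparing afterwards.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if `len` bytes can be read starting at `data` without
 * moving past `data_end`, 0 otherwise. Comparing against the pointer
 * difference avoids forming an out-of-range pointer such as data + len. */
int can_read(const uint8_t *data, const uint8_t *data_end, size_t len) {
  if (data > data_end) return 0;
  return len <= (size_t)(data_end - data);
}
```

A caller would only advance data by len after this check succeeds.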
Jerome Jiang 3fa713caee Merge "vp9 svc: Update temporal_layering_mode in config change." 2018-01-26 22:45:39 +00:00
Jerome Jiang cc91abb325 vp9 svc: Update temporal_layering_mode in config change.
temporal_layering_mode can be changed on the fly.

BUG=webm:1488

Change-Id: I223fd4085184e41878ddf0f9244d2e3d07636ae3
2018-01-26 13:38:04 -08:00
Marco a9bbff1049 vp9-svc: Adjust logic on intra mode search.
For SVC, on spatial enhancement layer, intra
search was disabled unless best reference frame
is golden (i.e., spatial/inter-layer prediction),
except for some other conditions (lower layer is key
or golden is not an allowed reference).

Fix is to add the base temporal layer condition,
so intra search will not be force-disabled for base
temporal layer frames.

This improves metrics (-1-2%) for SVC 3 and 2 layer config.
Some small increase in encode time is expected, but since the condition
only affects base temporal layers (i.e., every 4 frames
for 3 layers), the increase is small.

Change-Id: I10b824faef99560dfdeeb02ba8bf8e3e1eea6255
2018-01-25 19:11:42 -08:00
Marco 067457339b vp9-svc: Add QP dependency to thresh_svc_skip_golden.
In nonrd-pickmode: the golden/spatial reference for inter-layer
prediction may be skipped in the mode testing. Add QP dependency
to reduce the threshold for skipping (i.e., check it more often)
at high QP, if the lower layer was encoded at lower QP relative
to the current layer.

At high QP, a better quality lower resolution is more likely to
provide good spatial (inter-layer) prediction.

avgPSNR/SSIM metrics up by ~1% (all clips positive gain or neutral).
Some decrease in encode time (~1-2%) expected at lower bitrates,
for 3 layer SVC.

Change-Id: I9ee0f62d4b10d4ebd30165d378ecfa4399ae5ef1
2018-01-25 09:00:58 -08:00
Marco Paniconi d069f4c29d Merge "vp9: Fix to vp9_svc sample encoder for bypass mode." 2018-01-24 23:56:48 +00:00
Johann Koenig cee96c7d85 Merge "remove obsolete doxygen tags" 2018-01-24 23:44:12 +00:00
Marco 43caed4e42 vp9: Fix to vp9_svc sample encoder for bypass mode.
This fix makes it bitexact to the default SVC pattern,
for the example 2 temporal layer case.

Change-Id: I4df2063b70f7aecbfc7082f29c8439e05f6db8ac
2018-01-24 15:21:59 -08:00
Scott LaVarnway 15b261d854 Merge "BUG FIX: sse2 subpel variance is not PIC compliant" 2018-01-24 22:54:42 +00:00
Johann 3bfadfcd62 remove obsolete doxygen tags
warning: Tag `XML_SCHEMA' at line 941 of file `doxyfile' has become obsolete.

warning: Tag `XML_DTD' at line 947 of file `doxyfile' has become obsolete.

Change-Id: I85e39c4fb154569b8d7f68bdf362408983e9bd4f
2018-01-24 14:40:46 -08:00
Johann f80be22a10 Release 1.7.0 Mandarin Duck
Change-Id: I186440f3643a85694f45400393efb661f6d012fc
2018-01-24 14:25:44 -08:00
Linfeng Zhang 6248f0c91f cosmetic: clean idct neon functions
Change-Id: I9c7c52567850aded0437b13ba1260e94441bc49d
2018-01-24 13:55:15 -08:00
Marco Paniconi 3b85a5beb7 Merge "vp9-svc: Re-adjust some aq-mode=3 control parameters." 2018-01-24 20:29:10 +00:00
Scott LaVarnway cb9f4dc105 BUG FIX: sse2 subpel variance is not PIC compliant
BUG=webm:1464

Change-Id: Ibc15bac54aaf509365bed5892a26a29972ad3540
2018-01-24 05:58:54 -08:00
Scott LaVarnway b9e44842fc Merge "vp9_quantize_fp_avx2()" 2018-01-24 13:58:08 +00:00
Marco 7c69136494 vp9-svc: Re-adjust some aq-mode=3 control parameters.
Remove an adjustment to two cyclic refresh (aq-mode= 3)
parameters for SVC. The adjustment was to reduce the
delta-qp on second segment, and reduce the motion threshold.
This was done early on in the SVC encoder development;
in the latest codebase, removing this adjustment yields
some improvements in metrics.

The avgPSNR/SSIM metrics increase on average by ~1%
(most clips show positive gain), for 3 and 2 layer SVC.

Change-Id: I7a4d5114f16b2a1df383dbe6b3fe02940e29e6cc
2018-01-23 20:14:00 -08:00
Johann Koenig 742ae4b24d Merge "update .clang-format for v5.0.0" 2018-01-24 04:03:23 +00:00
James Zern 81d66e2cc6 vpx_codec_enc_init_multi: fix segfault w/vp9
vp9 does not support multi-res encoding, the request should not crash.

+ encode_api_test: unconditionally expose multi-res test

vpx_codec_enc_init_multi should fail independent of
CONFIG_MULTI_RES_ENCODING if not for the same reason.

Change-Id: I44fc58ef70ee4e0e482cb6a5736885f4cb2a8517
(cherry picked from commit 004fb91416)
2018-01-23 18:14:23 -08:00
Jerome Jiang 9f36419bf2 Fix crash invalid params for vp8 multres. Add test.
Fix is from the patch in the issue.
Release memories allocated before early exit.

BUG=webm:1482

Change-Id: I64952af99c58241496e03fa55da09fd129a07c77
(cherry picked from commit 5b6ae020b6)
2018-01-23 18:14:14 -08:00
James Zern d1e9635402 Merge "vpx_codec_enc_init_multi: fix segfault w/vp9" 2018-01-24 02:12:00 +00:00
Jerome Jiang dcbe6750e1 Merge "Fix frame sizes in pkt to support spatial layers." 2018-01-24 01:12:44 +00:00
Shiyou Yin 6ee88546c0 Merge "vp8: [loongson] fix bug of type conflict." 2018-01-24 01:06:42 +00:00
James Zern 004fb91416 vpx_codec_enc_init_multi: fix segfault w/vp9
vp9 does not support multi-res encoding, the request should not crash.

+ encode_api_test: unconditionally expose multi-res test

vpx_codec_enc_init_multi should fail independent of
CONFIG_MULTI_RES_ENCODING if not for the same reason.

Change-Id: I44fc58ef70ee4e0e482cb6a5736885f4cb2a8517
2018-01-23 15:52:03 -08:00
Johann 7e14e0f109 update .clang-format for v5.0.0
Change-Id: Id43e8ce9cf3790b728683acc9686e246ccaa90cf
2018-01-23 13:41:22 -08:00
Linfeng Zhang 231012fdab Add vp9_highbd_iht16x16_256_add_sse4_1()
BUG=webm:1413

Change-Id: I8d7eeae1bd219eb848c1a86071046a477f7a91af
2018-01-23 11:24:42 -08:00
Linfeng Zhang 8fd648c78a Merge "Add "vpx_" prefix to 2 idct x86 functions" 2018-01-23 18:28:59 +00:00
Jerome Jiang b8159fab38 Merge "Fix crash invalid params for vp8 multres. Add test." 2018-01-23 17:25:28 +00:00
Linfeng Zhang 8f50e06012 Add "vpx_" prefix to 2 idct x86 functions
Change-Id: I4f3052d8748e16b06e9155f8daf22f867dfaa7a3
2018-01-23 09:17:38 -08:00
Linfeng Zhang 6fea41abee Merge "Add vp9_highbd_iht8x8_64_add_sse4_1()" 2018-01-23 17:04:20 +00:00
Shiyou Yin d344ab03cc vp8: [loongson] fix bug of type conflict.
In commit 577d4fa79, int8_t was used to replace char. This results in a
compilation error because int8_t is a typedef of signed char, not char.

Change-Id: I5c9837e01b0b58688a7741f5c9a99a76ca887e4a
2018-01-23 14:28:55 +08:00
Jerome Jiang 2c2fea2c5b Fix frame sizes in pkt to support spatial layers.
Add test for svc frame sizes in pkt.

BUG=webm:1485

Change-Id: I983dc229e526d72d22360d7f3016d8358d6beae7
2018-01-22 21:05:39 -08:00
Jerome Jiang 5b6ae020b6 Fix crash invalid params for vp8 multres. Add test.
Fix is from the patch in the issue.
Release memories allocated before early exit.

BUG=webm:1482

Change-Id: I64952af99c58241496e03fa55da09fd129a07c77
2018-01-22 20:30:45 -08:00
Johann Koenig 3761254119 Merge changes from topic "clang-format"
* changes:
  clang-format v5.0.0 vp9/
  remove spurious comments
  clang-format v5.0.0 vp8/
  clang-format v5.0.0 vpx_dsp/
  clang-format v5.0.0 mem_ops.h
  clang-format v5.0.0 vpx_util/vpx_atomic.h
  clang-format v5.0.0 y4minput.c
  clang-format v5.0.0 vpxenc.c
  clang-format v5.0.0 examples/
  clang-format v5.0.0 test/
2018-01-22 19:38:42 +00:00
Linfeng Zhang 9874ec07bd Add vp9_highbd_iht8x8_64_add_sse4_1()
BUG=webm:1413

Change-Id: Id9038226902b2d793fc6c17ac81bb104c1a18988
2018-01-18 15:49:44 -08:00
Scott LaVarnway c7449b482c vp9_quantize_fp_avx2()
Started from vp9_quantize_fp_sse2 and tweaked to use avx2.

Change-Id: Ic2da50cc9d73896c7ef2f3cd3db5b1c5d7795b8b
2018-01-18 13:33:30 -08:00
Johann 281f68a81f clang-format v5.0.0 vp9/
Remove trailing commas to keep multiple elements on one line.

Add blank lines to prevent comments from being treated as blocks.

clang-format guards for struct with a comment in the middle.

Change-Id: I3bcb8313ae8aaf69179249a13b4087b1272cdbc0
2018-01-18 12:37:58 -08:00
Johann 68cc1dc422 remove spurious comments
These don't appear to make any sense given their context. The
commit log also does not reveal anything.

Discovered due to spurious clang-format indenting:
https://bugs.llvm.org/show_bug.cgi?id=35930

Change-Id: I732a66056ba4c05e3e132a2f236fe10f7a282900
2018-01-18 12:37:58 -08:00
Johann f95bf1db50 clang-format v5.0.0 vp8/
Allow*OnASingleLine appears to no longer apply to
typedef structs.

Adjust closing parenthesis/opening brace on functions.

Remove trailing commas to keep multiple elements on one line.

Change-Id: I6e535a8ddb15c9b3de8216ce8ddb2a18241af46c
2018-01-18 12:37:58 -08:00
Johann 97acbbb701 clang-format v5.0.0 vpx_dsp/
Remove comments above #define statements because they get
indented unnecessarily.
https://bugs.llvm.org/show_bug.cgi?id=35930

Add blank lines to prevent comments from being treated as
blocks.

Change-Id: I04dce21b2a10e13b8dc07411a0019c098f6dd705
2018-01-18 12:37:50 -08:00
Marco 9debbc2ec7 vp8: Fix to multi-res-encoder for skipping streams.
For the vp8 simulcast/multi-res-encoder:

Add flags to keep track of the disabling/skipping of
streams for the multi-res-encoder. And if the lower spatial
stream is skipped for a given stream, disable the motion
vector reuse for that stream.

Also remove the condition of forcing same frame type
across all streams.

This fix allows for the skipping/disabling of the base
or middle layer streams.

Change-Id: Idfa94b32b6d2256932f6602cde19579b8e50a8bd
2018-01-18 11:56:42 -08:00
Johann Koenig 740883897a Merge "Revert "Add frame width & height to frame pkt. Add test."" into mandarinduck 2018-01-17 20:17:33 +00:00
Johann f87a4594fb Revert "Add frame width & height to frame pkt. Add test."
This reverts commit bd1d995cd3.

Remove the feature from the release as it requires additional work.

BUG=webm:1485

Change-Id: I1a01ac2525703af97a456a3eed85718306c0f734
2018-01-17 11:28:24 -08:00
Vignesh Venkatasubramanian eedda5f924 vp8dx.h: Add macro for skipping loop filter
Without this applications cannot use the vpx_codec_control macro
for VP9_SET_SKIP_LOOP_FILTER. The tests only cover the underscored
version vpx_codec_control_().

Change-Id: I3e6c1888307b76636fdc1a8deae70b5c14238163
(cherry picked from commit 373e08f921)
2018-01-16 18:27:42 -08:00
Vignesh Venkatasubramanian 373e08f921 vp8dx.h: Add macro for skipping loop filter
Without this applications cannot use the vpx_codec_control macro
for VP9_SET_SKIP_LOOP_FILTER. The tests only cover the underscored
version vpx_codec_control_().

Change-Id: I3e6c1888307b76636fdc1a8deae70b5c14238163
2018-01-16 15:42:43 -08:00
paulwilkins 7d19739949 Fix bug in use of zoom metric as part of arf breakout.
The in/out (or zoom metrics) in accumulate_frame_motion_stats()
are in effect a % of the blocks that have a motion vector pointing
either towards or away from the center. As such they are already
normalized in terms of image size and the thresholds against which
these are tested should be image size independent.

In practice a zoom either in or out is an indicator for a shorter group
length so the abs value is more important as a breakout clause.

This patch fixes the threshold test. Clips without noticeable zoom show
no effect but some with strong zooms such as "station" show a big
gain (5-10%). Average psnr-hvs gain on hdres set was 0.292%.

Change-Id: I4f97a72b0e273e4e844ade15285749c32cd81c1c
(cherry picked from commit 0226ce79e9)
2018-01-16 11:21:29 -08:00
Paul Wilkins f915e6d4af Merge "Add zoom break out for kf boost loop." 2018-01-12 18:45:28 +00:00
Paul Wilkins 733820c509 Merge "Fix kf detection in some slide shows." 2018-01-12 18:45:13 +00:00
Johann eb20d6f64c clang-format v5.0.0 mem_ops.h
Remove trailing empty line to keep the comment from being indented.
https://bugs.llvm.org/show_bug.cgi?id=35930

Change-Id: I6c51f7afb4cc47f03a190b4c90e29e4ff1e0c689
2018-01-12 09:15:22 -08:00
Johann b4fb99220b clang-format v5.0.0 vpx_util/vpx_atomic.h
Allow*OnASingleLine appears to no longer apply to
typedef structs.

Change-Id: If10db1c30c74ee31dad1a0b1926964e850f15fd2
2018-01-12 09:15:19 -08:00
Johann fd7de8362d clang-format v5.0.0 y4minput.c
Remove trailing empty line to keep the comment from being indented.
https://bugs.llvm.org/show_bug.cgi?id=35930

Change-Id: If0f0862623b3fa3ae49e850edbbed52c2b4c6672
2018-01-12 09:15:15 -08:00
Johann b87250c56e clang-format v5.0.0 vpxenc.c
Treat the formatted string as one distinct parameter to fprintf

Change-Id: I62cfd5657c4cefc6b3fa45247ba9f33515a292b1
2018-01-12 09:15:12 -08:00
Johann e1c69544b1 clang-format v5.0.0 examples/
Attempts to group () expressions on their own line.

Change-Id: I404f9dd1a91aaa2100925c90162bcdefbead5ad2
2018-01-12 09:15:10 -08:00
Johann 8f25c3ff8f clang-format v5.0.0 test/
Remove trailing commas to keep multiple elements on one line.

Remove trailing empty lines to keep comments from being indented.
https://bugs.llvm.org/show_bug.cgi?id=35930

Change-Id: I0a66dde95f2a304f13cb85a2e9197afca20051e8
2018-01-12 09:14:56 -08:00
Scott LaVarnway 8c0cd2bd76 Merge "Add quantize_fp_32x32_nz_c()" 2018-01-11 23:05:33 +00:00
Marco Paniconi 2879e0d2cc Merge "vp9: Skip encoding of enhancement layers on the fly." 2018-01-11 22:08:49 +00:00
Johann f5b2dd2a66 adopt some clang 5.0.0 formatting
At least the changes that don't conflict with 4.0.1

Change-Id: I9b6a7c14dadc0738cd0f628a10ece90fc7ee89fd
2018-01-11 12:35:24 -08:00
Marco f8639b1554 vp9: Skip encoding of enhancement layers on the fly.
For SVC: if an enhancement layer (spatial_layer > 0)
has 0 bandwidth, skip/drop the encoding of the layer.
This allows the application to dynamically disable
higher layers for SVC.

Add flag to signal the skip encoding, this is needed
to modify the packing of the superframe when the top
layer is skipped/dropped.

Also moved some updates (current_video_frame counter and
the last_avg_frame_bandwidth) to the postencode_update_drop_frame().

Added datarate unittest for dynamically going from 3 to 2
and then back to 3 spatial layers.

Change-Id: Idaccdb4aca25ba1d822ed1b4219f94e2e8640d43
2018-01-11 10:38:30 -08:00
Vlad Tsyrklevich 1633786bfb [CFI] Remove function pointer casts
Control Flow Integrity [1] indirect call checking verifies that function
pointers only call valid functions with a matching type signature. This
change eliminates some function pointer casts that I missed in my last
CL https://crrev.com/c/780144.

BUG=chromium:776905

[1] https://www.chromium.org/developers/testing/control-flow-integrity

Change-Id: I1c7adbdfffa4fe0b62e993bfb31d06e64b022d66
2018-01-10 16:38:26 -08:00
Paul Wilkins bbdbee429f Merge "Fix bug in use of zoom metric as part of arf breakout." 2018-01-10 17:33:00 +00:00
paulwilkins 32f86ce276 Add zoom break out for kf boost loop.
Adds a breakout threshold to key frame boost loop.

This reduces the boost somewhat in cases where there is a
significant zoom component. In tests most clips show no effect
but there is a sizable gain for some clips like station.

Change-Id: I8b7a4d57f7ce5f4e3faab3f5688f7e4d61679b9a
2018-01-10 17:26:32 +00:00
paulwilkins bb4052b873 Fix kf detection in some slide shows.
This fix improves detection of key frames in slide shows.

In particular it helps if the slides are pictures of varying formats
as in a sample provided by yclin@.

This change does not impact any of the clips in our standard tests
but for the example slide show test clip it improved global PSNR by
several dB and resolved a serious visual quality issue.

Change-Id: Iaeeeed55dc0bb50aeacd4996ed660ced06374603
2018-01-10 15:07:38 +00:00
Johann c5dc3373db work around pic issue with gcc 6
Enable pic when building sse2 or higher optimizations.

BUG=webm:1464

Change-Id: I36c6e83ed716649f3d9ee10ce3aa9bb847cac2d9
2018-01-09 12:46:45 -08:00
Linfeng Zhang e20ca4fead Add vp9_highbd_iht4x4_16_add_sse4_1()
BUG=webm:1413

Change-Id: I14930d0af24370a44ab359de5bba5512eef4e29f
2018-01-08 10:14:20 -08:00
Linfeng Zhang 7a41610581 Update dct_test.cc
Make 8-bit functions testing available in high bitdepth.

Change-Id: Ic030c75aa4c6b649c52426abb4bb2122882de0fe
2018-01-08 10:07:38 -08:00
Linfeng Zhang b25b2ca455 Merge "Update iadst4_sse2()" 2018-01-08 17:15:45 +00:00
Marco Paniconi bed28a55f5 Merge "vp9-svc: Use eightap_smooth for downsampling at low resol." 2018-01-05 19:01:31 +00:00
Marco 321f295632 vp9-svc: Use eightap_smooth for downsampling at low resol.
Switch from bilinear to eighttap_smooth for frame-level
downsampling at low resolutions (<= 320x240).

avgPSNR/SSIM metrics increase by ~0.5-2% (all clips positive gain),
for 2 and 3 spatial layer SVC, with 3 temporal layers.
Small/negligible increase in encoding time (< 1%).

Change-Id: I758472fc4fddd51d87f13c9d1a1cd4986ef5d41f
2018-01-05 10:17:08 -08:00
paulwilkins 0226ce79e9 Fix bug in use of zoom metric as part of arf breakout.
The in/out (or zoom metrics) in accumulate_frame_motion_stats()
are in effect a % of the blocks that have a motion vector pointing
either towards or away from the center. As such they are already
normalized in terms of image size and the thresholds against which
these are tested should be image size independent.

In practice a zoom either in or out is an indicator for a shorter group
length so the abs value is more important as a breakout clause.

This patch fixes the threshold test. Clips without noticeable zoom show
no effect but some with strong zooms such as "station" show a big
gain (5-10%). Average psnr-hvs gain on hdres set was 0.292%.

Change-Id: I4f97a72b0e273e4e844ade15285749c32cd81c1c
2018-01-05 13:19:35 +00:00
Marco 55db4f033f vp9: Increase convergence speed of noise estimation.
Increase the recursive average factor from 15/16 to 3/4
to make the noise estimation respond faster.

Small/negligible change on low noise content, but better
denoising for noisy content.
Also encoder speedup of ~2-3% observed on some noisy clips.

Change-Id: I9dd02fe961ca24b411fe4c2732f814bf1e9a7f9f
2018-01-04 14:29:27 -08:00
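Illustration for 55db4f033f (a sketch of a generic recursive average, not the libvpx noise estimator itself): the weight on the previous estimate drops from 15/16 to 3/4, so each new sample contributes 1/4 instead of 1/16 and the estimate converges faster. The function name and the integer arithmetic are assumptions made for the example.

```c
/* One update step of a recursive (exponential) average.
 * `num/den` is the weight kept on the previous estimate; the new
 * sample receives the remaining (den - num)/den. */
int recursive_avg(int prev, int sample, int num, int den) {
  return (num * prev + (den - num) * sample) / den;
}
```

Starting from an estimate of 0 with a constant sample of 160, one step gives 10 with the 15/16 factor and 40 with the 3/4 factor.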
Linfeng Zhang 867b593caa Update iadst4_sse2()
Change-Id: I21ff81df0d6898170a3b80b3b5220f9f3ac7f4e8
2017-12-28 16:47:57 -08:00
Scott LaVarnway fe5d87aaeb Add quantize_fp_32x32_nz_c()
This c version uses the shortcuts found in the
vp9_quantize_fp_32x32_ssse3 function.

Change-Id: I2e983adb00064e070b7f2b1ac088cc58cf778137
2017-12-26 06:11:21 -08:00
Scott LaVarnway 8a4336ed2e Add vp9_quantize_fp_nz_c() -- 2
This c version uses the shortcuts found in the x86
vp9_quantize_fp functions.

The test was updated to use the correct quant/round range.

Change-Id: Ie5871f710d9eb39047d8d9f48b907c0633e1f830
2017-12-21 15:26:36 -08:00
James Zern 1a7bf0d1f9 Merge "vp9_quantize_ssse3_x86_64: fix out of bounds write" 2017-12-21 23:02:32 +00:00
Ralph Giles 117893a717 Don't force inlining for msvc targets.
INLINE is defined as __forceinline for vs* configs, but is the
normal, compiler-discretion inline for gcc/clang configs. This
makes many functions very large when building for windows targets,
much larger than they are elsewhere.

Use '__inline' as a consistent definition to get consistent function
sizes. Visual Studio documentation says that 'inline' is only
available in C++ code, but this is probably incorrect, since Visual
Studio 2017 accepts C99 'inline' even when passed /TC. Nevertheless,
this commit uses the recommended '__inline' for consistency.

Thanks to David Major for the diagnosis.

Change-Id: Ib0b31a3afcea77822c84fe3c6cd452add66d825a
2017-12-21 13:55:18 -08:00
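Illustration for 117893a717 (a sketch, assuming a header-level definition; in libvpx the INLINE macro comes from the configure step, which is not reproduced here): the macro selects an inline keyword that leaves the inlining decision to the compiler on every toolchain. The function clamp_u8 is hypothetical.

```c
/* Pick a compiler-discretion inline keyword. `__inline` is the MSVC
 * spelling accepted in C; `__forceinline` is deliberately not used. */
#if defined(_MSC_VER)
#define INLINE __inline
#else
#define INLINE inline
#endif

/* Hypothetical small helper of the kind that is typically marked INLINE. */
static INLINE int clamp_u8(int v) {
  return v < 0 ? 0 : (v > 255 ? 255 : v);
}
```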
James Zern 84a7263d4c vp9_quantize_ssse3_x86_64: fix out of bounds write
eob is a pointer to a uint16_t. Previously the code would store 64 bits,
causing a crash or test failure with the right stack layout.

Change-Id: Ibd653baf323db114f2444951b9d8b00c596bf15a
2017-12-21 16:53:14 -05:00
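Illustration for 84a7263d4c (a C sketch of the idea; the actual fix is in x86-64 assembly and is not shown in this log): when the destination is a uint16_t, only 16 bits may be written, even if the value was computed in a 64-bit register. The function name store_eob is hypothetical.

```c
#include <stdint.h>

/* Writes only the low 16 bits of a wider result through the pointer.
 * A 64-bit store to the same address would overwrite the 6 bytes that
 * follow *eob_ptr, which is the out of bounds write described above. */
void store_eob(uint16_t *eob_ptr, uint64_t wide_result) {
  *eob_ptr = (uint16_t)wide_result;
}
```

With neighbouring values placed right after the destination, a correct store leaves them unchanged.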
James Zern 7a245adb18 Revert "Add vp9_quantize_fp_nz_c()"
This reverts commit 86842855d3.

SSSE3/VP9QuantizeTest.EOBCheck/1 fails on Mac and the build breaks under
visual studio due to a #if within another macro.

Change-Id: I475095a04aafcc714fade2b24e4df7b682be2cd1
2017-12-21 06:05:19 -08:00
Scott LaVarnway de50e8052c Merge "Add vp9_quantize_fp_nz_c()" 2017-12-20 23:15:11 +00:00
James Zern 1a9c7bee88 Merge "lpf_test: correct threshold ranges" 2017-12-20 20:22:34 +00:00
Marco 9ca9c12dbd vp9-svc: Add layer bitrate targeting to SVC datarate tests.
Modify and update the SVC datarate unittests to verify the
rate targeting for each spatial-temporal layer.
The current tests were only verifying the rate targeting
of the full SVC stream, not individual layers.
Also re-enabled a test that was disabled.

This is a stronger verification of the layered rate control
for SVC for 1 pass CBR encoding.

Added PostEncodeFrameHook, needed to get the layer_id and
update the layer buffer level.

Change-Id: I9fd54ad474686b20a6de3250d587e2cec194a56f
2017-12-19 19:48:47 -08:00
Scott LaVarnway 86842855d3 Add vp9_quantize_fp_nz_c()
This c version uses the shortcuts found in the x86
vp9_quantize_fp functions.

The test was updated to use the correct quant/round range.

Change-Id: I5d19f8af2fddda8e50910249eafb740acb29415b
2017-12-19 12:48:45 -08:00
Marco a2127236ae vp9: Reset buffer level on large bitrate changes.
For a large change in the target avg_frame_bandwidth
(via the update in change_config()), reset the buffer_level
to optimal_level.

This fix prevents possible frame drops, where for example,
encoder suddenly goes from lower to higher bitrate.

Change-Id: I2f844c41d04c01240e85f574e59d2b9075c7eb6d
2017-12-19 09:57:21 -08:00
James Zern 5203b40a2a lpf_test: correct threshold ranges
The random number generator creates values from [0, range). Add 1 to all
and make hev more realistic by mirroring its calculation of level >> 4,
i.e., [0, 3].

Change-Id: Ic19be5d7ba668deb17c96f143b739116a4b5d21c
2017-12-18 23:17:45 -08:00
Shiyou Yin 08a668af32 vp8: [loongson] optimize loopfilter v2.
Optimize function vp8_mbloop_filter_vertical_edge_mmi and
function vp8_mbloop_filter_horizontal_edge_mmi.
Make full use of memory loading delay slot and reduce unnecessary
instructions.

Change-Id: I61da2c3a44c06044225461f46bf487d83cba6c16
2017-12-15 17:06:47 +08:00
Shiyou Yin 09519a55c7 Merge "vp8: [loongson] optimize sixtab predict v2." 2017-12-15 00:53:21 +00:00
Johann Koenig 7970cc02df Merge "add copyright to rtcd files" 2017-12-14 23:44:30 +00:00
Johann Koenig d95ddc7c71 Merge "mark generated version header" 2017-12-14 23:44:04 +00:00
Johann e4b3f03c64 add copyright to rtcd files
Allows them to pass the license check in chromium.

BUG=chromium:98319

Change-Id: Iefc1706152a549d8c4ae774c917596bf1c9492d8
2017-12-14 22:50:08 +00:00
Johann Koenig 7d1bf5d12a Merge "remove unused tools" 2017-12-14 21:19:59 +00:00
Johann Koenig 9f8433ffe2 Merge "fix typo in boilerplate" 2017-12-14 21:19:47 +00:00
Johann 920ba82409 remove unused tools
all_builds.py has been more or less replaced by Jenkins.

author_first_release.sh is unused.

ftfy.sh has been obviated by having the whole tree clang-format clean.

Change-Id: I741315ad9042e6e901f07410e93f28371db703b2
2017-12-14 20:34:14 +00:00
Johann fe4de1ff63 mark generated version header
Allows it to pass the license check in chromium.

BUG=chromium:98319

Change-Id: I5ba9c8c81ab9eb4168df09db9d2eab846e99e981
2017-12-14 11:58:10 -08:00
Johann 6746ba6d01 fix typo in boilerplate
The extra 'e' was causing the chromium license check to flag this file.

BUG=chromium:98319

Change-Id: Ic875ba66370298bf998438d14ff5f7e760293706
2017-12-14 11:54:16 -08:00
Johann 05e6e9ac83 mark generated rtcd headers
Allows them to pass the license check in chromium.

BUG=chromium:98319

Change-Id: Ib37bf45bdac8cf1edc62037dea17b734a5e37fa7
2017-12-14 11:48:46 -08:00
Shiyou Yin f2ad523461 vp8: [loongson] optimize sixtab predict v2.
1. Delete unnecessary zero setting process.
2. Optimize the method of calculating SSE in vpx_varianceWxH.

Change-Id: I8bab801416e7f4958c28c6d080e3cf785a50f82b
2017-12-14 16:29:58 +08:00
Marco c58f01724c vp9: Update to SVC datarate tests.
With recent fixes to rate control for SVC the
buffer underrun in the tests does not happen,
so comment and TODO can be removed.

Also, in some of these SVC tests, replace the HD clip
with the corresponding VGA clip, which has > 400 frames.
For the (niklas) HD clip: it has only 60 frames but the
test was running up to 300 frames. Fixed it to 60 frames.

Keep some tests with the HD clip, needed for the 4 thread
and 5 level scaling test.

Change-Id: I0a2356a908e8b2271c7a422eb8b15c0d56eec968
2017-12-13 14:07:52 -08:00
Marco Paniconi 028429310a Merge "vp9: Reset rc flags on some configuration changes." 2017-12-13 21:03:40 +00:00
Marco e9ad5d2aee vp9: Cleanup/remove TODO comment.
Change-Id: I2bd43e996909ad688b7e00b81ee19a5fc4df460b
2017-12-13 11:30:09 -08:00
Marco a40fa1f95d vp9: Reset rc flags on some configuration changes.
For large dynamic changes in target avg_frame_bandwidth, or
a change in resolution, via the update in change_config(),
reset the under/overshoot flags (rc_1_frame, rc_2_frame)
to prevent constraining the QP for the first few frames
following the change.

For SVC use the spatial stream avg_frame_bandwidth in
reset condition.

For the avg_frame_bandwidth condition, use fairly large
threshold (~50%) for now in reset.

This allows for better/faster QP response if, for example,
application dynamically changes bitrate by large amount.

Change-Id: Ib6e3761732d956949d79c9247e50dba744a535c0
2017-12-13 10:41:38 -08:00
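The reset condition described in the commit above can be sketched roughly as follows. This is an illustrative reading of the "~50%" threshold only; the function name `should_reset_rc_flags` and the exact bounds (more than 1.5x or less than 0.5x the previous bandwidth) are assumptions, not code taken from libvpx.

```c
#include <stdint.h>

/* Hypothetical helper (not a libvpx function): returns 1 when the new
 * average frame bandwidth differs from the previous one by more than
 * roughly 50%, i.e. it is above 1.5x or below 0.5x the old value. */
static int should_reset_rc_flags(int prev_bw, int new_bw) {
  const int64_t upper = ((int64_t)3 * prev_bw) / 2; /* 1.5x previous */
  const int64_t lower = (int64_t)prev_bw / 2;       /* 0.5x previous */
  return new_bw > upper || new_bw < lower;
}
```

When such a check fires, the caller would clear the under/overshoot flags so that the QP is not constrained for the first frames after the change.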
Paul Wilkins 94eaecaa91 Merge "Bug fix for second reference stats." 2017-12-12 11:56:10 +00:00
Jerome Jiang f9ecdc35ec Merge "vp9 svc: Allow denoising next to highest resolution." 2017-12-12 05:27:11 +00:00
Jerome Jiang c1e511fd82 vp9 svc: Allow denoising next to highest resolution.
Denoise 2 spatial layers at most.

Add noise sensitivity level 2 for vp9 such that applications can control
whether to denoise the second highest spatial layer.

Add tests to cover this case.

Change-Id: Ic327d14b29adeba3f0dae547629f43b98d22997f
2017-12-11 15:20:19 -08:00
Jerome Jiang a1689ed16b Merge "Fix build warnings for gcc 6.3" 2017-12-11 18:27:17 +00:00
paulwilkins f1ce050f44 Bug fix for second reference stats.
Immediately following a key frame the trailing second reference
error in the first pass stats will be based on a reference frame from
the prior key frame group and will thus usually be much larger.

This fix eliminates that effect (which typically triggers a short arf
group immediately after a key frame). It also changes the accounting
for the first frame in each new arf group.

This change gives large gains on a couple of clips that contain mid
sequence key frames (e.g. 6% on 1080P tennis). Overall there was
a net gain in PSNR and PSNR-HVS ~(0.05- 0.4%) and mixed results for
SSIM (+/- 0.2%).

Change-Id: I8e00538ac2c0b5c2e7e637903cac329ce5c2a375
2017-12-08 10:05:36 +00:00
Jerome Jiang 2a602f745d Fix build warnings for gcc 6.3
Clean up some alias.

BUG=webm:1465

Change-Id: I99e186162db9f9e15375fef01564692434eda619
2017-12-07 13:42:10 -08:00
Jerome Jiang 14dbdd95e6 Merge "Add frame width & height to frame pkt. Add test." 2017-12-06 22:37:15 +00:00
Jerome Jiang bd1d995cd3 Add frame width & height to frame pkt. Add test.
Used to return correct frame width and height when dynamic resizing happens.

BUG=webm:1474

Change-Id: Ia2043f7e1635b3821848a67b9b134f47f14b0f3a
2017-12-06 13:55:18 -08:00
Marco 3562d6b0a2 vp9-svc: Set downsampling filter for VGA layer.
Downsampling filter for SVC was set to subsample (phase 0)
for HD -> VGA, and bilinear averaging (phase 8) for VGA -> QVGA.
This change makes it bilinear averaging for HD -> VGA.

Given the recent commit 9f9d4f8, quality is improved with
this change: avgPSNR/SSIM up ~1-3% on HD clips in RTC set.
Speed decrease of ~1% for 3 layer SVC.

Change-Id: If834a320e372b8b922a6bf7cab4227703b1beae6
2017-12-06 12:01:24 -08:00
Marco Paniconi 575c1933ea Merge "vp9: Nonrd-pickmode: move some early exits up." 2017-12-06 19:18:51 +00:00
Hui Su 2e44f16443 Merge "Add max luma picture width/height constraint in VP9 level" 2017-12-06 18:46:19 +00:00
Marco 33953f310e vp9: Nonrd-pickmode: move some early exits up.
Move the early exit checks on usable_ref_frame and
skip_ref_find_pref up before the check on flag_svc_subpel.
The code under flag_svc_subpel requires frame_mv to be set
for the golden/spatial reference, which is only set if
both of those exits don't pass.

No change in behavior.

Change-Id: Id304276c745eeb389ff85fa2dcf510d5976bc413
2017-12-06 10:18:44 -08:00
Marco 9f9d4f8dc9 vp9-svc: Allow for nonzero motion on spatial reference.
For nonrd pickmode on a given spatial layer, the spatial
(golden) reference was always only using zeromv for prediction.
In this patch if the downsampling filter used for generating
the lower spatial layer is an averaging filter (nonzero phase),
we allow for subpel motion on the spatial (golden) reference to
compensate for the shift. This is done by forcing the testing of
nonzero motion mode to compensate for spatial downsampling shift.

Improvement for cases where the downsampling is averaging filter.
In the current code this is only done for generating
resolutions <= QVGA.

Improvement for avgPSNR/SSIM on RTC set for speed 7: ~1.2%.
Gain is larger (~2-3%) for VGA clips with 2 spatial layers.
~1% speed slowdown for 3 layer SVC on mac.

Change-Id: I9ec4fa20a38947934fc650594596c25280c3b289
2017-12-05 22:41:07 -08:00
Shiyou Yin 90ce21e519 Merge "vpx_dsp: [loongson] optimize variance v2." 2017-12-04 01:30:06 +00:00
Hui Su 07b12aad77 Add max luma picture width/height constraint in VP9 level
BUG=b/65412009

Change-Id: I9e1478dcbd2ef9e97f5f8fb5a1c733b5f5cdf396
2017-12-01 16:29:40 -08:00
Johann e83d00f584 filter out asm includes
Don't add include files to the archive. Avoids build failures for
Windows such as:
the input file 'libvpx_g.a(x86_abi_support.asm.o)' has no sections

Change-Id: If9c8e70c0ec913b7ad7dd6a08d4fa19011114ad2
2017-12-01 15:03:51 -08:00
Johann bdbecea1ba explicitly label .text sections
nasm should infer .text but does not for windows:
https://bugzilla.nasm.us/show_bug.cgi?id=3392451

Change-Id: Ib195465e5f33405f5ff61c4cf88aa2a72640cacb
2017-12-01 14:33:04 -08:00
Johann 65df957df6 nasm defaults to -Ox
No need to specify default behaviour. The original change introducing nasm:
https://chromium.googlesource.com/webm/libvpx/+/7be093ea4d50c8d38438f88cb9fa817c1c9de8dd
mentions requiring 2.0.9, which was the first release to default to this behaviour:
http://www.nasm.us/doc/nasmdoc2.html
"The -Ox mode is recommended for most uses, and is the default since NASM 2.09."

Change-Id: Ia914c4deede5aa447277b5189bb4fcf7e54c338d
2017-12-01 14:33:04 -08:00
Johann Koenig 401e00792f Merge "pass 'win64' instead of 'x64' to the assembler" 2017-12-01 22:07:03 +00:00
Johann 460dbc01b5 pass 'win64' instead of 'x64' to the assembler
nasm does not accept x64

yasm has accepted (and appears to prefer) win64 at least as far back as
1.0.0:
http://yasm.tortall.net/releases/Release1.0.0.html

Change-Id: Ied881b1df0570da256b1bd7e131e7817e47f768f
2017-12-01 10:58:54 -08:00
Marco 8d0e7ac29a vp9-svc: Set num_inter_modes in non-rd pickmode.
Set num_inter_modes based on ref_mode_set_svc, which is
smaller set than ref_mode_set (which may use alt-ref).

No change in behavior.

Change-Id: I31169bb09028db230552c6fca0a86959d1ade692
2017-12-01 10:30:45 -08:00
Shiyou Yin 298f5ca47d vpx_dsp: [loongson] optimize variance v2.
1. Delete unnecessary zero setting process.
2. Optimize the method of calculating SSE in vpx_varianceWxH.

Change-Id: I58890c6a2ed1543379acb48e03e620c144f6515f
2017-12-01 13:44:48 +08:00
Kaustubh Raste 8099220e6c Merge "mips msa optimize vpx_scaled_2d function" 2017-12-01 01:24:25 +00:00
Marco Paniconi c22ab8ab9f Merge "Nonrd-pickmode: avoid duplicate computation of UV predictor." 2017-12-01 00:23:52 +00:00
James Zern 9dbefc4b57 Merge "decouple spatial-svc from encoder abi" 2017-12-01 00:22:34 +00:00
Marco 2e701f7c29 Nonrd-pickmode: avoid duplicate computation of UV predictor.
Avoids duplicate computation of UV predictor.

Bit-exact when static_threshold is zero.
Small/neutral difference on RTC set with nonzero static_threshold
(since UV predictor won't be skipped with this change).

Small speed gain, ~1-2%, at speed 8.

Change-Id: Iba8d22a307768b391e29d63c9826aac5a4d9c285
2017-11-30 12:41:58 -08:00
James Zern 5044779e77 decouple spatial-svc from encoder abi
this is only meant for testing. along with --enable-experimental
--enable-spatial-svc require VPX_TEST_SPATIAL_SVC to be defined rather
than bumping the encoder ABI.

Change-Id: I7f34d9f60300fa31ccf22e1a4aa619392c391b2e
2017-11-30 10:52:25 -08:00
Marco b409863c48 Fix to copy partition.
Update the prev_partition on early exits in
choose_partitioning().

Change-Id: I382ffcab8e647c00b14283d15c3dd11bb0ac6f50
2017-11-30 10:27:34 -08:00
Shiyou Yin 392e0188f6 Merge changes Icd9c866b,I81717e47
* changes:
  vp8: [loongson] optimize regular quantize v2.
  vp8: [loongson] optimize vp8_short_fdct4x4_mmi v2.
2017-11-30 00:53:50 +00:00
Shiyou Yin 8d70aef05f Merge "vpx: [loongson] fix bug in var_filter_block2d_bil_16x" 2017-11-30 00:53:37 +00:00
Jingning Han 9116e3d957 Merge "Add PSNR Cb and Cr metric to opsnr.stt" 2017-11-29 22:56:47 +00:00
Marco Paniconi 3437fe484a Merge "vp9-svc: Don't allow encode_breakout on golden ref." 2017-11-29 22:41:31 +00:00
Marco 49f51af4c9 vp9-svc: Don't allow encode_breakout on golden ref.
For 1 pass cbr SVC: GOLDEN is the spatial reference,
better not to check for encode_breakout on this reference.

Small positive ~0.075% (mostly neutral) gain in avgPSNR/SSIM metrics.
No observed change in encoder speed.

Change-Id: Ib337f16d6771105bf06384c6a23ad047fc690418
2017-11-29 13:58:43 -08:00
Marco 0e94522338 vp9-svc: Clean condition for allowing copy_partition.
Make condition explicit on non_reference_frame.

No change in behavior.

Change-Id: Iec5068bccd93c7c7be67634c5c090580b2dbb20d
2017-11-29 13:19:09 -08:00
Kyle Siefring 3ae909b0f9 Merge "Remove unnecessary includes of emmintrin_compat.h" 2017-11-29 19:14:45 +00:00
Kyle Siefring a60da3a2eb Remove unnecessary includes of emmintrin_compat.h
Change-Id: Ie60381a0c6ee01f828cd364a43f01517f4cb03e9
2017-11-29 11:48:24 -05:00
Shiyou Yin d49bf26b1c vp8: [loongson] optimize regular quantize v2.
1. Optimize the memset with mmi.
2. Optimize macro REGULAR_SELECT_EOB.

Change-Id: Icd9c866b0e6aef08874b2f123e9b0e09919445ff
2017-11-29 17:06:00 +08:00
Kaustubh Raste 339f4dcaee mips msa optimize vpx_scaled_2d function
Change-Id: I638507b360c71489ab0e87bd558d2719ad995333
2017-11-29 13:27:04 +05:30
Shiyou Yin 9966cc8d12 vp8: [loongson] optimize vp8_short_fdct4x4_mmi v2.
Optimize the calculate process of a,b,c,d.

Change-Id: I81717e47bc988ace1412d478513e7dd3cb6b0cc9
2017-11-29 12:58:37 +08:00
James Zern c5f5f4ed17 vpx{enc,dec}: add --help
only output short usage to stderr on error, with --help use stdout

Change-Id: I7089f3bca829817e14b14c766f4f3eaee6f54e5c
2017-11-28 20:49:54 -08:00
Jingning Han 9bd3f1e30d Add PSNR Cb and Cr metric to opsnr.stt
Change-Id: I24e1741c00f9514647c7db2758a7ababd4e96932
2017-11-28 20:03:59 -08:00
Shiyou Yin a0ca2a4079 vpx: [loongson] fix bug in var_filter_block2d_bil_16x
This bug caused the following test failures:
1. MMI/VpxSubpelVarianceTest.Ref/6
2. MMI/VpxSubpelVarianceTest.Ref/7
3. MMI/VpxSubpelVarianceTest.ExtremeRef/6
4. MMI/VpxSubpelVarianceTest.ExtremeRef/7

Change-Id: I122ca20089e14ac324edd61295cf8f506e06afc8
2017-11-29 10:26:43 +08:00
Marco f0b4868625 vp9-svc: Fix condition for setting downsampling filter.
Use (width * height) for setting downsampling filter type.

Change-Id: If4acfde7ff9339e0584155f8a4d15b2f134211f2
2017-11-28 16:28:29 -08:00
Johann bd990cad72 quantize x86: dedup some parts
Change-Id: I9f95f47bc7ecbb7980f21cbc3a91f699624141af
2017-11-27 13:09:21 -08:00
Marco cbe62b9c2d vp9-svc: Fix to the layer buffer settings.
For the case when the number of temporal layers > 1,
the buffer levels (starting/optimal_buffer_level,
and maximum_buffer_size) were not scaled properly.

In vp9_update_layer_context_change_config():
when setting the layer-buffer levels, fix is to scale
the layer-target_bandwidth by the target_bandwidth
(which is the full stream bandwidth) instead of the
spatial_layer_target.

This is needed because prior to the call
vp9_update_layer_context_change_config(), set_rc_buffer_sizes()
is called which sets the buffer levels based on target bandwidth
(which is the full bandwidth for the SVC stream).

This fix properly sets the layer-buffer levels based on the
layer-bandwidth, and leads to better rate targeting.

Small/neutral change in avgPSNR/SSIM metrics on RTC set.

Change-Id: Ic0f4f7f3487c37b9a9adb4781ae5edfed7140a57
2017-11-26 22:17:48 -08:00
Peter Collingbourne 9639641cd4 Merge "[CFI] Remove function pointer casts" 2017-11-21 18:42:40 +00:00
Jerome Jiang 50fc0d896b Merge "vp8 simulcast: fix compile warnings." 2017-11-21 01:22:46 +00:00
Vlad Tsyrklevich bc29863b96 [CFI] Remove function pointer casts
Control Flow Integrity [1] indirect call checking verifies that function
pointers only call valid functions with a matching type signature. This
change eliminates function pointer casts to make libvpx CFI-safe.

[1] https://www.chromium.org/developers/testing/control-flow-integrity

Change-Id: I7e08522d195a43c88cda06fa20414426c8c4372c
2017-11-20 16:36:29 -08:00
Jerome Jiang f49360d740 vp8 simulcast: fix compile warnings.
Clean up some prints.

Change-Id: I199350e34a8b6fbff9601fcbd11ec68d24da5073
2017-11-20 16:18:31 -08:00
Kyle Siefring dd4cc5b596 Merge "Optimize AVX2 get16x16var and get32x16var functions" 2017-11-20 22:37:57 +00:00
Jerome Jiang 0cc23242b0 Merge "vp9 svc: fix a few compile warnings." 2017-11-20 18:52:58 +00:00
Marco 559166acfe vp9-svc: Enable scale partition reference frames.
For reference frames: enable scale partition for
superblocks with low source sad or if bsize on lower-resoln
is at least 32x32.

Keep feature disabled for base temporal layer.

Small regression in avgPSNR/SSIM metrics, ~0.5-1%.
Speedup ~2-3% on mac for SVC (3 spatial/3 temporal layers) at speed 7.

Change-Id: I5987eb7763845b680059128b538bb5188be0cca5
2017-11-17 14:52:20 -08:00
Jerome Jiang 8b7a6ca60a vp9 svc: fix a few compile warnings.
Change-Id: I4cb878600038066513ab73f3658990d1245ff2fb
2017-11-17 14:40:05 -08:00
Kyle Siefring 07a0bf038f Optimize AVX2 get16x16var and get32x16var functions
Change-Id: If8b91aaa883c01107f0ea3468139fa24cfb301d2
2017-11-17 13:55:49 -05:00
Paul Wilkins 849b3c238d Merge "Disable allow_partition_search_skip for speed 2." 2017-11-17 10:34:56 +00:00
Paul Wilkins c66eeab30e Merge "Code cleanup." 2017-11-17 10:34:46 +00:00
Paul Wilkins 55eacca945 Merge "Remove decay_accumulator clause from alt ref breakout." 2017-11-17 10:34:37 +00:00
Paul Wilkins 4bd2a59e9b Merge "Add clause to alt ref group breakout." 2017-11-17 10:34:26 +00:00
Jerome Jiang ea14a1a965 Merge "vp9: Fix mem rel for non-ref for external buffer." 2017-11-17 00:31:16 +00:00
paulwilkins 44473e7eb9 Disable allow_partition_search_skip for speed 2.
When allow_partition_search_skip is set the two pass code
can optionally skip the partition search in the rd loop if the image
appears static (based on selection of 0,0 motion).

Unfortunately 0,0 motion does not necessarily mean that there are
no meaningful changes or that motion or intra modes will not be selected
in the second pass.

Disabling "allow_partition_search_skip" may hurt the encode speed a little
for a small number of clips but can have a big impact on compression.
The most notable example of this in our test sets is "bridge_close_cif"
where this change gives gains of 18%, 12% and 16% in opsnr, ssim and
psnr-hvs.

Change-Id: I765e288b5c0cd82bce00a148e7653a21e9203024
2017-11-16 16:17:57 +00:00
Jerome Jiang 1aea1675c0 vp9 svc: Rework/fix scale partitioning on boundary.
Enable partition copy on boundary and scale blocks along the boundary.
Rename copy_partition_svc to scale_partition_svc.

Do not copy if the block crosses the boundary.

Change-Id: I37a04d48f11b15c4ea67facd7631193ec2f62150
2017-11-15 20:34:58 -08:00
Johann 3e3a568616 fwd txfm ssse3: use GLOBAL() for loading constants
Fixes a build issue when relocation is not allowed:
relocation R_X86_64_32 against '.rodata' can not be used when making a shared object

Change-Id: Ica3e90c926847bc384e818d7854f0030f4d69aa0
2017-11-15 13:01:44 -08:00
paulwilkins 05302360c9 Code cleanup.
Removal of parameters to and code in calc_frame_boost() that is no
longer required.

No change to results from previous patch.

Change-Id: Ic92da35613fdc247d22fddf24d09679fc5329017
2017-11-15 17:07:28 +00:00
paulwilkins 03c1a827ac Remove decay_accumulator clause from alt ref breakout.
The decay accumulator clause covers similar ground to the
new clause that tests the accumulated second reference error
so it has been removed to reduce complexity.

Change-Id: I4ec1cce32d72bd4ee463ad7def2831a68447d525
2017-11-15 16:58:05 +00:00
paulwilkins 607e45f420 Add clause to alt ref group breakout.
Add a clause to the breakout test for alt ref groups that
examines the size of the accumulated second reference
frame error compared to the cost of intra coding.

This clause causes a reduction in the average group length for many
clips. Alongside the change to the group length the minimum
boost is increased.

On balance the results are positive for psnr and psnr-hvs
but are negative for ssim/fast ssim for the smaller image formats.

Strong gains on some harder clips (e.g. ducks take off (midres) ~20%,
husky (lowres) 6-17%). Most of the negative cases are lower motion
clips. A subsequent patch will hopefully help with those.

Change-Id: Ic1f5dbb9153d5089e58b1540470e799f91a65dc4
2017-11-15 16:40:12 +00:00
Marco b3c93d60c2 vp9-svc: Fix flag for usage of reuse-lowres partition
Fix/cleanup the conditioning for usage of the reuse-lowres
partition feature.

Replace the non-reference condition with the top temporal
layer, and put this condition in the speed feature.

This prevents doing update_partition_svc() on every
VGA frame, instead it will now only do update for VGA in
the top temporal layer frames.

Also this makes it easier to test/enable this feature
for lower layer temporal frames.

Change-Id: Ia897afbc6fe5c84c5693e310bcaa6a87ce017be5
2017-11-14 20:08:10 -08:00
Scott LaVarnway 8d471fcee2 tiny_ssim.c : clang compile error fix
Change-Id: Ic10ba580fd5da7d6ff7fa0f33db72fb0c1a97801
2017-11-14 04:38:00 -08:00
James Bankoski 7839fb98a8 Merge "add 10 and 12 bit to tiny_ssim" 2017-11-14 00:15:24 +00:00
Jerome Jiang 9df11a7c52 Merge "vp9 svc: Change conditions on VPX_ENCODER_ABI_VERSION." 2017-11-13 21:04:41 +00:00
Jerome Jiang 0d2555bd2e vp9 svc: Change conditions on VPX_ENCODER_ABI_VERSION.
VPX_ENCODER_ABI_VERSION was bumped up in 93e83f.

Change-Id: Id5707f9f9db56fa96549bc8f54e1cfa04e7fa4cd
2017-11-13 11:05:20 -08:00
Jim Bankoski becab42eee add 10 and 12 bit to tiny_ssim
Change-Id: I92e4dba2d1682a0d77ad9a214ec4312b1cf4d42e
2017-11-13 10:56:42 -08:00
paulwilkins a73cee2870 New content type to improve grain retention.
For the new VP9-only content type, adjust the rate distortion and ARF
filter based on the relative spatial variance of the source and
reconstruction.

In regards to the RD loop the method favors modes where the
reconstruction variance is similar to the source variance. However it
is currently only applied to regions where the source variance is quite
low.

For very low variance blocks it applies a further bias against intra
coding and large prediction block sizes (the latter in particular limit
the usefulness of the loop filter).

The final part of this change is to lower the strength of the ARF
filter for blocks where the source has very low spatial variance, to
encourage some low amplitude texture or noise to pass through
the filter.

This change improves the retention of film grain and fine noise /
texture in spatially flat regions, but as expected causes a significant
drop in PSNR on many clips. This is to be expected because similar
but misaligned noise or texture will give a lower PSNR than a flat
noise free reconstruction. However, it is worth noting that most clips
show a strong gain in FAST SSIM.

The features are enabled on the vpxenc command line by setting
--tune-content=film.

VPX_ENCODER_ABI_VERSION bumped for this change and cvbr.

Change-Id: I26a4e4edfa3dc5cacead82fa701fe7a9118ccd0a
2017-11-13 16:57:23 +00:00
paulwilkins 55fc4d95af Small parameter clean up.
Removed three parameters that are no longer needed in calls
to calc_arf_boost() and associated minor changes.

No impact on encode results.

Change-Id: Ieaf31d0d2e1990b99cf69647170145a1bbfbb9fb
2017-11-13 16:53:57 +00:00
Paul Wilkins 2eddfb46a9 Merge "Fix to frames considered in arf boost calculation." 2017-11-13 16:36:43 +00:00
Paul Wilkins f5817fa612 Merge "CVBR command line option." 2017-11-13 16:32:39 +00:00
Scott LaVarnway 8e6022844f vpx: [x86] add vpx_satd_avx2()
SSE2 intrinsics vs AVX2 intrinsics speed gains:
blocksize   16: ~1.33
blocksize   64: ~1.51
blocksize  256: ~3.03
blocksize 1024: ~3.71

Change-Id: I79b28cba82d21f9dd765e79881aa16d24fd0cb58
2017-11-10 12:24:12 -08:00
Scott LaVarnway 8c7213bc00 Merge "vpx: [x86] add vp9_block_error_fp_avx2()" 2017-11-10 00:45:47 +00:00
Marco Paniconi 1ff68ec035 Merge "vp9-svc: Avoid minmax variance for non-reference frames." 2017-11-10 00:30:04 +00:00
Marco 6c0011a255 vp9-svc: Avoid minmax variance for non-reference frames.
For choose_partitioning (speed >= 6): avoid computation
of minmax variance for non-reference frames in SVC.

Existing condition only avoided this for speed >= 8.
Combine that existing logic with non-reference condition.

Small speedup (~0.5-1%) for 3 layer SVC,
neutral change on avgPSNR/SSIM metrics.

Change-Id: I3e9f3a1af0647b15e475cf170d9402908d672ee5
2017-11-09 16:27:27 -08:00
James Zern 10cb17aec0 Merge "runtime error fix: bitdepth_conversion_avx2.h" 2017-11-10 00:15:03 +00:00
Jerome Jiang 6246d8aa76 vp9: Fix mem rel for non-ref for external buffer.
Release frame buffers for non-ref when the decoder is destroyed.

Enable the non ref test.

BUG=b/68819248

Change-Id: Id87ef3b0a62318f9812e927cd957c05c859047fa
2017-11-09 15:47:21 -08:00
Jerome Jiang 0665b09661 Merge "vp9: SVC feature to use partition from lower resolution." 2017-11-09 23:28:44 +00:00
Jerome Jiang fdb054a05d vp9: SVC feature to use partition from lower resolution.
For SVC with 3 spatial layers:
Add feature to copy/upscale partition from middle spatial layer
to the upper/highest resolution, when superblock sad is not high.

Enabled for speed >= 7 and only for non-reference frames.

Speedup ~3-4%, small loss in avgPSNR/SSIM of ~1%.

Change-Id: I7f0a2716c0fde28bade0f86159d11b7e31d6ab8d
2017-11-09 14:16:50 -08:00
Scott LaVarnway 2387024f41 runtime error fix: bitdepth_conversion_avx2.h
Change-Id: I7364a157de39eb7137b599808474b8d46d19d376
2017-11-09 12:26:43 -08:00
Johann Koenig bdb8b3ad86 Merge "fail early on oversize frames" 2017-11-09 19:50:04 +00:00
Scott LaVarnway 62ab5e99c1 vpx: [x86] add vp9_block_error_fp_avx2()
SSE2 asm vs AVX2 intrinsics speed gains:
blocksize   16: ~1.00
blocksize   64: ~1.17
blocksize  256: ~1.67
blocksize 1024: ~1.81

Change-Id: I2a86db239cf57e3ff617890ccb2d236aba83ad5e
2017-11-09 05:02:31 -08:00
paulwilkins d6e29868ac Fix to frames considered in arf boost calculation.
For a chosen interval "i" the existing arf boost calculation examined frames
+/- (i-1) frames from the current location in the second pass.

This change checks to make sure that the forward search does not extend
beyond the next key frame in the event that the distance to the next key
frame is < (i - 1).

Small metrics gains on all our test sets but these are localized to a few clips
(e.g. midres set psnr-hvs sintel -2.59% but overall average was only -0.185%)

Change-Id: I26fc9ce582b6d58fa1113a238395e12ad3123cf6
2017-11-09 10:46:10 +00:00
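The clamp described in the commit above can be sketched as a simple minimum. This is a hedged illustration, not the libvpx implementation; the name `forward_search_limit` and its parameters are hypothetical.

```c
/* Hypothetical helper, not taken from libvpx: number of frames the forward
 * search may examine for a chosen interval when the next key frame is
 * frames_to_key frames away. The search normally covers (interval - 1)
 * frames but must not run past the next key frame. */
static int forward_search_limit(int interval, int frames_to_key) {
  const int wanted = interval - 1;
  return wanted < frames_to_key ? wanted : frames_to_key;
}
```

With an interval of 16 and a key frame 5 frames ahead, the forward search would stop after 5 frames instead of 15.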
Jerome Jiang adbb4c4d32 Merge "vp9: Add nonref frame buffer test." 2017-11-09 04:41:10 +00:00
Jerome Jiang a68bbcff29 vp9: Add nonref frame buffer test.
The new test will run a SVC bitstream which has non ref frames.
It checks the number of buffers acquired and released to make sure all
external frame buffers are released.

Add a new test bitstream:
vp90-2-22-svc_1280x720_1.webm
which has 400 frames in total, and 1 spatial layer and 2 temporal layers.
There is one non ref frame every other frame.

Disabled for now. Will be enabled with the fix.

BUG=b/68819248

Change-Id: I0515336fd9809a9e1fceba90e4dce53dabaf53a5
2017-11-08 18:41:33 -08:00
Johann Koenig cf8039c25f Merge "Support building AVX-512 and implement sadx4 for AVX-512" 2017-11-08 16:28:40 +00:00
paulwilkins 93e83fd7cf CVBR command line option.
Added command line control of Corpus VBR.

The new corpus vbr mode is a variant of standard
VBR (end-usage=0) where the complexity distribution
mid point is passed in rather than calculated for a specific
clip or chunk.

The new variant is enabled by setting a new command line
parameter --corpus-complexity to a non zero value. Omitting
this parameter or setting it to 0 will cause the codec to use
standard vbr mode.

The correct value for a given corpus needs to be derived
experimentally using a training set such that the average
rate for the corpus is close to the target value.

For example, using our low res test set with upper and lower
vbr limits of 50%-150% and a corpus complexity value of 650
gives a similar average data rate across the set to using standard
vbr. However, with the corpus mode easier clips will be allocated
fewer bits and harder clips more bits rather than having the same
rate target for all.

Change-Id: I03f0fc8c6fb0ee32dc03720fea6a3f1949118589
2017-11-08 10:41:04 +00:00
Marco 6fbc354c97 Nonrd_pickmode: avoid computing UV cost when early_term is set.
For nonrd_pickmode: if early_term is set there should be
no need to include UV in rdcost (when color_sensitivity is set).

Neutral change on RTC and RTC_derf metrics, for speed >= 5.
No change for ytlive metrics.

Very small speed gain (~0.5%) on some clips with strong color content.

Change-Id: Ifc00928ecd935fc71e94935ceef0ae7481249f07
2017-11-06 10:22:14 -08:00
Kyle Siefring b383a17fa4 Support building AVX-512 and implement sadx4 for AVX-512
The added AVX-512 support requires the subset of AVX-512 added in Skylake-X.

Change-Id: I39666b00d10bf96d06c709823663eb09b89265b7
2017-11-03 13:37:23 -04:00
Marco eb7d431cb5 Compound prediction mode for nonrd pickmode.
Allow for compound prediction mode in nonrd_pickmode for ZEROMV.
For real-time encoding, 1 pass with non-zero lag-in-frames.

Added speed feature to control the feature.
Enabled for speed >=6 for now, under VBR mode.

avgPSNR/SSIM metrics positive on ytlive set, for speed 6:
some clips up by ~3-5%, some clips neutral gain, average gain
across clips is ~1%.

Small/negligible decrease in speed.

Change-Id: I7a60c7596e69b9a928410c5ee2f9141eecd8613d
2017-11-03 10:13:05 -07:00
Johann 5fe82459ec fail early on oversize frames
Even though frame_size is calculated in uint64_t, it winds up in an int
size value.

This was exposed with the msan test because the memset is called with
(int)frame_size, leading to a segfault.

Change-Id: I7fd930360dca274adb8f3e43e5e6785204808861
2017-11-03 09:49:13 -07:00
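The failure mode in the commit above (a size computed in 64 bits that later lands in an int) can be illustrated with a minimal sketch. The helper name `checked_frame_size` and the choice of `INT_MAX` as the limit are assumptions for illustration; the actual check in libvpx may differ.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper (not a libvpx function): compute a frame size in
 * 64 bits and refuse sizes that would not survive a cast to int.
 * Returns 0 and stores the size in *out on success, -1 otherwise. */
static int checked_frame_size(uint32_t stride, uint32_t height, int *out) {
  const uint64_t frame_size = (uint64_t)stride * height;
  if (frame_size > (uint64_t)INT_MAX) return -1; /* fail early */
  *out = (int)frame_size;
  return 0;
}
```

Rejecting the size before the narrowing cast means a later memset never sees a truncated or negative length.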
Jerome Jiang 3ba9a2c8b2 Merge "vp9: Move allocation of vt2 after early exits." 2017-11-01 16:58:01 +00:00
Jerome Jiang 34805d6d0d vp9: Move allocation of vt2 after early exits.
Remove the memory deallocation on the early exits.

Change-Id: I00b4a814ae6705105ecab89644d055ca3311d9f4
2017-10-31 17:04:04 -07:00
Jerome Jiang 0c84b9b703 Merge "vp9: Reduce stack usage of choose_partitioning." 2017-10-31 21:42:18 +00:00
Jerome Jiang 18b470f486 vp9: Reduce stack usage of choose_partitioning.
Move vt2 to heap.
Reduce the stack usage from ~87K to ~44K.

BUG=b/68362457

Change-Id: I8f5f93712934d59a8cc4564378172d409a736a2e
2017-10-31 13:10:27 -07:00
Jerome Jiang c77822615e Merge "vp9: Reduce stack usage of choose_partioning." 2017-10-30 23:39:41 +00:00
Jerome Jiang cc47231187 vp9: Reduce stack usage of choose_partioning.
Change type of sum_square_error from int64_t to uint32_t.
Change type of sum_error from int64_t to int32_t.

This reduces the stack usage from ~131K to ~87K.

BUG=b/68362457

Change-Id: I147d7c7b226bceb4f0817bb86848e1fa9d9ac149
2017-10-30 13:53:20 -07:00
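Why narrowing the two accumulators shrinks the stack can be seen from the struct sizes alone. The layouts below are hypothetical and only borrow the field names from the commit message; the real libvpx structs may contain other members.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accumulator layout before the change. */
typedef struct {
  int64_t sum_square_error;
  int64_t sum_error;
  int log2_count;
  int variance;
} wide_var;

/* Hypothetical accumulator layout after the change. */
typedef struct {
  uint32_t sum_square_error;
  int32_t sum_error;
  int log2_count;
  int variance;
} narrow_var;

/* Bytes saved for a local array of n accumulators. */
static size_t stack_bytes_saved(size_t n) {
  return n * (sizeof(wide_var) - sizeof(narrow_var));
}
```

Because the partitioning code keeps many such accumulators in local variance trees, a few bytes per element add up to tens of kilobytes.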
James Zern acb9460929 vp8: correct if/else '{' placement
swap '{' and c-style comments removing a few redundant ones along the
way; covers most leftovers from the clang-tidy run against an
x86_64-linux config.

Change-Id: I67a45596f80a12389faca49c5be440875092a7df
2017-10-27 12:27:10 -07:00
Scott LaVarnway 3bf02ad74a vpx: hadamard: use ptrdiff_t instead of int for stride
Eliminates the following instruction for the x86 (64 bit)
intrinsic code:

movslq %esi,%rax

Change-Id: I8f5ebd40726f998708a668b0f52ea7a0576befae
2017-10-26 11:41:48 -07:00
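The effect described in the commit above comes from the stride's type. As a hedged sketch (the function `sum_column` is illustrative and not part of libvpx): with a `ptrdiff_t` stride the index arithmetic is already pointer-width, so a 64-bit compiler has no 32-bit int to sign-extend before addressing.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: sum one column of a 2-D block of 16-bit samples.
 * stride is the distance between rows, in elements. */
static int sum_column(const int16_t *src, ptrdiff_t stride, int rows) {
  int sum = 0;
  for (int r = 0; r < rows; ++r) {
    sum += src[r * stride]; /* r is promoted to ptrdiff_t here */
  }
  return sum;
}
```

Had stride been declared as int, x86-64 code would typically widen it once on entry, which is the movslq the commit removes.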
Kyle Siefring 037e596f04 Merge "Optimize convolve8 SSSE3 and AVX2 intrinsics" 2017-10-24 19:22:36 +00:00
Kyle Siefring ae35425ae6 Optimize convolve8 SSSE3 and AVX2 intrinsics
Changed the intrinsics to perform summation similar to the way the assembly does.

The new code diverges from the assembly by preferring unsaturated additions.

Results for haswell

SSSE3
Horiz/Vert  Size  Speedup
Horiz       x4    ~32%
Horiz       x8    ~6%
Vert        x8    ~4%

AVX2
Horiz/Vert  Size  Speedup
Horiz       x16   ~16%
Vert        x16   ~14%

BUG=webm:1471

Change-Id: I7ad98ea688c904b1ba324adf8eb977873c8b8668
2017-10-24 10:39:48 -04:00
Scott LaVarnway e0aa6b24aa Merge "vpx: [x86] vpx_hadamard_16x16_avx2() highbitdepth fix" 2017-10-23 22:02:59 +00:00
Marco 0738d90169 vp9-svc: Allow for adapt_rd_thresh with row-mt.
Set adaptive_row_thresh_mt = 1 at speed >= 7,
for svc when multi-threading is used with row-mt.
This allows the adaptive_rd_thresh feature to be used
in the nonrd-pickmode.

~1-2% speedup for SVC encoding with small quality
loss (< 0.6%) on RTC set.

Change-Id: Iab9878dff117bccdaef3e4d0645165db9808cdfc
2017-10-23 11:47:18 -07:00
Scott LaVarnway 512bf4e029 vpx: [x86] vpx_hadamard_16x16_avx2() highbitdepth fix
Use an intermediate buffer before storing to coeffs when
highbitdepth is enabled.

Change-Id: I101981a1995f1108ad107c55c37d6e09eadb404b
2017-10-23 08:49:32 -07:00
Scott LaVarnway 4906cea027 vpx: [x86] vpx_hadamard_16x16_avx2() improvements
~10% performance gain.  Fixed the cosmetics noted in the
previous commit.

Change-Id: Iddf475f34d0d0a3e356b2143682aeabac459ed13
2017-10-20 08:55:06 -07:00
Scott LaVarnway b58259ab55 Merge "vpx: [x86] add vpx_hadamard_16x16_avx2()" 2017-10-19 23:32:10 +00:00
Paul Wilkins 199971d606 Merge "Corpus VBR tweak for undershoot." 2017-10-19 10:07:45 +00:00
Paul Wilkins 0c493cbe2b Merge "Increase precision of some debug stats output for corpus VBR." 2017-10-19 10:07:30 +00:00
Paul Wilkins d8c34a2552 Merge "Prevent double application of min rate in two pass." 2017-10-19 10:06:33 +00:00
Scott LaVarnway 55c126a5d7 vpx: [x86] add vpx_hadamard_16x16_avx2()
This version is ~1.91x faster than the sse2 version.  When
highbitdepth is enabled, it is ~1.74x.

Change-Id: I2b0e92ede9f55c6259ca07bf1f8c8a5d0d0955bd
2017-10-18 18:00:00 -07:00
Jerome Jiang 401e6d48bf Merge "Add datarate test for vp8 ROI." 2017-10-18 19:39:26 +00:00
Jerome Jiang bd6d82e881 Add datarate test for vp8 ROI.
BUG=webm:1470

Change-Id: Icbc848837e64eacc49491dcc26b4c5802af2ee13
2017-10-18 11:19:59 -07:00
Jerome Jiang ec2fced451 Merge "vp8: Enable use of ROI map." 2017-10-18 18:16:44 +00:00
Kyle Siefring b3a36f7946 Merge "Refactor x86/vpx_subpixel_8t_intrin_avx2.c" 2017-10-18 16:19:52 +00:00
Shiyou Yin df15220a89 Merge "vp8: [loongson] optimize idct with mmi" 2017-10-18 00:55:36 +00:00
Jerome Jiang dbb8926b86 vp8: Enable use of ROI map.
Disable cyclic refresh if ROI is used and add flag to properly handle
the static_thresh deltas.
Remove the ROI test for cyclic refresh (it's allowed but disabled if ROI
is used).
Add an example in vpx_temporal_svc_encoder.c. Turned off by default.

BUG=webm:1470

Change-Id: Ief9ba1d7f967bc00511b412b491c3f70943bfbda
2017-10-17 15:23:03 -07:00
Linfeng Zhang 9336e01621 Merge changes I17fff122,Ic149e3cb
* changes:
  Add 4 to 3 scaling SSSE3 optimization
  Test extreme inputs in frame scale functions
2017-10-17 16:03:29 +00:00
Linfeng Zhang 0d2e95193b Merge "Generalize CheckScalingFiltering in ConvolveTest" 2017-10-17 16:03:07 +00:00
Kyle Siefring 55805e2786 Refactor x86/vpx_subpixel_8t_intrin_avx2.c
Change-Id: I6539111dfb35a43028e9755785b2e9ea31854305
2017-10-17 11:57:40 -04:00
Shiyou Yin 577d4fa792 vp8: [loongson] optimize idct with mmi
1. vp8_dequant_idct_add_y_block_mmi
2. vp8_dequant_idct_add_uv_block_mmi

Change-Id: I9987147be2685ac79d4b045d1d56f6709ee1223c
2017-10-17 03:27:31 +00:00
Linfeng Zhang 580d32240f Add 4 to 3 scaling SSSE3 optimization
Note this change will trigger the different C version on SSSE3 and
generate different scaled output.

Its speed is 2x compared with the version calling vpx_scaled_2d_ssse3().

Change-Id: I17fff122cd0a5ac8aa451d84daa606582da8e194
2017-10-16 15:42:42 -07:00
Marco a9248457b1 Adjust threshold in gf_boost for 1 pass vbr
Small increase in sad_thresh1 avoids some false
detection of possible scene changes within lag.

Small improvement in few clips on ytlive, otherwise neutral change.

Change-Id: Ia79b53bb657bbce65a7aac7d20666b6373d5af8b
2017-10-13 15:33:51 -07:00
Paul Wilkins 12df840777 Merge "Further Corpus VBR change." 2017-10-13 15:59:58 +00:00
Paul Wilkins eaa593d293 Merge "Corpus Wide VBR test implementation." 2017-10-13 15:59:45 +00:00
paulwilkins 8842ee0b0d Corpus VBR tweak for undershoot.
In cases of strong undershoot adjust Q range down faster.

Change-Id: I84982beceb3c9b6dc50e52e4a6e891c7dd395d03
2017-10-13 10:27:15 +01:00
Shiyou Yin 3e2770de4f Merge "vp8: [loongson] optimize dct with mmi" 2017-10-13 00:37:57 +00:00
Marco Paniconi 28d1c0535d Merge "Adjust to scene detection for 1 pass vbr." 2017-10-12 19:36:33 +00:00
Marco a673b4f4af Adjust to scene detection for 1 pass vbr.
Expose the threshold for setting key frame on cut,
and increase it for speed 5.
Also small adjustment to min_thresh.

No change in overall metrics or fps.
Small quality improvement and lower encode time on scene cuts.

Change-Id: I36e06ff3b26b6c29aede39c23fce454525fc9026
2017-10-12 10:59:23 -07:00
Jerome Jiang 175b36cb6d Merge "vp9: use nonrd pick_intra for small blocks on keyframes." 2017-10-12 17:29:27 +00:00
Kyle Siefring caa116c9be Merge changes I38783d97,If5160c0c
* changes:
  Extend 16 wide AVX2 convolve8 code to support averaging.
  Add AVX2 version of vpx_convolve8_avg.
2017-10-12 16:12:38 +00:00
paulwilkins 2b247ae91c Increase precision of some debug stats output for corpus VBR.
Change-Id: I75841797cc0c215781b5b36e3a3e9f4b0e35ba63
2017-10-12 10:07:21 +01:00
Jerome Jiang 288890cd43 vp9: use nonrd pick_intra for small blocks on keyframes.
Keyframe encoding is more than 2x faster.
Disabled on Speed 8.

Change-Id: I2157318b6ac8253fa5398322c72d98cd7fa9b2b6
2017-10-11 21:38:01 -07:00
Shiyou Yin f70de09f2a vp8: [loongson] optimize dct with mmi
1. vp8_short_fdct4x4_mmi
2. vp8_short_fdct8x4_mmi
3. vp8_short_walsh4x4_mmi

Change-Id: I89a7df25cfd09fae309fac257ad8b6a3dc1c8acb
2017-10-12 08:50:04 +08:00
Shiyou Yin bc4098a8e9 Merge "vp8: [loongson] optimize quantize with mmi" 2017-10-12 00:33:17 +00:00
Marco 72c69e14ad Adjust threshold in datarate tests for 1 pass VBR
Small increase in threshold for the 1 pass VBR datarate tests.
Needed due to commit:
<017257a Adjustment to scene detection and key frame>

Change-Id: I28b3bd7db2192a8cc2bccc3cb0e3b8dbb910ca16
2017-10-11 11:48:36 -07:00
Linfeng Zhang 1fa3ec3023 Test extreme inputs in frame scale functions
Change-Id: Ic149e3cb59be2ee0f98a3fcfd83226ad5ea30c99
2017-10-11 11:35:19 -07:00
paulwilkins 416b7051d7 Prevent double application of min rate in two pass.
The initial allocation of bits in the two pass code to each frame
should be within the min max limits on the command line. However,
when forming an ARF group the cost of the ARF is shared by frames
in that group such that the residual bits for a frame could drop below
the min value. This change prevents the minimum being re-applied
after the cost of the ARF has been deducted as this may otherwise
cause low rate sections to overshoot their target.

Test runs comparing to a baseline run with min and max section pct
0-2000% vs one closer to the YT use case (50-150%) suggest that
this fix not only results in better rate control but also gives a better
rd outcome.

For example the HD set vs 0-2000% baseline (opsnr, ssim).
Old code (50-150):  +0.751, +1.099
New code (50-150): +0.241, -0.009

Change-Id: I715da7b130bf53ba8aa609532aa9e18b84f5e2ef
2017-10-11 18:00:44 +01:00
Shiyou Yin e8ed2bb762 vp8: [loongson] optimize quantize with mmi
1. vp8_fast_quantize_b_mmi
2. vp8_regular_quantize_b_mmi

Change-Id: Ic6e21593075f92c1004acd67184602d2aa5d5646
2017-10-11 16:45:58 +08:00
Linfeng Zhang 16166bfdaa Add 4 to 1 scaling x86 optimization
Change-Id: I51c190f0a88685867df36912522e67bdae58a673
2017-10-10 16:24:06 -07:00
Jerome Jiang dcfae2cc64 Merge "Fix alignment in vpx_image without external allocation." 2017-10-10 23:02:05 +00:00
Jerome Jiang 33c598990b Fix alignment in vpx_image without external allocation.
This restores behaviors prior to
<40c8fde Fix image width alignment. Enable ImageSizeSetting test.>.

BUG=b/64710201

Change-Id: I559557afe80d5ff5ea6ac24021561715068e7786
2017-10-10 14:26:17 -07:00
Linfeng Zhang 54f7d68c5c Generalize CheckScalingFiltering in ConvolveTest
Let it test extreme inputs and all filter types.
In the future ConvolveTest should test regular 8-bit functions in
high bitdepth mode.

Change-Id: I1042564d1d390589ca203070fe332c6da3315d75
2017-10-10 14:12:43 -07:00
Marco 017257a317 Adjustment to scene detection and key frame.
For 1 pass vbr: use higher threshold on avg_sad
and force key frame under scene cut detection if
above the threshold. Allow it for speed >= 6 for now,
since it does not use the full nonrd_pickmode partition
(as in speed 5).

Improves quality somewhat on scene cut frames.
Neutral on overall metrics and fps for speed 6 on
ytlive set.

Change-Id: I12626f7627419ca14f9d0d249df86c7104438162
2017-10-10 11:20:05 -07:00
Linfeng Zhang 963cc22cef Merge changes I9d4c1af5,I882da3a0
* changes:
  Rename some inline functions in NEON scaling
  Generalize 2:1 vp9_scale_and_extend_frame_ssse3()
2017-10-10 17:29:50 +00:00
paulwilkins 06d231c9fa Further Corpus VBR change.
Change to the bit allocation within a GF/ARF group.

Normal VBR and CQ mode allocate bits to a GF/ARF group based on the mean
complexity score of the frames in that group but then share bits evenly between
the "normal" frames in that group regardless of the individual frame complexity
scores (with the exception of the middle and last frames).

This patch alters the behavior for the experimental "Corpus VBR" mode such that
the allocation is always based on the individual complexity scores.

Change-Id: I5045a143eadeb452302886cc5ccffd0906b75708
2017-10-10 10:41:35 +01:00
paulwilkins 741bd6df4f Corpus Wide VBR test implementation.
This patch makes further changes to support an experimental
corpus wide VBR mode that uses a corpus complexity
number as the midpoint of the distribution used to allocate bits
within a clip, rather than some average error score derived from the
clip itself.

At the moment the midpoint number is hard wired for testing and
the mode is enabled or disabled through a #ifdef.  Ultimately this
would need to be controlled by command line parameters.

Change-Id: I9383b76ac9fc646eb35a5d2c5b7d8bc645bfa873
2017-10-10 10:40:44 +01:00
Kyle Siefring 1b2f92ee8e Extend 16 wide AVX2 convolve8 code to support averaging.
Also adds vpx_convolve8_avg_horiz_avx2.

Change-Id: I38783d972ac26bec77610e9e15a0a058ed498cbf
2017-10-09 19:10:03 -04:00
Linfeng Zhang 27d21a3d13 Rename some inline functions in NEON scaling
Change-Id: I9d4c1af53d57f72fc716bacbe3b0965719c045ac
2017-10-09 11:23:00 -07:00
Linfeng Zhang e1ae3772da Merge "Update vp9_scale_and_extend_frame_ssse3()" 2017-10-09 16:20:00 +00:00
Kyle Siefring 9ca06bcdd2 Add AVX2 version of vpx_convolve8_avg.
vpx_convolve8_avg works by first running a normal horizontal filter, then a
vertical filter that averages at the end.

The added vpx_convolve8_avg_avx2 calls pre-existing AVX2 code for the
horizontal step.

vpx_convolve8_avg_vert_avx2 is also added, but only uses ssse3 code.

Change-Id: If5160c0c8e778e10de61ee9bf42ee4be5975c983
2017-10-07 23:37:48 -04:00
James Zern 807248ec81 Merge "ppc: Add vpx_idct32x32_1024_add_vsx" 2017-10-07 19:08:26 +00:00
Marco Paniconi 5bc4c37a89 Merge "Revert "Speed >=5 real-time: add TM intra mode for high_source_sad."" 2017-10-06 22:41:34 +00:00
Marco Paniconi bcbc6ed82d Revert "Speed >=5 real-time: add TM intra mode for high_source_sad."
This reverts commit 9311ef18b4.

Reason for revert:
Notice small regression in some clips.
Will revisit in another change.

Original change's description:
> Speed >=5 real-time: add TM intra mode for high_source_sad.
> 
> Small/neutral change in metrics or speed for ytlive.
> Some improvement in quality on frames with big content change.
> 
> Change-Id: Ib3b0703a5f28ea6710e90324436e27598ab7384d

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com

Change-Id: I9d8ec5195bb05ddf329d325699355185affb9b13
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2017-10-06 22:14:56 +00:00
Marco e405eb06b1 Adjust threshold in scene detection
For 1 pass vbr: increase min_thresh slightly, and also add
condition on golden/arf update for using full nonrd_pick_partition.

Reduces possible false detection for scene cut detection.

Neutral/small change in metrics or speed for speed 5.

Change-Id: I388f4d9a56e3cc763e0148338c1bc0381e58ad76
2017-10-06 11:08:56 -07:00
Marco Paniconi 7af6c6c9ca Merge "Speed >=5 real-time: add TM intra mode for high_source_sad." 2017-10-06 06:29:46 +00:00
Marco 9311ef18b4 Speed >=5 real-time: add TM intra mode for high_source_sad.
Small/neutral change in metrics or speed for ytlive.
Some improvement in quality on frames with big content change.

Change-Id: Ib3b0703a5f28ea6710e90324436e27598ab7384d
2017-10-05 23:07:03 -07:00
James Zern d2fb834ebd Merge "vpx_codec.h: namespace local defines" 2017-10-06 05:30:16 +00:00
James Zern e8ed030da3 vpx_codec.h: namespace local defines
add VPX_ to UNUSED/*DEPRECATED to avoid conflicts with other headers.

Change-Id: Ie16bdac3575bc1af57a05d37e65b994370585377
2017-10-05 15:12:20 -07:00
James Zern 107eb6a9d4 vp9_ethread_test: abort early/add more detailed output
in the case compare_fp_stats fails report the 2 values and their index

Change-Id: I927a832b7a1e24c392961093b7caee1134223def
2017-10-05 15:02:51 -07:00
Marco Paniconi e095bcce44 Merge "Adjust threshold for adapt_partition for speed 6." 2017-10-05 03:28:06 +00:00
Marco 18262a8576 Adjust threshold for adapt_partition for speed 6.
Lower SAD threshold to select non_rd pickmode partition
at superblock level more often.
Small gain in metrics, small/negligible decrease in speed.

Change-Id: I0f728236b91a604e4ca7e02039adc54d5985c4dc
2017-10-04 18:04:09 -07:00
Marco Paniconi 014976c251 Merge "Avoid nonrd_pick_partition for speed >= 6." 2017-10-04 23:36:27 +00:00
Marco 4bc1fc58b6 Avoid nonrd_pick_partition for speed >= 6.
For 1 pass vbr speed >= 6: when REFERENCE_PARTITION is selected,
avoid doing the full nonrd_pickmode based partition.
No change in overall metrics or speed.
Reduces encode times on scene cuts by 10-20%.

Change-Id: I0310b1610cc1c83793a509e0a9059840e8f18308
2017-10-04 15:31:54 -07:00
Marco Paniconi 6a42bdd25f Merge "Modify early exit for alt_ref in nonrd_pickmode." 2017-10-04 19:38:49 +00:00
Linfeng Zhang 127864deb3 Generalize 2:1 vp9_scale_and_extend_frame_ssse3()
Change-Id: I882da3a04884d5fabd4cd591c28682cbb2d76aa5
2017-10-04 12:35:39 -07:00
Linfeng Zhang b809442521 Update vp9_scale_and_extend_frame_ssse3()
Change-Id: I22622faebfcc36f7a4d1f37e3800ae8ab87c8cd4
2017-10-04 12:32:30 -07:00
Marco 77e51e2035 Modify early exit for alt_ref in nonrd_pickmode.
For 1 pass vbr mode:
On no-show_frame/ARF: instead of skipping alt_ref_frame
completely in mode testing, allow for checking (0, 0) on alt_ref.

Small gain in metrics, ~0.18%, no change in speed.

Change-Id: I32a3c24faca64ab70dd5091071a0dc301db7dd1e
2017-10-04 11:53:39 -07:00
Linfeng Zhang 9a71811d98 Merge changes Id6a8c549,Ib1e0650b,Ic369dd86
* changes:
  Refactor x86/vpx_subpixel_8t_intrin_ssse3.c
  Add vpx_dsp/x86/mem_sse2.h
  Add transpose_8bit_{4x4,8x8}() x86 optimization
2017-10-04 16:15:14 +00:00
Jerome Jiang ffa3a3c441 Merge "Fix image width alignment. Enable ImageSizeSetting test." 2017-10-04 14:48:03 +00:00
Marco 98dbf31c87 Enable arf usage for speed >= 6, 1 pass vbr.
For speed 6 on ytlive set:
On average, speed slowdown ~5%, quality gain ~2%.

Change-Id: Ia18237cc1d52c54d7e2cb3c71f571cf37ef61b44
2017-10-03 17:18:33 -07:00
Marco ab2bd340ac vp9: 1 pass vbr: Limit qpdelta on high_source_sad.
For 1 pass vbr: when significant content/scene change is detected
(high_source_sad = 1) reduce/turnoff the additional qdelta on the
active_worst_quality. This helps somewhat to reduce the occurrence
of large frame sizes and large encode times.
Allow it only when use_altref_onepass is enabled.

Neutral/no change on metrics.

Change-Id: I1dd97dd2ab892d65f707b841b27a5de300b714ea
2017-10-03 16:27:17 -07:00
James Zern 66b6b87471 Merge "vpx: fix nasm build errors" 2017-10-03 21:47:49 +00:00
Scott LaVarnway bc4bc9b622 vpx: fix nasm build errors
BUG=webm:1462,766721

Change-Id: Icfa536a8e38623636b96c396e3c94889bfde7a98
2017-10-03 20:02:21 +00:00
Linfeng Zhang 6543213e87 Refactor x86/vpx_subpixel_8t_intrin_ssse3.c
Change-Id: Id6a8c549709a3c516ed5d7b719b05117c5ef8bac
2017-10-03 13:02:05 -07:00
Linfeng Zhang 0f756a307d Add vpx_dsp/x86/mem_sse2.h
Add some load and store sse2 inline functions.

Change-Id: Ib1e0650b5a3d8e2b3736ab7c7642d6e384354222
2017-10-03 12:59:05 -07:00
Marco c8678fb7f3 Use adapt_partition for ARF in 1 pass.
For speed 6 real-time mode: use adapt_partition
on ARF frame instead of REFERENCE_PARTITION (which is slower).
This requires enabling compute_source_sad_onepass for no-show_frames.

Speedup of ~3-5% on some clips that heavily use ARF,
small loss (~0.2%) in quality on ytlive set.

Change-Id: Ib50acc97df06458244a6ac55d2bd882c30012536
2017-10-03 11:49:55 -07:00
Linfeng Zhang 67c38c92e7 Add transpose_8bit_{4x4,8x8}() x86 optimization
Change-Id: Ic369dd86b3b81686f68fbc13ad34ab8ea8846878
2017-10-03 10:00:30 -07:00
Marco Paniconi fe7b869104 Merge "ARF in 1 pass vbr: modify skip ref_frame in nonrd_pickmode." 2017-10-03 03:01:14 +00:00
Marco 33e10dfa7e ARF in 1 pass vbr: modify skip ref_frame in nonrd_pickmode.
Speedup of ~2-3% on 1080p clips speed 6.
Neutral/negligible loss in metrics on ytlive.

Change-Id: I7ac47a4d8b58c566920bae29a94a0e8d59c36dee
2017-10-02 19:04:03 -07:00
Linfeng Zhang 0e55b0b0a7 Add 4 to 3 scaling NEON optimization
Speed compared with the version calling vpx_scaled_2d_neon():
  ~1.7x in general
  ~2.8x for BILINEAR filter

BUG=webm:1419

Change-Id: I8f0a54c2013e61ea086033010f97c19ecf47c7c6
2017-10-02 15:04:09 -07:00
Linfeng Zhang 2c560c3c22 Specialize 4 to 3 frame scaling in C
Scale 3x3 block instead of 16x16 block in each loop. Disabled by
default.

Benefits:
1. Reduced number of different phase_scaler from 16 to 3.
   Optimization code will be smaller and faster.
2. Maximum phase_scaler drifting will be reduced from 5/16 to 1/24.
   (The drifting is 1/(3*16) in each step.)

BUG=webm:1419

Change-Id: I59a1f7496d89a1b090498c935d30cfcf1d0c282b
2017-10-02 11:56:15 -07:00
Scott LaVarnway 3c700052be Merge "vpxdsp: [x86] add highbd_d135_predictor functions" 2017-10-02 15:00:19 +00:00
Alexandra Hájková fb7fc1dbda ppc: Add vpx_idct32x32_1024_add_vsx
Change-Id: I55cd0a1569ccc47a53d0ecf751aac259d510e10d
2017-09-30 19:31:20 +00:00
Marco c8f6e7b99e Fix partition selection in speed features for arf overlay frame.
For real-time mode. Move the switch to fixed partition
for is_src_frame_alt_ref so all speeds may use it
if use_altref_onepass is set.

Improves metrics by ~2% for ytlive set at speed 4
(where use_altref_onepass is currently used).

Change-Id: I033240386598c9dbd0364da89ccbcca64bc663ee
2017-09-29 15:02:28 -07:00
Marco f2c3d0a7a3 Enable use_altref_onepass for speed 4 real-time mode.
Used for VBR mode with lag-in-frames > 0.
On ytlive set at speed 4: ~3% average gain.

Change-Id: I45dad1700bf8be9d8f177815dc062774f6f2f0de
2017-09-29 10:56:14 -07:00
Scott LaVarnway 3bbd62ed27 vpxdsp: [x86] add highbd_d135_predictor functions
C vs SSE2 speed gains:
_4x4 : ~1.81x

C vs SSSE3 speed gains:
_8x8 : ~1.96x
_16x16 : ~1.88x
_32x32 : ~2.02x

BUG=webm:1411

Change-Id: Iefaf8b39afbbfe34c1ad1d21e3a003b20f1f61e0
2017-09-29 08:56:38 -07:00
Scott LaVarnway 4cae64c32c vpxdsp: [x86] add highbd_d117_predictor functions
C vs SSE2 speed gains:
_4x4 : ~2.04x

C vs SSSE3 speed gains:
_8x8 : ~2.82x
_16x16 : ~5.93x
_32x32 : ~2.79x

BUG=webm:1411

Change-Id: I31d949695991c067dac89d91e0bed3e666c94993
2017-09-28 14:45:28 -07:00
Jerome Jiang 5a40c8fde1 Fix image width alignment. Enable ImageSizeSetting test.
BUG=b/64710201

Change-Id: I5465f6c6481d3c9a5e00fcab024cf4ae562b6b01
2017-09-28 11:25:24 -07:00
Marco a2ef180dd0 Set rc->high_source_sad = 0 before scene detection.
Only has effect when sf->use_altref_onepass is enabled,
as in that case scene detection is skipped for non-show frame
and so high_source_sad does not get reset to 0.

No change in metrics or speed.

Change-Id: I421f066d239341449c18826089e1810b9fc5967f
2017-09-28 10:49:45 -07:00
Marco Paniconi 3b8cc214ef Merge "vp9: Modification to adapt the ARF usage for 1 pass vbr" 2017-09-28 16:52:28 +00:00
Marco 03e8f13337 vp9: Modification to adapt the ARF usage for 1 pass vbr
Add stats for past ARF usage, and use it to disable
ARF usage based on some conditions.

Overall improvement on ytlive set, reduces the regression
on the problem clips for this feature.

Only affects when sf->use_altref_onepass is enabled
(currently off by default).

Change-Id: I66267f227ea132dc86acb730e9882f85bead2cdb
2017-09-28 09:10:30 -07:00
Marco c493ea1a6b Add use_svc condition to the scene detection in 1 pass.
Scene detection is not currently used in SVC 1 pass code.
Speedup of ~0.4%.

Change-Id: I0ab769300919de710cd2da1402014fa3f22a1f86
2017-09-27 14:51:46 -07:00
Marco Paniconi 786b124e20 Merge "Revert "Remove the speed condition on scene detection in 1 pass code."" 2017-09-27 20:42:48 +00:00
Scott LaVarnway 80992a746c Merge "vpxdsp: [x86] add highbd_d153_predictor functions" 2017-09-27 20:40:21 +00:00
Marco Paniconi 8d438dc313 Revert "Remove the speed condition on scene detection in 1 pass code."
This reverts commit 535b7b915a.

This is actually used in CBR to reset the rate control if high source sad is detected.

Original change's description:
> Remove the speed condition on scene detection in 1 pass code.
> 
> Scene detection is used for VBR mode and for screen_content mode.
> 
> It was also enabled for CBR mode via the speed condition,
> but currently the analysis in the scene detection is not used
> in CBR mode (similar computations are done locally at superblock level
> when the source_sad feature is enabled).
> 
> For 1 pass code.
> No change in behavior. Small speed gain, ~0.5%.
> 
> Change-Id: I59991d7ef2af320bea7af4b907596e057affa42f

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com

Change-Id: Ib4e6b02047f75632503e7b0fc870af97fa9291c3
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2017-09-27 19:42:48 +00:00
James Zern 690fa6bb6e Merge "fix signed integer overflow of idct" 2017-09-27 19:39:11 +00:00
James Zern 6175f285a9 Merge "vp9_dx_iface: Stop using iter parameter incorrectly" 2017-09-27 18:37:20 +00:00
Linfeng Zhang dbbbd44304 fix signed integer overflow of idct
Exposed by fuzz test in high bitdepth.
The bug is introduced in commit 64653fa.

BUG=webm:1466

Change-Id: Idd77d5c6a60efb9241471611ce1aba0646cb6ff5
2017-09-27 11:17:54 -07:00
Scott LaVarnway 19c45ccd43 vpxdsp: [x86] add highbd_d153_predictor functions
C vs SSE2 speed gains:
_4x4 : ~1.95x

C vs SSSE3 speed gains:
_8x8 : ~3.30x
_16x16 : ~5.67x
_32x32 : ~3.87x

BUG=webm:1411

Change-Id: Ib483989b25614aa89b635e8c087d0879a5d71904
2017-09-27 11:01:11 -07:00
Marco 535b7b915a Remove the speed condition on scene detection in 1 pass code.
Scene detection is used for VBR mode and for screen_content mode.

It was also enabled for CBR mode via the speed condition,
but currently the analysis in the scene detection is not used
in CBR mode (similar computations are done locally at superblock level
when the source_sad feature is enabled).

For 1 pass code.
No change in behavior. Small speed gain, ~0.5%.

Change-Id: I59991d7ef2af320bea7af4b907596e057affa42f
2017-09-27 10:32:54 -07:00
Vignesh Venkatasubramanian 530e60143a vp9_dx_iface: Stop using iter parameter incorrectly
'iter' parameter is being checked for NULL in every call to
decoder_get_frame which is quite pointless because it is always
going to be NULL unless the application changed it. The code works
as described only because vp9_get_raw_frame returns -1 on all
subsequent calls after the first.

Change-Id: Ic736b9e8fe36fc1430fc11d6a9b292be02497248
2017-09-27 09:59:39 -07:00
Linfeng Zhang d203a91a09 Merge "Add vpx_scaled_2d_neon()" 2017-09-27 16:12:48 +00:00
Jerome Jiang 878464150b Merge "Add unit test to expose vp8 bug when width is set odd." 2017-09-27 01:26:59 +00:00
Shiyou Yin f12e786c54 Merge "vp8: [loongson] optimize copymen with mmi" 2017-09-27 00:49:28 +00:00
Jerome Jiang 767503504f Add unit test to expose vp8 bug when width is set odd.
BUG=b/64710201

Change-Id: Ia518af5494a42e80949cf1165244fbed59606cf7
2017-09-26 17:40:13 -07:00
Marco 819c5b365d Remove the speed condition in setting compute_source_sad.
The speed condition is not needed; the feature can be used for any
speed in 1 pass code.

Change-Id: I878ef3f63a075302eda48c0343fa243c80aab9ba
2017-09-26 15:48:34 -07:00
Marco d5094cfde8 Replace flag USE_ALTREF_FOR_ONE_PASS with speed feature.
To be used for 1 pass VBR.
Off by default in speed features.

Change-Id: I5d6110d6d191990db526fe68ec9715379a4d1754
2017-09-26 11:16:50 -07:00
Marco Paniconi 9e52d3910b Merge "SVC: Add setting for max_intra_rate_pct in sample encoder." 2017-09-26 16:28:30 +00:00
Linfeng Zhang 9d0d13e939 Add vpx_scaled_2d_neon()
BUG=webm:1419

Change-Id: I39c8033734562efc0ac0e28e7f06fa05130f9b96
2017-09-26 09:22:39 -07:00
Linfeng Zhang 28762341ac Merge changes Ib9105462,Idfac00ed,If8d8a0e2
* changes:
  cosmetics: NEON scaling code
  Refactor convolve NEON code
  Refactor convolve code
2017-09-26 16:10:46 +00:00
Shiyou Yin 73102d1ed2 vp8: [loongson] optimize copymen with mmi
1. vp8_copy_mem16x16_mmi
2. vp8_copy_mem8x8_mmi
3. vp8_copy_mem8x4_mmi

Change-Id: I3de29a11fa7402df0e48bbb944440b1e66498a65
2017-09-26 08:40:11 +08:00
Marco 23eccb3ca7 SVC: Add setting for max_intra_rate_pct in sample encoder.
Set it as default to 900.

Change-Id: Id2d990925dccff1f6762411c66ea95973440c92f
2017-09-25 13:39:18 -07:00
Scott LaVarnway a059dc0986 Merge "vpxdsp: [x86] add highbd_d45_predictor functions" 2017-09-25 11:34:14 +00:00
Scott LaVarnway cf82f7276e vpxdsp: [x86] add highbd_d45_predictor functions
C vs SSSE3 speed gains:
_4x4 : ~2.45x
_8x8 : ~10.61x
_16x16 : ~11.34x
_32x32 : ~6.36x

BUG=webm:1411

Change-Id: Ic91389a4f1a8ad093f498afe53765b897fb9be09
2017-09-22 05:20:12 -07:00
James Zern 691585f6b8 Merge changes If59743aa,Ib046fe28,Ia2345752
* changes:
  Remove the unnecessary cast of (int16_t)cospi_{1...31}_64
  Remove the unnecessary upcasts of (int)cospi_{1...31}_64
  Change cospi_{1...31}_64 from tran_high_t to tran_coef_t
2017-09-22 07:35:55 +00:00
Andrew Lewis 10bab1ec29 Merge "Comma-separate VP9 encoder tmp.stt output" 2017-09-21 08:50:53 +00:00
Marco Paniconi 0b08f8892f Merge "vp9: Modify pickmode early exit for ARF in 1pass." 2017-09-21 01:33:12 +00:00
Marco 42373b21ce vp9: Modify pickmode early exit for ARF in 1pass.
Add the condition frames_since_golden > 0 to the
early exit check for ARF usage in nonrd_pickmode.
This improves quality of first frame following ARF, where
frames_since_golden = 0.

Small/neutral gain in metrics for speed 6, neutral change in speed.

Only affects when USE_ALTREF_FOR_ONE_PASS is enabled.

Change-Id: I82e73e6ff6fc849e5ca5448563cb8a0515fe0cdc
2017-09-20 15:02:37 -07:00
Linfeng Zhang d586cdb4d4 Remove the unnecessary cast of (int16_t)cospi_{1...31}_64
BUG=webm:1450

Change-Id: If59743aafe99226e0ec67ab5d20678ce25f53ab8
2017-09-20 14:13:26 -07:00
Linfeng Zhang 76a3d3fcc5 Remove the unnecessary upcasts of (int)cospi_{1...31}_64
BUG=webm:1450

Change-Id: Ib046fe28caec5b9ebdc9d0152df7c54ff4266858
2017-09-20 14:13:26 -07:00
Linfeng Zhang 64653fa133 Change cospi_{1...31}_64 from tran_high_t to tran_coef_t
The unnecessary upcast to (int) will be cleaned later.

BUG=webm:1450

Change-Id: Ia234575206d5a74540526924b06ed3939322d063
2017-09-20 14:13:26 -07:00
James Zern f7b276c26b Merge "Bug fix: fadst4() in vp9/encoder/vp9_dct.c" 2017-09-20 21:12:45 +00:00
Linfeng Zhang 24afb5d036 Bug fix: fadst4() in vp9/encoder/vp9_dct.c
A new bug was introduced in a80bdfd "Change sinpi_{1,2,3,4}_9 from
tran_high_t to int16_t". Reverted the change in this file.

BUG=webm:1450

Failed test C/TransHT.AccuracyCheck/26.

Change-Id: Id001f57aad811803ef7d367d2b2bc008d8499991
2017-09-20 12:27:29 -07:00
Marco Paniconi f407b30490 Merge "vp9: Modify simple_block_yrd condition for SVC" 2017-09-20 16:42:31 +00:00
Scott LaVarnway b85e391ac8 Merge "vpxdsp: [x86] add highbd_d63_predictor functions" 2017-09-20 11:39:28 +00:00
James Zern 15bea62176 temporal_filter_apply_sse2.asm: add ':' to label
quiets nasm warning:
label alone on a line without a colon might be in error

BUG=webm:1462

Change-Id: I660407ca60e8c9a810dba9d76afb65852029a29c
2017-09-19 18:59:11 -07:00
Linfeng Zhang 7c0529728a cosmetics: NEON scaling code
Change-Id: Ib91054622c1f09c4ca523bc6837d7d8ab9f03618
2017-09-19 16:39:17 -07:00
Linfeng Zhang f357335c38 Refactor convolve NEON code
Rename a couple of hbd static functions.
Move the position of NEON function convolve8_4().

Change-Id: Idfac00edf2e99cdd8e0a73b9f895402f60be6349
2017-09-19 16:28:36 -07:00
Linfeng Zhang bf8bdae913 Refactor convolve code
Extract a couple of static functions into their caller functions.

Change-Id: If8d8a0e217fba6b402d2a79ede13b5b444ff08a0
2017-09-19 16:28:31 -07:00
Scott LaVarnway bc86e2c6a2 vpxdsp: [x86] add highbd_d63_predictor functions
C vs SSE2 speed gains:
_4x4 : ~2.94x

C vs SSSE3 speed gains:
_8x8 : ~8.69x
_16x16 : ~6.32x
_32x32 : ~5.33x

BUG=webm:1411

Change-Id: I2c35b527eac2229f17aaa9d118fb601e7195efe4
2017-09-19 15:47:22 -07:00
Marco aaa6cdcc2e vp9: Modify simple_block_yrd condition for SVC
Modify simple_block_yrd condition in nonrd_pickmode for SVC:
allow it to be used also on base temporal_layer, only when
spatial_layer > 1 and block size < 32x32.

Speed up of about ~2% for 3 layer SVC, with little/negligible
loss in quality.

Change-Id: I7734bdae51cf51f22b96f6b2b27da20ea1d84344
2017-09-19 15:39:05 -07:00
Marco Paniconi 2a7c0e1c4b Merge "Add datarate test for frame_parallel_decoding mode off." 2017-09-19 22:31:08 +00:00
Marco cd463c7acb vp9: Fix condition for limiting ARF 1 pass vbr.
Fix the setting to frames_till_gf_update_due, and
adjust the limit value.
Only affects when USE_ALTREF_FOR_ONE_PASS is enabled.

Neutral change to metrics and speed for ytlive.

Change-Id: I266d9a00b36221bc8602fa2746d4e8a8f7d4dfae
2017-09-19 11:12:37 -07:00
Marco Paniconi 310e388423 Merge "vp9: Adjustments for ARF usage in 1 pass vbr." 2017-09-19 16:29:19 +00:00
Marco ebb015a539 vp9: Adjustments for ARF usage in 1 pass vbr.
Only when USE_ALTREF_FOR_ONE_PASS is enabled (off by default).
Force fixed partition to 64x64 when is_src_frame_alt_ref is true,
and don't force early exit for some modes in nonrd_pickmode
for ARF noshow frames.

Small gain ~0.2% on ytlive metrics for speed 6.
Neutral speed difference.

Change-Id: I27eb6622d0453c09a06ccdc3b16368762474d11d
2017-09-18 18:46:41 -07:00
Linfeng Zhang a80bdfd081 Change sinpi_{1,2,3,4}_9 from tran_high_t to int16_t
Add "typedef int16_t tran_coef_t;"

BUG=webm:1450

Change-Id: I67866f104898d1dda8989e1abdaf6983fe324154
2017-09-18 09:26:03 -07:00
Linfeng Zhang 9d278465b5 Merge "cosmetics: vp9_rtcd_defs.pl" 2017-09-18 16:23:33 +00:00
Shiyou Yin 2aacfa1acd Merge "vp8: [loongson] optimize dequantize with mmi" 2017-09-15 23:53:40 +00:00
Marco ad31fe36a8 Add datarate test for frame_parallel_decoding mode off.
Add datarate test, for both VBR and CBR mode, with the
frame_parallel_decoding mode disabled (and error_resilience off).

Change-Id: I54feec3248a68ecff4bef8d9a31bb1616fab77df
2017-09-15 11:38:38 -07:00
Paul Wilkins 65f1c90652 Merge "Fix bug in intra mode rd penalty." 2017-09-15 15:43:29 +00:00
Kaustubh Raste 08fda52e18 Merge "mips msa clean-up msa macros" 2017-09-15 01:27:02 +00:00
James Zern 90ed0d2f73 Merge "vp9_scale_test: add C config" 2017-09-15 00:27:58 +00:00
James Zern c12b39626f Merge "Revert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()"" 2017-09-15 00:27:41 +00:00
Hui Su 293734b755 Merge "VP9 level targeting: add a new AUTO mode" 2017-09-14 21:02:38 +00:00
James Zern c24d911847 vp9_scale_test: add C config
Change-Id: I9dfe8255d1c096d246bf9719729f57dbae779ffc
2017-09-14 13:08:04 -07:00
James Zern baf658ec4c Revert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()"
This reverts commit afee58f2c4.

This causes ~8x slowdown in 4:3 in the C-code

Change-Id: I60a7ead12dc4ec1548b1b12cfe4b0be42ef04e0e
2017-09-14 13:07:21 -07:00
Hui Su c3a6943c16 VP9 level targeting: add a new AUTO mode
In the new AUTO mode, restrict the minimum alt-ref interval and max column
tiles adaptively based on picture size, while not applying any rate control
constraints.

This mode aims to produce encodings that fit into levels corresponding to
the source picture size, with minimal loss of compression quality. However, the
bitstream is not guaranteed to be level compatible, e.g., the average bitrate
may exceed level limit.

BUG=b/64451920

Change-Id: I02080b169cbbef4ab2e08c0df4697ce894aad83c
2017-09-14 16:20:29 +00:00
Shiyou Yin b81de66171 vp8: [loongson] optimize dequantize with mmi
1. vp8_dequantize_b_mmi
2. vp8_dequant_idct_add_mmi

Change-Id: I505f8afb7a444173392b325906e6a4f420f00709
2017-09-14 20:56:06 +08:00
Shiyou Yin 5b558592f5 vp8: [loongson] optimize idctllm with mmi
1. vp8_short_idct4x4llm_mmi
2. vp8_short_inv_walsh4x4_mmi
3. vp8_dc_only_idct_add_mmi

Change-Id: I616923681e79d78607a4988608fc39df77b093f4
2017-09-14 16:51:11 +08:00
Kaustubh Raste 4ca8f8f5e2 mips msa clean-up msa macros
Removed inline for GP load-store in case of (__mips_isa_rev >= 6)
Created one define LD_V for vector load and ST_V for vector store

Change-Id: Ifec3570fa18346e39791b0dd622892e5c18bd448
2017-09-14 12:29:19 +05:30
Linfeng Zhang 535dee0fb6 cosmetics: vp9_rtcd_defs.pl
Change-Id: I1bf57824e07fa4f8b3b5574984117f2bd7a1c086
2017-09-13 12:13:55 -07:00
Linfeng Zhang 0726dd97d3 Merge "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()" 2017-09-13 17:21:45 +00:00
Andrew Lewis 949730e2dc Comma-separate VP9 encoder tmp.stt output
Also add column headings so that the output can still be parsed if the
set of headers changes later.

Change-Id: I4beaf266521e093db4acf5f715b18fdfb7e3d1cd
2017-09-13 16:26:40 +01:00
Johann Koenig ed3a80cb5e Merge "Revert "Revert "quantize avx: copy 32x32 implementation""" 2017-09-13 14:44:53 +00:00
Kaustubh Raste 83e59914e5 Merge "Optimize mips msa vp9 average mc functions" 2017-09-13 06:02:49 +00:00
Shiyou Yin fa01426ade Merge "vp8: [loongson] optimize loopfilter with mmi" 2017-09-13 01:05:46 +00:00
Johann eb4238ac70 Revert "Revert "quantize avx: copy 32x32 implementation""
This reverts commit 8c42237bb2.

Because ssse3 code is used for the reference, the qcoeff and dqcoeff
reference buffers must be aligned.

Original change's description:
> quantize avx: copy 32x32 implementation
>
> Ensure avx and ssse3 stay in sync by testing them against each other.
>
> Change-Id: I699f3b48785c83260825402d7826231f475f697c

Change-Id: Ieeef11b9406964194028b0d81d84bcb63296ae06
2017-09-12 14:25:38 -07:00
Linfeng Zhang afee58f2c4 Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()
Scale 3x3 block instead of 16x16 block in each loop.

Benefits:
1. Reduced number of different phase_scaler from 16 to 3. Optimization code
   will be smaller and faster.
2. The maximum phase_scaler drifting will be reduced from 5/16 to 1/24.
   (The drifting is 1/(3*16) in each step.)

BUG=webm:1419

Change-Id: Ibb9242a629ddb03e1ff93b859bece738255e698c
2017-09-12 12:05:16 -07:00
Kaustubh Raste 30f1ff94e0 Optimize mips msa vp9 average mc functions
Load the specific destination loads instead of vector load

Change-Id: I65ca13ae8f608fad07121fef848e2a18f54171fe
2017-09-12 16:12:11 +05:30
Scott LaVarnway c39cd9235e Merge "vpxdsp: [x86] add highbd_d207_predictor functions" 2017-09-11 22:32:23 +00:00
Linfeng Zhang a9bbe53dbb Add 4 to 1 scaling NEON optimization
BUG=webm:1419

Change-Id: If82a93935d2453e61b7647aae70983db1740bec7
2017-09-11 10:17:28 -07:00
Scott LaVarnway d6c9bbc2b6 vpxdsp: [x86] add highbd_d207_predictor functions
C vs SSE2 speed gains:
_4x4 : ~2.31x

C vs SSSE3 speed gains:
_8x8 : ~4.73x
_16x16 : ~10.88x
_32x32 : ~4.80x

BUG=webm:1411

Change-Id: I0bac29db261079181ddabc6814bd62c463109caf
2017-09-11 07:36:24 -07:00
Shiyou Yin 761f2f5cb4 vp8: [loongson] optimize loopfilter with mmi
1. vp8_loop_filter_horizontal_edge_mmi
2. vp8_loop_filter_vertical_edge_mmi
3. vp8_mbloop_filter_horizontal_edge_mmi
4. vp8_mbloop_filter_vertical_edge_mmi
5. vp8_loop_filter_simple_horizontal_edge_mmi
6. vp8_loop_filter_simple_vertical_edge_mmi

Change-Id: Ie34bbff3a16cff64e39a50798afd2b7dac9bcdc3
2017-09-11 11:08:09 +08:00
James Zern fb40b5d7a7 intrapred: sync highbd_d63_predictor w/d63_
8/16/32: ~6%/~18%/~33% faster

previously:
7012ba639 vp9_reconintra: simplify d63_predictor

BUG=webm:1411

Change-Id: Ie775f3a4f7fd74df44754e65686d826a51c2cdc2
2017-09-08 19:28:01 -07:00
James Zern 9dfa76f948 vpx_mem: make vpx_memset16 inline
Change-Id: Ibb2cab930c95836e6d6e66300c33e7d08e4474d4
2017-09-08 19:11:46 -07:00
James Zern 5c95fd921e intrapred: sync highbd_d45_predictor w/d45_
8/16/32: ~19%/~54%/~75.5% faster

previously:
acc481eaa vp9_reconintra: simplify d45_predictor

BUG=webm:1411

Change-Id: Ie8340b0c5070ae640f124733f025e4e749b660d8
2017-09-08 19:09:07 -07:00
James Zern 9a2dd7e67e Merge changes I9ec438aa,I99c954ff
* changes:
  Update convolve functions' assertions
  Add 2 to 1 scaling NEON optimization
2017-09-08 19:23:40 +00:00
paulwilkins 0657f4732c Fix bug in intra mode rd penalty.
The intra mode rd penalty was implemented as a rate penalty.
Code was added to scale the penalty according to block size but
this was not done correctly for the SB level or sub 8x8.

The code did a weird double scaling in regard to bit depth that
has been removed. Given that it is a rate penalty the bit depth
should not matter.

This bug fix improves average metrics on our standard test
sets by about 0.1%.

Change-Id: I7cf81b66aad0cda389fe234f47beba01c7493b1e
2017-09-08 15:10:53 +01:00
James Zern d7caee2170 vpx_scale_test.h: remove #if from inside macro
fixes visual studio error

Change-Id: I86206f17ca951b15e247c1b92561847d8c21ec7a
2017-09-08 00:06:25 -07:00
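
The commit above does not show the code, but the general pattern behind this kind of fix can be sketched. The C standard leaves the behavior undefined when a preprocessor directive appears inside a macro's argument list, and some compilers reject it. A common workaround is to resolve the condition into a helper macro first. `SUM3`, `HAVE_EXTRA`, and `EXTRA_TERM` below are hypothetical names used only for this illustration.

```c
#include <assert.h>

#define SUM3(a, b, c) ((a) + (b) + (c))

/* HAVE_EXTRA is a hypothetical feature switch for this sketch. */
#ifndef HAVE_EXTRA
#define HAVE_EXTRA 0
#endif

/* Resolve the conditional outside the macro invocation ... */
#if HAVE_EXTRA
#define EXTRA_TERM 100
#else
#define EXTRA_TERM 0
#endif

/* ... so that no preprocessor directive appears in the argument list. */
static int total(void) {
  const int t = SUM3(1, 2, EXTRA_TERM);
  assert(t >= 3);
  return t;
}
```

With `HAVE_EXTRA` left at its default of 0, `total()` evaluates `SUM3(1, 2, 0)`.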
Shiyou Yin 43cbdc216d Merge "vp8: [loongson] optimize sixtap predict with mmi" 2017-09-08 00:59:31 +00:00
Shiyou Yin 2c7b7424c5 Merge "vpxdsp: [loongson] optimize sad functions with mmi" 2017-09-08 00:55:14 +00:00
Linfeng Zhang ef41c6286d Update convolve functions' assertions
So that 4 to 1 frame scaling can call them.

Change-Id: I9ec438aa63b923ba164ad3c59d7ecfa12789eab5
2017-09-07 12:33:58 -07:00
Linfeng Zhang 71b38a144e Add 2 to 1 scaling NEON optimization
BUG=webm:1419

Change-Id: I99c954ffa50a62ccff2c4ab54162916141826d9b
2017-09-07 12:33:50 -07:00
Linfeng Zhang 3ec20445b2 Refactor convolve8 NEON functions
Change-Id: I4ac576875c91fee7cb150d298fae4a2c156d374c
2017-09-06 15:55:17 -07:00
Linfeng Zhang d5d2cbcc75 Add ScaleFrameTest
Move class VpxScaleBase to new file test/vpx_scale_test.h.
Add new file test/vp9_scale_test.cc with ScaleFrameTest.

BUG=webm:1419

Change-Id: Iec2098eafcef99b94047de525e5da47bcab519c1
2017-09-06 15:54:58 -07:00
Linfeng Zhang 7219f31904 Merge "Remove get_filter_base() and get_filter_offset() in convolve" 2017-09-06 22:39:15 +00:00
Scott LaVarnway 0e95039bd9 Merge "vpxdsp: [x86] add highbd_dc_128_predictor functions" 2017-09-06 21:53:32 +00:00
Peter Boström 6822fb2f09 Remove support for stdatomic.h.
This header doesn't build on g++ v6 as it's a C and not C++ header
(_Atomic is not a keyword in C++11). Since the C and C++ invocations
cannot be guaranteed to point to the same underlying atomic_int
implementation, remove support for them and use compiler intrinsics
instead.

BUG=webm:1461

Change-Id: Ie1cd6759c258042efc87f51f036b9aa53e4ea9d5
2017-09-06 11:59:50 -04:00
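
As a rough illustration of what "compiler intrinsics" can look like in place of `stdatomic.h`, the sketch below wraps the GCC/Clang `__atomic` builtins. It assumes one of those compilers; the type and function names are hypothetical and are not the ones used in libvpx.

```c
#include <assert.h>

/* Assumes GCC or Clang, which provide the __atomic builtins used here.
 * The type and function names are hypothetical, not libvpx's. */
typedef struct {
  int value;
} sketch_atomic_int;

/* Store with release ordering: writes made before this call become
 * visible to a thread that later performs the matching acquire load. */
static void sketch_atomic_store_release(sketch_atomic_int *a, int v) {
  assert(a != 0);
  __atomic_store_n(&a->value, v, __ATOMIC_RELEASE);
}

/* Load with acquire ordering, pairing with the release store above. */
static int sketch_atomic_load_acquire(sketch_atomic_int *a) {
  assert(a != 0);
  return __atomic_load_n(&a->value, __ATOMIC_ACQUIRE);
}
```

Because the builtins are available to both the C and C++ front ends of these compilers, a header written this way avoids depending on the C-only `_Atomic` keyword.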
Linfeng Zhang d331e7a1c0 Remove get_filter_base() and get_filter_offset() in convolve
so that the convolve functions are independent of table alignment.

Change-Id: Ieab132a30d72c6e75bbe9473544fbe2cf51541ee
2017-09-05 15:22:36 -07:00
Scott LaVarnway bc4bcca3fd vpxdsp: [x86] add highbd_dc_128_predictor functions
C vs SSE2 speed gains:
_4x4 : ~7.64x
_8x8 : ~16.60x
_16x16 : ~8.15x
_32x32 : ~5.05x

BUG=webm:1411

Change-Id: If165d419711cfda901bd428a05ca1560a009e62e
2017-09-05 07:57:42 -07:00
Shiyou Yin 0095213790 vp8: [loongson] optimize sixtap predict with mmi
1. vp8_sixtap_predict16x16_mmi
2. vp8_sixtap_predict8x8_mmi
3. vp8_sixtap_predict8x4_mmi
4. vp8_sixtap_predict4x4_mmi

Change-Id: I186669d1a1d998a0f3ba3a548e25eee8b52c251b
2017-09-02 19:08:20 +00:00
Shiyou Yin f4150163a2 vpxdsp: [loongson] optimize sad functions with mmi
1. vpx_sadWxH_c
2. vpx_sadWxH_avg_c
3. vpx_sadWxHx3_c
4. vpx_sadWxHx8_c
5. vpx_sadWxHx4d_c

Change-Id: Ie13161e3d73a052ea6ea7bac9cfadf55598fea7a
2017-09-02 15:11:32 +00:00
James Zern d49a1a5329 test,Android.mk: export gtest include path
fixes test file builds

Change-Id: Iaa725ad95d56cf77d9fef8994981a80102e9a966
2017-09-01 19:44:12 -07:00
clang-format 7587a97551 apply clang-format
Change-Id: If4c3e8a396d0fcb304f407b44e28cac3219f038c
2017-09-01 01:24:03 -07:00
James Zern 053bd263eb .clang-format: update to 4.0.1
based on Google style with the following differences:

3a4
> # Generated with clang-format 4.0.1
13c14
< AllowShortCaseLabelsOnASingleLine: false
---
> AllowShortCaseLabelsOnASingleLine: true
23c24
< BraceWrapping:
---
> BraceWrapping:
43c44
< ConstructorInitializerAllOnOneLineOrOnePerLine: true
---
> ConstructorInitializerAllOnOneLineOrOnePerLine: false
46,47c47,48
< Cpp11BracedListStyle: true
< DerivePointerAlignment: true
---
> Cpp11BracedListStyle: false
> DerivePointerAlignment: false
51c52
< IncludeCategories:
---
> IncludeCategories:
78c79
< PointerAlignment: Left
---
> PointerAlignment: Right
80c81
< SortIncludes:    true
---
> SortIncludes:    false

Change-Id: Ibc0ef87a516b8eae88d426dfdd7624be57e7b87c
2017-09-01 01:24:03 -07:00
Peter Boström be2ba48cac Merge "Prevent data race from low-pass filter." 2017-09-01 05:37:51 +00:00
James Zern 334e9abb0b Merge "inv_txfm_vsx: fix loads in high-bitdepth" 2017-09-01 03:09:49 +00:00
Peter Boström 9ab4d9df38 Prevent data race from low-pass filter.
Makes main thread wait for the filter level to be picked to avoid a race
between the LPF thread and update_reference_frames(). This also
re-enables the failing tests under thread_sanitizer where this data race
was detected.

BUG=webm:1460

Change-Id: I7f5797142ea0200394309842ce3e91a480be4fbc
2017-08-31 18:37:55 -07:00
Peter Boström 03191f738e Merge "Add atomics to vp8 synchronization primitives." 2017-09-01 01:36:22 +00:00
Peter Boström d42e876164 Add atomics to vp8 synchronization primitives.
Fixes issue on iPad Pro 10.5 (and probably other places) where threads
are not properly synchronized. On x86 this data race was benign because load
and store instructions are atomic; the accesses were atomic in practice, as
the program hasn't been observed to be miscompiled.

Such guarantees are not made outside x86, and real problems manifested
where libvpx reliably reproduced a broken bitstream for even just the
initial keyframe. This was detected in WebRTC where this device started
using multithreading (as its CPU count is higher than earlier devices,
where the problem did not manifest as single-threading was used in
practice).

This issue was not detected under thread-sanitizer bots as mutexes were
conditionally used under this platform to simulate the protected read
and write semantics that were in practice provided on x86 platforms.

This change also removes several mutexes, so encoder/decoder state is
lighter-weight after this change and we do not need to initialize so
many mutexes (this was done even on non-thread-sanitizer platforms where
they were unused).

Change-Id: If41fcb0d99944f7bbc8ec40877cdc34d672ae72a
2017-08-31 17:55:57 -07:00
Scott LaVarnway ab5704f02c Merge "vpxdsp: [x86] add highbd_dc_left_predictor functions" 2017-08-31 21:34:27 +00:00
Jerome Jiang 20973508da Merge "vp9: Skip testing duplicate zero mv in nonrd-pickmode." 2017-08-31 17:16:19 +00:00
Jerome Jiang ebf3ae1a29 vp9: Skip testing duplicate zero mv in nonrd-pickmode.
Neutral on rtc set for speed 8. Neutral on ytlive for speed 5.

Saves some computation cycles but no speed gain observed on Pixel.

Change-Id: I34c4642cd543aa89c5b9c4bff6b7113577c64c91
2017-08-31 17:13:31 +00:00
James Zern f8f64c309b inv_txfm_vsx: fix loads in high-bitdepth
vec_vsx_ld -> load_tran_low

Change-Id: Id3144cdd528d2d406a515e5812e2ea9e4db64bf1
2017-08-30 23:47:56 -07:00
Jerome Jiang 297c110dcb Merge "Revert "Re-enable disabled tests under TSan."" 2017-08-31 01:52:42 +00:00
Jerome Jiang d7ba519b9f Revert "Re-enable disabled tests under TSan."
This reverts commit df9ce12259.

Reason for revert: 

Re-enabled tests still fail tsan in high bitdepth.

Original change's description:
> Re-enable disabled tests under TSan.
> 
> These tests point to an already-fixed bug, this should no longer have a
> data race.
> 
> BUG=webm:1049
> 
> Change-Id: Iaedc5db8df99362bdc501b70ff7fdebf8756fdb8

TBR=jzern@google.com,pbos@chromium.org,builds@webmproject.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug: webm:1049
Change-Id: I232f1f7726bf795b301abfb2e07cad6756642e53
2017-08-30 23:44:21 +00:00
Scott LaVarnway c39a05ff61 vpxdsp: [x86] add highbd_dc_left_predictor functions
C vs SSE2 speed gains:
_4x4 : ~6.49x
_8x8 : ~10.82x
_16x16 : ~7.61x
_32x32 : ~5.29x

BUG=webm:1411

Change-Id: Ibc30c50cb7139049bf05298010803499e6ef949b
2017-08-30 09:29:06 -07:00
Scott LaVarnway 2d0c11093e Merge "vpxdsp: [x86] add highbd_dc_top_predictor functions" 2017-08-30 11:25:07 +00:00
Scott LaVarnway f783e3a75d vpxdsp: [x86] add highbd_dc_top_predictor functions
C vs SSE2 speed gains:
_4x4 : ~7.39x
_8x8 : ~11.36x
_16x16 : ~8.68x
_32x32 : ~4.33x

BUG=webm:1411

Change-Id: I7f1487cd1531d4e7f0fbb4596fed3bfb72a59d58
2017-08-29 12:53:30 -07:00
Jerome Jiang fed52670a2 Merge "vp9: Speed 8: Enable skip_encode_sb" 2017-08-29 16:45:09 +00:00
Peter Boström 2f5fb37dac Merge "Re-enable disabled tests under TSan." 2017-08-29 15:42:39 +00:00
Scott LaVarnway 81f45ff798 Merge "vpxdsp: [x86] add highbd_h_predictor functions" 2017-08-29 14:05:13 +00:00
Scott LaVarnway 30d9a1916c vpxdsp: [x86] add highbd_h_predictor functions
C vs SSE2 speed gains:
_4x4 : ~8.12x
_8x8 : ~9.71x
_16x16 : ~8.21x
_32x32 : ~5.0x

BUG=webm:1422

Change-Id: I5e8a1ed4db7b8dc539b3e2a728b0b34d8b4b1993
2017-08-28 17:31:18 -07:00
Jerome Jiang 7c10251f22 vp9: Speed 8: Enable skip_encode_sb
Neutral in borg tests.

Some clips show 3-4% speed gain on 2 threads on Pixel.

Change-Id: Ic959f34e44892a854551de6e9a3d9ec819ffed00
2017-08-28 17:05:48 -07:00
Peter Boström df9ce12259 Re-enable disabled tests under TSan.
These tests point to an already-fixed bug, this should no longer have a
data race.

BUG=webm:1049

Change-Id: Iaedc5db8df99362bdc501b70ff7fdebf8756fdb8
2017-08-28 16:24:38 -07:00
Jerome Jiang 64c55576b7 vp9: Remove resolution condition for using source_sad in speed 6.
Rev d147771 fixed the test failure. So remove the resolution condition
for using source_sad in speed 6.

BUG=webm:1452

Change-Id: I1efba97e1ef5bd4de5f886299f6fcb907187abcd
2017-08-28 12:49:54 -07:00
Matt Oliver 588413f839 project: Update AppVeyor to use nuget gitlink. 2017-08-28 03:44:56 +10:00
Matt Oliver 7e75853e91 project: Fix importing data objects in other projects. 2017-08-28 03:27:08 +10:00
Matt Oliver 1bdf0fb518 project: Fix build events when OutDir contains spaces. 2017-08-28 03:27:08 +10:00
Matt Oliver d0db9f1e94 project: Cleanup unneeded options. 2017-08-27 22:02:21 +10:00
Matt Oliver 75be519d52 project: Set minimum VS version to 2013. 2017-08-27 21:40:27 +10:00
Marco Paniconi 255241c6d0 Merge "vp9: Speed 6 adapt_partition for live/vbr usage." 2017-08-25 22:00:08 +00:00
Marco Paniconi 43b9e785ba Merge "vp9: SVC: Modify mv search condition in speed features." 2017-08-25 21:46:35 +00:00
Marco a0de2692fc vp9: Speed 6 adapt_partition for live/vbr usage.
Enable adapt_partition for vbr mode for speed 6.
This allows the usage of the pickmode-based partition
(used in speed 5), but only selectively for superblocks
with high source sad, otherwise the faster variance based
partition scheme is used.

For speed 6 on ytlive set: avgPSNR/SSIM metrics up by ~0.6%,
several clips up by ~1.5%. Small/negligible decrease in speed.

Change-Id: I12f3efef6b3e059391de330fdbe5a44c2587f1f8
2017-08-25 11:36:34 -07:00
Marco Paniconi 3e069846b9 Merge "Revert "quantize avx: copy 32x32 implementation"" 2017-08-25 18:20:31 +00:00
Marco a74593b30c vp9: SVC: Modify mv search condition in speed features.
For SVC at speed >= 7: only use the improved mv search
on base spatial layer, if top layer resolution is above 640x360.

~2.3% speedup
Small/negligible loss in avgPSNR metrics on rtc set.

Change-Id: Iaef75a57ebf1c248931bc1aa28d20b7fecac1851
2017-08-25 10:12:38 -07:00
Marco Paniconi 8c42237bb2 Revert "quantize avx: copy 32x32 implementation"
This reverts commit f60d1dcd3d.

Reason for revert:
Failures in AVX/VP9QuantizeTest in nightly tests.

Original change's description:
> quantize avx: copy 32x32 implementation
> 
> Ensure avx and ssse3 stay in sync by testing them against each other.
> 
> Change-Id: I699f3b48785c83260825402d7826231f475f697c

TBR=slavarnway@google.com,johannkoenig@google.com,builds@webmproject.org

Change-Id: Ibd38636212269328317dd0721be9d25452113d1c
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2017-08-25 16:56:08 +00:00
Shiyou Yin ece1989fa2 Merge "vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and vpx_sub_pixel_avg_varianceWxH_c with mmi." 2017-08-25 06:44:02 +00:00
Marco Paniconi 34e48d6115 Merge "vp9: Adjust 16x16 split threshold for variance partition" 2017-08-24 22:26:43 +00:00
Tom Finegan ccae8da7c6 Make sure diff is present at configure time.
This avoids an endless build loop at vpx_version.h
creation time when diff is not present.

Change-Id: I16ae386dbdaf14f9a2b85e4c5d1aaa6c08f52a45
2017-08-24 12:11:48 -07:00
Johann Koenig 6c21650c0e Merge "quantize avx: copy 32x32 implementation" 2017-08-24 18:55:03 +00:00
Shiyou Yin 9e4647c7ab vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and vpx_sub_pixel_avg_varianceWxH_c with mmi.
Change-Id: Ia576a721df6312329b599c31cfe1fb1267a9f174
2017-08-25 01:58:49 +08:00
Marco d14777157e vp9: Adjust 16x16 split threshold for variance partition
For speeds < 7, increase threshold that controls the split
of 16x16->8x8 blocks, for resolutions 720p and higher.

Minor change for speed 5 (since it uses reference partition scheme
which only uses variance partition as first step).
For speed 6: ~0.5% increase in avgPSNR/SSIM metrics on ytlive set.
No change in speed.

Change-Id: I5126580973201538d8ca26a9256b93c4d11d685b
2017-08-24 10:44:05 -07:00
Johann Koenig 258122fdc6 Merge "quantize test: skip block was removed" 2017-08-24 17:43:10 +00:00
Johann f60d1dcd3d quantize avx: copy 32x32 implementation
Ensure avx and ssse3 stay in sync by testing them against each other.

Change-Id: I699f3b48785c83260825402d7826231f475f697c
2017-08-24 10:42:34 -07:00
Johann 1787e7dbe0 quantize ssse3: copy implementation to intrinsics
Still does not pass tests. Does match the previous assembly, although
saving the sign before multiplying is dubious.

Change-Id: Ia163f18c755aba542d6e93f7bf7343184660df5a
2017-08-24 07:47:51 -07:00
Johann 92aafefa1e quantize test: skip block was removed
Change-Id: I1d93698bc27529b0544d79dd7b9fe37afa51ef87
2017-08-24 07:21:42 -07:00
Johann Koenig 2dc0a5132d Merge "quantize test: set threshold for 32x32" 2017-08-24 14:04:29 +00:00
Shiyou Yin d080c92524 Merge "vpx_dsp:loongson optimize vpx_mseWxH_c(case 16x16,16X8,8X16,8X8) with mmi." 2017-08-24 00:55:11 +00:00
Marco Paniconi 30c261b1eb Merge "vp9: SVC: Skip NEWMV for small blocks for (0, 0) base_mv." 2017-08-23 23:09:33 +00:00
Johann e89344d61a quantize test: set threshold for 32x32
Change-Id: I77be617c7d7c64929dd51c6077322f4f8ad23897
2017-08-23 15:59:11 -07:00
Johann Koenig f53b656207 Merge "quantize avx: copy implementation to intrinsics" 2017-08-23 21:14:13 +00:00
Marco c9ff7b6637 vp9: SVC: Skip NEWMV for small blocks for (0, 0) base_mv.
For SVC encoding:
average speedup ~1.5%, with small ~0.57 loss in avgPSNR metrics.

Change-Id: Icebce6f6ef4e819d7dfcf8db898c583167351de4
2017-08-23 13:08:27 -07:00
Scott LaVarnway 1aad50c092 Merge "vpx_dsp: get32x32var_avx2() cleanup" 2017-08-23 19:59:25 +00:00
Johann Koenig dfafd10ef5 Merge "quantize neon: round dqcoeff towards zero" 2017-08-23 19:20:53 +00:00
Johann 7c27872164 quantize avx: copy implementation to intrinsics
Adds an early exit based on ptest. Slightly slower than ssse3 in the
full case because of the extra check, but potentially faster if lots of
rows can be skipped.

Very close in speed to the assembly.

Can run in 32 bit, unlike the assembly. Allows reworking the function
prototype to use structs.

Change-Id: If80e2b9ba059370a4cad3c973196e82a97b4330e
2017-08-23 09:19:16 -07:00
Johann 2a5aa98a35 quantize neon: round dqcoeff towards zero
Add 1 if negative to get dqcoeff to round towards zero.

10-15% faster than converting to positive before shifting.

Change-Id: I01a62fd0c9bca786b6885b318bd447bb9229903d
2017-08-23 08:05:50 -07:00
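
The rounding trick described in the commit above can be shown in scalar form. This is a minimal sketch and not the NEON code: it assumes the value is being halved (the commit does not state the shift amount) and that `>>` on a negative `int` is an arithmetic shift, which is implementation-defined in C but holds on common compilers. The function name is hypothetical.

```c
#include <assert.h>

/* Halves v, rounding toward zero. A plain arithmetic shift rounds toward
 * negative infinity, so negative inputs get 1 added before the shift. */
static int half_toward_zero(int v) {
  const int is_negative = v < 0; /* 1 if negative, else 0 */
  const int result = (v + is_negative) >> 1;
  assert(result == v / 2); /* C99 integer division truncates toward zero */
  return result;
}
```

For example, `-3 >> 1` gives -2 with an arithmetic shift, whereas `(-3 + 1) >> 1` gives -1, matching the truncating division `-3 / 2`.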
Johann e83d99d7b8 quantize fp: neon implementation
About 4x faster when values are below the dequant threshold and 10x
faster if everything needs to be calculated.

Both numbers would improve if the division for dqcoeff could be
simplified.

BUG=webm:1426

Change-Id: I8da67c1f3fcb4abed8751990c1afe00bc841f4b2
2017-08-23 08:01:30 -07:00
Shiyou Yin 59e065b6ed vpx_dsp:loongson optimize vpx_mseWxH_c(case 16x16,16X8,8X16,8X8) with mmi.
Change-Id: I2c782d18d9004414ba61b77238e0caf3e022d8f2
2017-08-23 15:14:15 +08:00
Marco Paniconi 0207f17144 Merge "vp9: Condition lighting change detection on CBR mode." 2017-08-22 22:52:05 +00:00
Johann Koenig 103e4e50a8 Merge changes I53f8a160,I48f282bf
* changes:
  quantize ssse3: copy style from sse2
  quantize sse2: copy opts from ssse3
2017-08-22 22:27:56 +00:00
Marco a31461c853 vp9: Condition lighting change detection on CBR mode.
This feature is used for the CBR RTC encoding mode
at speed >= 6. This change will exclude it for VBR mode.

For speed 6 live encoding (VBR):
avgPSNR/SSIM metrics on ytlive set up by ~1% (few clips up by 2/3%).
No change in speed.

Change-Id: I1a0dd94c334f7df309ab5a48d477d7e25355b798
2017-08-22 14:59:37 -07:00
Johann b9c1dcc5fa quantize ssse3: copy style from sse2
Change-Id: I53f8a160e640c674ea035fc112e207b6dca42598
2017-08-22 14:25:27 -07:00
Johann Koenig 7f2993f5e4 Merge "quantize: capture skip block early" 2017-08-22 20:03:02 +00:00
Johann 75752ab7c0 quantize sse2: copy opts from ssse3
Simplify eob calculations based on ssse3 implementation.

General clean up and re-scoping.

Change-Id: I48f282bf9bd28ee9bc2c7a6779be9d45b5a3a3ee
2017-08-22 13:01:44 -07:00
Johann Koenig ab27b68693 Merge changes Icfb70687,I9a963e99,Ie8ac00ef,I1272917c
* changes:
  quantize: ignore skip_block in arm
  quantize: ignore skip_block in x86
  quantize fp: ignore skip_block in arm
  quantize fp: ignore skip_block in x86
2017-08-22 19:19:14 +00:00
Johann 7a178a5631 quantize: capture skip block early
This should probably be handled before vp9_regular_quantize_b_4x4 even
gets called.

Fixes an assert resulting from removing skip_block from the quantize
functions.

BUG=webm:1459

Change-Id: I7f52b53f959b4654b3d4517ebda31a678f4d0fde
2017-08-22 12:10:55 -07:00
James Zern 419ce36294 Merge "ppc: Add vpx_idct16x16_256_add_vsx" 2017-08-22 00:48:39 +00:00
Shiyou Yin bff5aa9827 Merge "vpx_dsp:loongson optimize vpx_subtract_block_c (case 4x4,8x8,16x16) with mmi." 2017-08-22 00:37:23 +00:00
Johann 2c56bb97f2 quantize: ignore skip_block in arm
Change-Id: Icfb70687476b2edb25d255793ba325b261d40584
2017-08-21 14:37:50 -07:00
Johann c02fdd0258 quantize: ignore skip_block in x86
Change-Id: I9a963e99f08761f0c8d6a305619270b2f1c4edf8
2017-08-21 14:37:03 -07:00
Johann b527b47312 quantize fp: ignore skip_block in arm
Change-Id: Ie8ac00efa826eead2a227726a1add816e04ff147
2017-08-21 14:34:48 -07:00
Johann 7b13d99b98 quantize fp: ignore skip_block in x86
Change-Id: I1272917c49cf6e6710e52c36535b2fc8c8dced78
2017-08-21 14:33:41 -07:00
Johann 661efeca97 quantize test: test _fp_ version of quantize
None of the x86 optimizations pass the tests.

Change-Id: Ic67f2ba1977b657e68f2a13b0711fc5fcbafd909
2017-08-21 12:29:41 -07:00
Johann 13eed991f9 Remove skip_block from quantize
This condition is handled before this code is reached. The ssse3 version
of the function has always crashed when attempting to handle the
skip_block condition.

Add assert() and comments regarding the usage of skip_block.

Removing the parameter is a fairly involved process so leave it be for
the moment.

Change-Id: Ib299f6fc6589d7ee102262cc74a7aeb60110bc5a
2017-08-21 09:49:04 -07:00
Scott LaVarnway eab3f5e0cc vpx_dsp: get32x32var_avx2() cleanup
renamed to get32x16var_avx2()

BUG=webm:1404

Change-Id: Icb8f3986c9c9c646e13a69430db7235fc7e1a036
2017-08-18 13:44:09 -07:00
Scott LaVarnway 2c5478e383 Merge "vpx_dsp: vpx_get16x16var_avx2() cleanup" 2017-08-18 20:30:59 +00:00
Scott LaVarnway 2f7497f341 vpx_dsp: vpx_get16x16var_avx2() cleanup
BUG=webm:1404

Change-Id: I88aceb07f4db4870a06eee21d87296974ce3221a
2017-08-18 12:23:49 -07:00
Johann Koenig 1426f04e91 Merge "quantize: normalize intermediate types" 2017-08-18 16:00:28 +00:00
Shiyou Yin 7d82e57f5b vpx_dsp:loongson optimize vpx_subtract_block_c (case 4x4,8x8,16x16) with mmi.
Change-Id: Ia120ad1064d0b6106d9685cf075bdab373eef19e
2017-08-18 09:06:49 +08:00
James Zern bb15fd51be highbd_idct32x32*,idct32_34_4x32_quarter_1_2: fix typo
135 -> 34

fixes unused function warnings for highbd_idct32_34_4x32_quarter_[12]

Change-Id: I4f50ff6ea514200af93dd59ff94c7f9717409682
2017-08-17 15:37:38 -07:00
Johann 7f602d6114 quantize: normalize intermediate types
Despite abs_coeff being a positive value, all the other implementations
treat it as signed which simplifies restoring the sign.

HBD builds cast qcoeff to avoid a visual studio warning. Match
vp9_quantize.c style of casting the entire expression.

Change-Id: I62b539b8df05364df3d7644311e325288da7c5b5
2017-08-17 12:34:28 -07:00
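
Keeping the magnitude in a signed type makes it straightforward to restore the sign with the same xor-and-subtract idiom used to remove it. The sketch below shows that idiom in isolation. It assumes a 32-bit `int` and an arithmetic right shift, and the helper names are hypothetical rather than taken from the quantize code.

```c
#include <assert.h>

/* All-ones (-1) for negative x, 0 otherwise. Assumes a 32-bit int and an
 * arithmetic right shift. */
static int sign_mask(int x) { return x >> 31; }

/* Branchless absolute value: (x ^ sign) - sign, where sign is the mask
 * returned by sign_mask(x). */
static int abs_from_sign(int x, int sign) { return (x ^ sign) - sign; }

/* Reapplies a saved sign mask to a non-negative magnitude. */
static int restore_sign(int magnitude, int sign) {
  assert(magnitude >= 0);
  return (magnitude ^ sign) - sign;
}
```

A caller would save `sign_mask(coeff)`, process `abs_from_sign(coeff, sign)`, and finish with `restore_sign(result, sign)`.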
James Zern e038d1610e inv_txfm_sse2.h: correct idct*/iadst* prototypes
fixes mismatch between prototypes and definitions

Change-Id: Ib5e7dfcce244dbb8401815be2cdd183d96792652
2017-08-16 23:06:09 -07:00
Paul Wilkins f64e14047d Merge "Prevent parameters that can cause invalid ARF groups." 2017-08-16 18:25:57 +00:00
Paul Wilkins 372336d1e5 Merge "Fix corrupt arf groups due to low "lag_in_frames"" 2017-08-16 18:25:29 +00:00
Linfeng Zhang f95686895b Merge changes I08b562b6,Ia275940a,I51106e90
* changes:
  Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1}
  Update highbd idct x86 optimizations.
  Update 32x32 idct sse2 and ssse3 optimizations.
2017-08-16 16:36:37 +00:00
paulwilkins b814e2d898 Prevent parameters that can cause invalid ARF groups.
Having a very low "lag_in_frames" value could cause the encoder to create
incorrect / corrupt ARF groups including displayed frames that update the
ARF buffer and false overlay frames that are coded at low rate but are not
actually overlays of a real ARF frame.

This is linked to a reported unit test "slow down" where the chosen parameters
(lag of 3 frames) gave rise to such "broken" ARF group(s).

See also BUG=webm:1454

Change-Id: If52d0236243ed5552537d1ea9ed3fed8c867232c
2017-08-16 14:33:59 +01:00
paulwilkins 48110d0f79 Fix corrupt arf groups due to low "lag_in_frames"
Having a very small value for "lag_in_frames" can result in
corrupt arf groups including displayed frames that update
the arf buffer and fake overlay frames that are not in fact
overlays of real arfs but are nevertheless starved of bits.

Leaving lag_in_frames at the default of 25 for these 5 frame two
pass VBR tests should now give rise to a valid ARF coding pattern
as follows:-  K(ey), A(rf), N(ormal), N, N, O(verlay).

This change is part of a response to BUG=webm:1454 where broken
arf groups interacted badly with a change that corrects for large rate
misses. However, it may still in some cases increase encode time by
virtue of the fact that the unit test now codes a correct coding pattern
with "hidden" ARF frames.

Change-Id: Ifd0246a4c1d0be247247c754024d7a4ed5f66a6b
2017-08-16 14:07:24 +01:00
Paul Wilkins 0472382dbe Merge "Fix for encoder slowdown (for speeds >= 3)" 2017-08-16 13:01:38 +00:00
paulwilkins e15be3025b Fix for encoder slowdown (for speeds >= 3)
Some clips in the nightly unit test exhibit a significant encoder slowdown, which
appears to bisect to Change-Id: I692311a709ccdb6003e705103de9d05b59bf840a.

The above change allowed for emergency iterations of the recode loop and
adjustment of the Q range if there is a large rate miss.

This patch disables the above adaptation for cases of cpu_speed >= 3 or more
specifically where cpi->sf.recode_loop >= ALLOW_RECODE_KFARFGF.

For speeds >= 3 the code does not currently run a dummy bit pack operation
inside the recode loop. Without this dummy pack operation there is no up to
date estimate of the current frame's size to use as a basis for assessing the
requirement for a recode. In practice it was using the previous frame's size (or 0
for the first frame) which could cause odd behavior.

If we require the emergency rate correction added in Change-Id: I6923.. for
the higher speed settings it will be necessary to enable the dummy pack
which will in turn hurt encode speed.

BUG=webm:1454

Change-Id: I4fb3c6062ca9508325a6f31582f8e80f1a9b126f
2017-08-16 10:56:52 +01:00
Jerome Jiang 6b9c691daf Merge "Clean up writing YUV files for debug purpose." 2017-08-15 18:28:54 +00:00
Marco Paniconi 14437d0fa6 Merge "vp9: Denoiser fix: use correct bsize for skin detection." 2017-08-15 17:53:08 +00:00
Jerome Jiang a153080b55 Clean up writing YUV files for debug purpose.
Change legacy vp8/9_write_yuv_frame to vpx_write_yuv_files.
Delete some flags that can be enabled during build.

To enable writing denoised YUV, use the following command line:
CFLAGS='-DOUTPUT_YUV_DENOISED' ./configure
--enable-vp9-temporal-denoising

For skinmap, use CFLAGS='-DOUTPUT_YUV_SKINMAP'

Change-Id: I236974ac8b3cf279d20c4dc7f6162d8b480b6528
2017-08-15 10:44:03 -07:00
Johann Koenig c59d1a4dc7 Merge changes I1f1edeaa,I89313cac
* changes:
  quantize: silence unsigned overflow warning
  quantize test: quiet overflow warning
2017-08-15 17:37:59 +00:00
Marco e9ccc6fe79 vp9: Denoiser fix: use correct bsize for skin detection.
Change-Id: I9d201fa3a4b00ebd147b57ed519fab8d59b0a802
2017-08-15 10:02:19 -07:00
Johann 77ed4414d6 quantize: silence unsigned overflow warning
The result of the xor operation is unsigned. If coeff was negative,
this results in an unsigned value - INT_MIN.

Change-Id: I1f1edeaa6de1f4c68b848e8a82a666d390b749f0
2017-08-15 09:48:24 -07:00
Scott LaVarnway 7e8357d664 Merge "vp9: strip temporal filter code" 2017-08-15 15:35:33 +00:00
Johann 08cb7b5c68 quantize test: quiet overflow warning
Promote the result of RandRange to signed

Change-Id: I89313cace3bcbe9af96946bef00b6857fc48b128
2017-08-15 08:28:09 -07:00
Paul Wilkins ca393c9726 Merge "Patch relating to Issue 1456." 2017-08-15 14:57:56 +00:00
Paul Wilkins 5009302bce Merge "Enable emergency fast Q adaptation for VBR test case." 2017-08-15 14:57:22 +00:00
Linfeng Zhang d72e20b123 Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1}
BUG=webm:1412

Change-Id: I08b562b60fa85fbc2fec1c15c323a3444b44618f
2017-08-14 17:05:22 -07:00
Linfeng Zhang 69775d2f40 Update highbd idct x86 optimizations.
BUG=webm:1412

Change-Id: Ia275940af7d7d8637e9a851a9e39d655bfbe4069
2017-08-14 16:59:50 -07:00
Linfeng Zhang 3f05a70c41 Update 32x32 idct sse2 and ssse3 optimizations.
Change-Id: I51106e90344035452621c49a6e1be7d5276b6c70
2017-08-14 16:59:31 -07:00
Scott LaVarnway fa85cf131c vp9: strip temporal filter code
when CONFIG_REALTIME_ONLY is enabled.

BUG=webm:1446

Change-Id: Id547783ec75383966c40ab5cf6abb4a0f7984f52
2017-08-14 14:27:53 -07:00
Johann Koenig ff184e482a Merge changes I4b4beab1,I02f74dec
* changes:
  quantize test: check skip_block
  quantize test: use negative input
2017-08-14 20:52:52 +00:00
Johann Koenig 45b39750d6 Merge "temporal filter test: adjust inputs and runtime" 2017-08-14 20:46:22 +00:00
Jerome Jiang 2caff16151 vp9 svc: Fix the stats output when sl = 1.
Actual frame size and bitrate are all 0 when using SVC sample encoder
with sl = 1 because the stats are set in parse_superframe_index which
will not calculate properly when sl = 1 since there is no superframe.

Use pkt->data.frame.sz instead when sl = 1.

Change-Id: I93f5e98a4c779e32b007e1564ba5396af9e34ad6
2017-08-14 12:00:00 -07:00
Scott LaVarnway 1ab60466ec Merge "vp9: strip mb graph code" 2017-08-14 18:01:44 +00:00
Johann c06d6649c5 temporal filter test: adjust inputs and runtime
Use input with a narrow range because the filter only applies when the
frames are similar.

Run CompareReferenceRandom more times. Especially before narrowing the
input range, the filter frequently did not apply.

Change-Id: Ie249bedf6d0d33dfa5884611cb1835788e418b38
2017-08-14 17:24:11 +00:00
James Zern 746c0eab3b disable SSSE3/VP9QuantizeTest* in hbd builds
this test fails with the configuration similar to the assembly prior to:
d52cb5972 quantize: copy ssse3 optimizations to intrinsics

BUG=webm:1458

Change-Id: Idc5c0b84c0598259fc49609a9f0756de531d3baf
2017-08-14 09:31:14 -07:00
Scott LaVarnway e702b68b6c vp9: strip mb graph code
when CONFIG_REALTIME_ONLY is enabled.

BUG=webm:1446

Change-Id: I4b1b8e9a456830ba1b1bd3a8882e038d37ee7903
2017-08-11 12:59:40 -07:00
Johann e022ce84ac Rename vp8 quantize file
BUG=webm:1457

Change-Id: Ie8fae018ad8417724fde087055b90228850d631d
2017-08-11 10:44:36 -07:00
Jerome Jiang d48be6ad73 Merge "vp9 SVC: Fix the denoiser frame buffer management." 2017-08-11 00:54:35 +00:00
Jerome Jiang 0f8ebddec4 vp9 SVC: Fix the denoiser frame buffer management.
Change the denoiser frame buffer management for SVC to more generally
handle the layer patterns in SVC (where last is not always refreshed).

This change is only for SVC with denoising and is bitexact.

Change-Id: Ic2b146a924cdf6e7114609158afa3d4880fe3fae
2017-08-10 16:56:46 -07:00
Linfeng Zhang 15193ce51f Merge "Clean highbd idct x86 code with inline functions" 2017-08-10 20:25:18 +00:00
Johann Koenig 9bb8ce5efb Merge "neon: vpx_quantize_b_32x32" 2017-08-10 15:42:49 +00:00
Johann Koenig 0b393ae505 Merge "quantize: copy ssse3 optimizations to intrinsics" 2017-08-10 15:42:20 +00:00
paulwilkins db8fa86a6c Patch relating to Issue 1456.
Testing of 4k videos encoded with a fixed arbitrary chunking interval
uncovered a bug whereby if a chunk ends 1 frame before a real scene cut,
the next chunk may be encoded with two consecutive key frames at the start
with the first being assigned 0 bits.

This fix ensures that where there is a key frame group of length 1 it is
at least assigned 1 frame's worth of bits not 0.

See also patch Change-Id: I692311a709ccdb6003e705103de9d05b59bf840a
which by virtue of allowing fast adaptation of Q made this bug more visible.

BUG=webm:1456

Change-Id: Ic9e016cb66d489b829412052273238975dc6f6ab
2017-08-09 16:34:43 +01:00
Linfeng Zhang 39da7fb786 Clean highbd idct x86 code with inline functions
Created inline functions highbd_butterfly_cospi16_sse2()
and highbd_butterfly_cospi16_sse4_1()

BUG=webm:1412

Change-Id: Icbc53a73712b6207379872a5e88d0a4d09e2322a
2017-08-08 17:53:28 -07:00
Marco Paniconi 68805583e9 Merge "vp9: Partition logic adjustment for speed 6 feature." 2017-08-08 23:08:10 +00:00
Johann 357adb68b2 quantize test: check skip_block
Not all sizes were tested previously. Only 4x4 and 32x32

Change-Id: I4b4beab1b92a810a097a7306de04cc9e0e260315
2017-08-08 14:21:58 -07:00
Johann 1092cc7f1a quantize test: use negative input
coeff contains signed values.

Change-Id: I02f74decf30379a28122169ab3e844d0f3bd7d23
2017-08-08 14:19:56 -07:00
Johann 93166c5e51 neon: vpx_quantize_b_32x32
With skip block the neon is about twice as fast as C.

The neon has no shortcut for coeff < zbin so it always takes the
same amount of time. Even if the C can take the shortcut, it is over
twice as fast in neon. If it can't, that gap increases to over 10x.

BUG=webm:1426

Change-Id: I400722146c1b5a5f6289f67d85fd642463d2bfc6
2017-08-08 14:05:18 -07:00
Johann d52cb59729 quantize: copy ssse3 optimizations to intrinsics
Fairly minor differences from sse2. pabsw and psignw are the big gains.
Also re-uses some values in eob calculation to avoid an extra pcmp.

Fixes test failures in HBD and OS X builds.

Allows using it in 32bit builds, where it is about 40% faster than sse2.

Substantially faster than the assembly for skip_block. 10-20% faster the
rest of the time.

Change-Id: If783bb3567e561e47667e10133b9c84414a334e2
2017-08-08 12:22:14 -07:00
Marco 427de67e63 vp9: Partition logic adjustment for speed 6 feature.
When adapt_partition_source_sad is enabled (currently only at
speed 6 for resoln <= 360p): use lower subsize (8x8 instead of 16x16)
for nonrd_select_partition on 32X32 blocks.

And force avoiding rectangular partition checks in
nonrd_pick_partition for speed >= 6.

Small increase ~0.5 in metrics for speed 6 on rtc_derf,
no change in speed.

Change-Id: Id751bc8f7573634571b2d6f5e29627cd5cebccae
2017-08-08 11:31:27 -07:00
Linfeng Zhang 853165ba39 Update 32x32 idct sse2 funcs, add partial case 135
Change-Id: I2b9add83f6fd8f9138fed3bec04a59877a237a6a
2017-08-07 17:37:02 -07:00
Linfeng Zhang d670678f26 Rename highbd_multiplication_and_add_xx() to highbd_butterfly_xx()
in idct x86 code

Change-Id: I5159499a73a5c1b680516f6ca9c3d84f00c35083
2017-08-04 15:33:37 -07:00
Linfeng Zhang fa829e0e5a Replace multiplication_and_add() with butterfly() in idct x86 code
Change-Id: I266e45a3d75a5357c7d6e6f20ab5c6fdbfe4982e
2017-08-04 15:33:34 -07:00
Linfeng Zhang c9fb719ee1 Update butterfly() in idct x86 optimizations.
Change-Id: Ic73e03bab9fdc085146f52094014db4af36ad701
2017-08-04 15:33:28 -07:00
Linfeng Zhang 7f20c3ac44 Add vpx_highbd_idct16x16_{10, 38, 256}_add_sse4_1
BUG=webm:1412

Change-Id: I8877c986b4042f7b8e33f5674c86700675a0e4ca
2017-08-04 15:31:17 -07:00
Linfeng Zhang 22b6dc9fdf Update for loop increment of idct x86 functions
Change-Id: Ided7895eaf41d5bc9d64fe536a17f5a078da68d4
2017-08-04 15:29:19 -07:00
Linfeng Zhang 0c61331244 Update high bitdepth 16x16 idct x86 code
Prepare for high bitdepth 16x16 idct sse4.1 code.
Functions are only moved and renamed.

BUG=webm:1412

Change-Id: Ie056fe4494b1f299491968beadcef990e2ab714a
2017-08-04 15:12:33 -07:00
Johann Koenig cbb83ba4aa Merge "quantize test: consolidate sizes" 2017-08-04 20:34:50 +00:00
Johann 9578a84205 quantize test: consolidate sizes
Pass a max txfm size parameter and combine the base quantize
test with the 32x32 test.

Change-Id: I72ddf020fe6888e864ea9f3642ee2d9a8e48a04b
2017-08-04 12:45:32 -07:00
Scott LaVarnway c42517568d vpx_dsp: merge avx2 variance files
BUG=webm:1404

Change-Id: Ieb8f85c3811b05df78722cb41eeb1166966ceec4
2017-08-04 07:49:30 -07:00
Kaustubh Raste 39e8b8dac6 Fix mips dspr2 6 tap filter clobber list
Change-Id: Ib7c07e6ce00a5c7e59113b16e6661a8369f9e646
2017-08-04 10:56:56 +05:30
Linfeng Zhang e921c7ba8d Merge "Rewrite vpx_idct16x16_{10,256}_add_sse2() and add case 38 function" 2017-08-04 01:16:35 +00:00
Scott LaVarnway f6c6f37e0c Merge "vpx_dsp: Use correct check for halfpel in" 2017-08-03 23:17:09 +00:00
Linfeng Zhang 563d58ab84 Rewrite vpx_idct16x16_{10,256}_add_sse2() and add case 38 function
BUG=webm:1412

Change-Id: I945f0fb6807b8948747243794dc7352b959221f7
2017-08-03 13:59:47 -07:00
Linfeng Zhang 6624f20785 Merge changes I76727df0,I66297d78,I1d000c6b
* changes:
  Extract inlined 16x16 idct sse2 code into header file
  Add transpose_32bit_8x4() sse2 optimization
  Update x86 idct optimization
2017-08-03 20:51:02 +00:00
Scott LaVarnway 8334a48d3a vpx_dsp: Use correct check for halfpel in
vpx_sub_pixel_variance32xh_avx2() and
vpx_sub_pixel_avg_variance32xh_avx2

see:
17fae3a Change to use correct check for halfpel

Change-Id: Ib0741c5c2fd011e9650ca62b76009f1b59fdbe4c
2017-08-03 06:57:40 -07:00
paulwilkins 76d77aa013 Enable emergency fast Q adaptation for VBR test case.
Enable fast adaptation of Q when there is a large overshoot
for the #ifdef AGGRESSIVE_VBR test case.

AGGRESSIVE_VBR is not currently enabled by default.

Change-Id: I7240bb6589795964b6b0b66df4468e4f21504e0f
2017-08-03 12:06:07 +01:00
Yunqing Wang 6843e7c7f3 Merge "Force the bit exactness in the first pass" 2017-08-03 00:03:10 +00:00
Linfeng Zhang 15a47db730 Extract inlined 16x16 idct sse2 code into header file
Will be called by high bitdepth functions.

Change-Id: I76727df00941b5a27adceaba8347f275475fcd8c
2017-08-02 16:17:43 -07:00
Linfeng Zhang 8c0ab7607e Add transpose_32bit_8x4() sse2 optimization
Change-Id: I66297d78b38db718cfe3ebb8ea972f5a72c17955
2017-08-02 16:15:58 -07:00
Yunqing Wang bfd0f41f9b Force the bit exactness in the first pass
Originally, to keep the first pass fast, the first-pass stats for
row_mt_mode = 0 and row_mt_mode = 1 were not bit exact, but the
difference was so small that it did not cause a mismatch between the
final bitstreams. However, if the encoder changes, this minor difference
may cause a mismatch. Thus, this patch always forces the first pass to
be bit exact.

BUG=webm:1453

Change-Id: I2b67cf529dee81f660f9d9e7fe9a60ea3c7b12b8
2017-08-02 15:58:39 -07:00
Johann Koenig 787970a625 Merge "quantize test: add speed comparison" 2017-08-02 21:16:35 +00:00
Marco b9577e07fc vp8: Drop due to overshoot for non-screen content.
For 1 pass CBR mode:
Apply the logic for dropping (and re-adjusting rate control)
due to large overshoot to the case of non-screen content when
drop_frames_allowed is enabled.

For the non-screen content case: add additional condition that
rate correction factor is close to minimum state, and flag to
constrain the frequency of the dropping.

Also handle the case of temporal layers and multi-res encoding.
Add some flags/counters to the layer context for temporal layers.
For multi-res: drop due to overshoot is checked on lowest stream,
and if overshoot is detected we force drops on all upper streams
for that frame.

This feature is to avoid large frame sizes on big content
changes following low content period.

No change in behavior for screen_content_mode = 2.

Change-Id: I797ab236cbbf3b15cad439e9a227fbebced632e6
2017-08-02 13:12:48 -07:00
Scott LaVarnway 698e56f26c Merge "vpxdsp: variance_impl_avx2.c cleanup" 2017-08-02 19:08:10 +00:00
Johann 1059b5cc52 quantize test: add speed comparison
Test some possible scenarios.

Change-Id: I1a612e7153b31756be66390ceea55877856d5a33
2017-08-02 09:33:35 -07:00
Scott LaVarnway 632fe8286a vpxdsp: variance_impl_avx2.c cleanup
BUG=webm:1404

Change-Id: I8d8498009e5ef7bf1137e4ff16ec81738a020b02
2017-08-02 05:57:39 -07:00
shiyou yin 0e87b16022 Merge "loongson mmi configuration patch." 2017-08-02 01:08:43 +00:00
Linfeng Zhang 6738ad7aaf Update x86 idct optimization
Move constant coefficients preparation into inline function.

Change-Id: I1d000c6b161794c8828ff70768439b767e2afea1
2017-08-01 14:40:12 -07:00
Linfeng Zhang c0490b52b1 Merge "Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2" 2017-08-01 21:39:39 +00:00
Johann Koenig 847394fe77 Merge "neon: vpx_quantize_b" 2017-08-01 16:44:31 +00:00
Paul Wilkins 3be14200fc Merge "Respond more rapidly to excessive local overshoot." 2017-08-01 08:58:36 +00:00
Marco Paniconi c22b17dcef Merge "vp9: Adjust noise estimation for 360p." 2017-08-01 02:48:13 +00:00
Marco 5d6c1c2d8f vp9: Adjust noise estimation for 360p.
Change-Id: Ib76875232491b14f7114061e8e913e87004427a0
2017-07-31 17:12:58 -07:00
Linfeng Zhang bf14d468c1 Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2
This replaces commit aa1c4cd, which has a bug and was reverted in
commit 3c73e58.

The bug is caused by rounding -step1[5] in highbd_idct8x8_12_half1d().

Change-Id: I37b3a5f0d91815f2dc570209091dc6626fd178a8
2017-07-31 16:36:13 -07:00
James Zern 78e2da3e42 Merge "highbd_inv_txfm_sse4: make << of neg. val a multiply" 2017-07-31 22:43:41 +00:00
Johann 2d6b5df657 neon: vpx_quantize_b
With skip block or coeff < zbin it is about twice as fast as C.

If most coeff values are > zbin it is about 10-15x as fast as C.

BUG=webm:1426

Change-Id: I5d3c007b014a372d5ef0882b39bb48983b4131c7
2017-07-31 10:38:46 -07:00
YinShiyou 2758de5cb2 loongson mmi configuration patch.
enable loongson mmi optimization: ../configure --enable-mmi

Change-Id: I7792c3adeac1d5b573917d7857bba6c1cc05fea5
2017-07-31 17:29:36 +00:00
Marco Paniconi ebb023deb6 Merge "Revert "Revert "vp9: Speed feature to adapt partition based on source_sad.""" 2017-07-31 14:58:15 +00:00
Marco 999bd6ea84 vp9: Fix denoising condition when pickmode partition is used.
When the superblock partition is based on the nonrd-pickmode,
we need to avoid the denoising. Current condition was based on
the speed level. This change is to make the condition at the
superblock level, as the switch in partitioning may be done at
sb level based on source_sad (e.g., in speed 6).

Change-Id: I12ece4f60b93ed34ee65ff2d6cdce1213c36de04
2017-07-30 23:16:38 -07:00
Jerome Jiang f027908ad0 Revert "Revert "vp9: Speed feature to adapt partition based on source_sad.""
This reverts commit c9266b8547.

Disable source_sad when resolution > 1080P. The test should
pass now.

BUG=webm:1452

Change-Id: I72dde88e66590ff9e41da5e5dd83f5550a83f082
2017-07-30 19:49:31 -07:00
James Zern 78155b7ed5 highbd_inv_txfm_sse4: make << of neg. val a multiply
left shifting a negative value is undefined; quiets a ubsan warning.
this is applied to a constant, no change in the generated code.

Change-Id: I595f0ff7904ef025e07bb80234293d958dc9f254
2017-07-30 12:48:28 -07:00
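
The point in the commit above (a left shift of a negative value is undefined, so a multiply is used instead) can be sketched as follows. This is a minimal illustration, not the libvpx code: `scale_pow2` and the sample values are hypothetical.

```c
#include <stdint.h>

/* Illustrative only: scale a possibly negative value by 2^bits.
 * Writing `v << bits` would be undefined for v < 0, so multiply instead.
 * Assumes 0 <= bits < 31 and that the product fits in int32_t. */
static int32_t scale_pow2(int32_t v, int bits) {
  return v * (int32_t)(1 << bits);
}
```

When the operand is a compile-time constant, as the commit notes, a compiler can fold either form to the same value, so the rewrite only removes the undefined construct.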
Matt Oliver e6f66cb71f project: Fix for changes in gitlink cli usage. 2017-07-30 21:20:19 +10:00
James Zern facb124941 Merge "Revert "vp9: Speed feature to adapt partition based on source_sad."" 2017-07-30 03:26:10 +00:00
James Zern c9266b8547 Revert "vp9: Speed feature to adapt partition based on source_sad."
This reverts commit 064fc570ff.

This causes an assertion failure in vp9_mcomp.c when running
gtest_filter=VP9/MotionVectorTestLarge.OverallTest/41:
`mv->col >= -((1 << (11 + 1 + 2)) - 1) && mv->col < ((1 << (11 + 1 + 2))
- 1)'

Change-Id: I449e777bf18b661cb3f1d82253610c55c51687f6
2017-07-29 11:36:58 -07:00
James Zern d35b627340 Revert "Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2"
This reverts commit aa1c4cd140.

This fails the following tests with extreme input coefficients:
SSE2/InvTrans8x8DCT.CompareReference/0
SSE2/InvTrans8x8DCT.CompareReference/2

previously the optimized path was skipped in this range

Change-Id: I9af015a46eba96208834a219fafd651d37556a80
2017-07-29 11:12:27 -07:00
Marco Paniconi 5d0bef4763 Merge "vp9: Adjust logic in source sad for screen content." 2017-07-29 01:46:58 +00:00
Marco Paniconi e48dfcead1 Merge "vp9: Speed feature to adapt partition based on source_sad." 2017-07-29 01:45:19 +00:00
Jerome Jiang ac211fe23e vp9: Adjust logic in source sad for screen content.
Change-Id: I917d106f4c95ea44e413e23881f6303982e1a6a3
2017-07-28 17:25:41 -07:00
Marco 064fc570ff vp9: Speed feature to adapt partition based on source_sad.
Move the source_sad feature to speed 6 (from speed 7), and
add speed feature to switch from the variance-based partition
to reference_partition (which uses nonrd-pickmode for bsize selection)
if source_sad is high.

Currently used only for speed 6 for resoln <= 360p.
About 4-5% improvement on 360p in RTC set.
Some speed slowdown, but still ~30% faster than speed 5.

Change-Id: Ib0330ee5fe9fdd2608aed91359a2a339d967491c
2017-07-29 00:20:26 +00:00
Urvang Joshi 7105e66d19 Remove the DP version of vp9_optimize_b().
The greedy version was already enabled by default here:
https://chromium-review.googlesource.com/c/546848/

And the speed+compression gains from greedy version were already
mentioned here:
https://chromium-review.googlesource.com/c/531675/

Change-Id: Iad9f7d03490c845ad1e230af028c9d39edddca97
2017-07-28 23:12:57 +00:00
Linfeng Zhang 75653b7032 Merge changes Ia0e20f5f,I28150789,I35df041b,I221dff34
* changes:
  Update vpx_idct16x16_10_add_sse2()
  Add vpx_idct16x16_38_add_sse2()
  Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2
  Refactor highbd idct 4x4 and 8x8 x86 functions
2017-07-28 22:43:00 +00:00
James Zern 3c73e587d1 Revert "quantize ssse3: declare all variables"
This reverts commit 03f5e300d6.

This causes test failures under OSX:
SSSE3/VP9QuantizeTest.EOBCheck/0
SSSE3/VP9QuantizeTest.OperationCheck/0

Change-Id: I122732717ead1f7af5b04c529a6948e382e5e59b
2017-07-28 01:22:16 -07:00
Linfeng Zhang 5232e35bc2 Update vpx_idct16x16_10_add_sse2()
Change-Id: Ia0e20f5fa47382af5785221eebb05212b40bd35c
2017-07-27 18:03:25 -07:00
Linfeng Zhang 7f4acf8700 Add vpx_idct16x16_38_add_sse2()
Change-Id: I28150789feadc0b63d2fadc707e48971b41f9898
2017-07-27 18:02:43 -07:00
Linfeng Zhang aa1c4cd140 Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2
BUG=webm:1412

Change-Id: I35df041b757d42278ac7a5cdbd909e8ffcee1455
2017-07-27 18:02:36 -07:00
Linfeng Zhang 9c43d81bc2 Refactor highbd idct 4x4 and 8x8 x86 functions
BUG=webm:1412

Change-Id: I221dff34dd5f71b390b5e043d0a137ccb0a01dec
2017-07-27 18:01:03 -07:00
Johann Koenig a83e1f1d53 Merge "quantize ssse3: declare all variables" 2017-07-27 21:18:35 +00:00
Jerome Jiang 905b8ec27f Merge "vp8: Remove isolated skin & non skin blocks." 2017-07-27 20:24:08 +00:00
Jerome Jiang 56d95b77f5 vp8: Remove isolated skin & non skin blocks.
Neutral on RTC metrics and speed on Pixel.

Change-Id: I26b907483fe133e6e4c1009d147631f0d0e0f2fb
2017-07-26 14:44:36 -07:00
James Zern 1c666465af inv_txfm_{sse2,ssse3}: clear conversion warnings
visual studio reports tran_high_t (int64) -> short in calls to
_mm_set1_epi16

Change-Id: Icb8d1baee77ad3d45edb1477a443d3e648f0b745
2017-07-25 20:13:49 -07:00
James Zern 62682ac8ad highbd_idct*_sse*.c: clear conversion warnings
visual studio reports tran_high_t (int64) -> int in calls to
_mm_setr_epi32

Change-Id: Ic2247c8e3800991202151790d78bd94c4f4aed05
2017-07-25 20:11:09 -07:00
James Zern 85736e616e vpx_variance16x16_sse2: correct cast order
allow the right shift to operate on 64-bits, this matches the rest of
the implementations

previously:
b0f1ae147 vpx_get16x16var_avx2: correct cast order

Change-Id: I632ee5e418f3f9b30e79ecd05588eb172b0783aa
2017-07-25 16:45:40 -07:00
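
A small sketch of why the cast order matters when the squared sum is shifted: narrowing to 32 bits before the shift discards high bits of the 64-bit product, while shifting first keeps them. The helper names are hypothetical and the functions are not the libvpx implementations; `shift` is assumed to be log2 of the pixel count.

```c
#include <stdint.h>

/* Shift the 64-bit product first, then narrow to 32 bits. */
static uint32_t sq_shift_wide(int sum, int shift) {
  return (uint32_t)(((int64_t)sum * sum) >> shift);
}

/* Narrow first: the cast binds tighter than >>, so the high bits
 * of the product are lost before the shift is applied. */
static uint32_t sq_shift_narrow(int sum, int shift) {
  return (uint32_t)((int64_t)sum * sum) >> shift;
}
```

For a 16x16 block of 8-bit pixels the squared sum still fits in 32 bits, so both orders agree; the two only diverge once the product exceeds 32 bits, for example with a 64x64 block.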
Alexandra Hájková 666c543f7b ppc: Add vpx_idct16x16_256_add_vsx
Change-Id: Ibc3f7965423fd91179f8d8e77c7ae3e6d7f80572
2017-07-25 12:34:15 +00:00
James Zern b0f1ae1475 vpx_get16x16var_avx2: correct cast order
allow the right shift to operate on 64-bits, this matches the rest of
the implementations

missed in:
6acd061aa variance_avx2: sync variance functions with c-code

Change-Id: Icae436b881251ccb9f9ed64fcbf8d358c58a4617
2017-07-24 16:29:44 -07:00
James Zern 8836e46ffd set_var_thresh_from_histogram: prevent negative variance
For 8-bit the subtrahend is small enough to fit into uint32_t.

For 10/12-bit apply:
63a37d16f Prevent negative variance

previously:
47b9a0912 Resolve -Wshorten-64-to-32 in highbd variance.
c0241664a Resolve -Wshorten-64-to-32 in variance.

Change-Id: I181c85f0b9a03da37c2e8b89482d48aa3dbc0aee
2017-07-22 13:27:32 -07:00
Marco 8c7a60e04d vp8: Fix compile warning in vp8_multi_resolution_encoder.c
Change-Id: I49c960179dfc1902aa5e5c99915789878c06bc3d
2017-07-20 14:19:43 -07:00
Johann Koenig e8bd534c42 Merge "quantize test: promote RandRange() result to signed" 2017-07-20 19:46:05 +00:00
Johann Koenig 0c30b75f40 Merge "quantize test: lowbd functions do not pass in highbd" 2017-07-20 19:45:59 +00:00
Jerome Jiang 494188505b Merge "vp9: Removed unused skin detection function." 2017-07-20 16:58:01 +00:00
Johann af08fbb444 quantize test: promote RandRange() result to signed
Avoid unsigned overflow warning:
unsigned integer overflow: 19974 - 32703 cannot be represented in type
'unsigned int'

Change-Id: Ifebee014342e4c6f3b53306c0cad6ae0b465ac12
2017-07-20 08:17:48 -07:00
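
The warning quoted above comes from subtracting in an unsigned type. Unsigned wraparound is well defined in C, but sanitizers can be configured to report it, and a negative result is clearly what the test intends. A minimal sketch of the promotion, assuming a hypothetical helper `centered_value` and an unsigned random draw `r`:

```c
/* Illustrative only: `r` stands in for an unsigned random draw. */
static int centered_value(unsigned int r, int offset) {
  /* Convert before subtracting so the arithmetic is done in signed int.
   * Assumes r <= INT_MAX. */
  return (int)r - offset;
}
```

With the values from the warning, 19974 and 32703, the signed subtraction yields -12729 rather than a wrapped unsigned value.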
Johann c782f27ead quantize test: lowbd functions do not pass in highbd
qcoeff output looks OK but dqcoeff is no good.

BUG=webm:1448

Change-Id: I07211db8a8b74f1f45fdd059852e2de0e5ee18fd
2017-07-20 08:17:48 -07:00
Johann Koenig 4702bb26be Merge "quantize test: eob is output" 2017-07-20 15:17:26 +00:00
Johann Koenig e1809501d0 Merge "Earmark extra space for VSX." 2017-07-19 21:35:57 +00:00
Jerome Jiang 9dd992b6f0 Merge "Roll libwebm: Fix android build failure with NDK r15b." 2017-07-19 21:30:21 +00:00
Johann bde2e4aa36 quantize test: eob is output
eob values are generated by the function.

Change-Id: I8ce92100e83022bff99888a5a7e6ef378c49fda3
2017-07-19 14:17:19 -07:00
Han Shen b72d3e8a25 Earmark extra space for VSX.
Backend specific optimization for PPC VSX reads 16 bytes, whereas arm neon /
sse2 only reads <= 8 bytes. Although the extra bytes read are actually never
used, this is not a warrant for groping around.  Fixed by allocating more when
building for VSX. This is reported by asan.

Also note - PPC does have assembly that loads 64-bit content from memory - lxsdx
loads one 64-bit doubleword (whereas lxvd2x loads two 64-bit doublewords) from
memory. However, we only have "vec_vsx_ld" builtins that map to lxvd2x, and no
builtins for lxsdx. The only way to access lxsdx is through inline assembly,
which does not fit well in the original paradigm.

Refer:
  vsx:
    vpx_tm_predictor_4x4_vsx @ third_party/libvpx/git_root/vpx_dsp/ppc/intrapred_vsx.c
  neon:
    vpx_tm_predictor_4x4_neon @ third_party/libvpx/git_root/vpx_dsp/arm/intrapred_neon_asm.asm
  sse2:
    tm_predictor_4x4 @ third_party/libvpx/git_root/vpx_dsp/x86/intrapred_sse2.asm

BUG=b/63112600

Tested:
  asan tests passed.

Change-Id: I5f74b56e35c05b67851de8b5530aece213f2ce9d
2017-07-19 13:59:32 -07:00
Johann Koenig 89a116f4cb Merge "variance: call C comp_avg_pred" 2017-07-19 20:34:13 +00:00
Jerome Jiang 8ad9338e2e Roll libwebm: Fix android build failure with NDK r15b.
BUG=webm:1447

Change-Id: I8defe45cb94eb9c209ba72ce446786f24c14c0b8
2017-07-18 16:52:46 -07:00
Jerome Jiang 4526644615 vp9: Removed unused skin detection function.
Change-Id: I6702b7b11aa4ac9aac5fd54deef4377cdcb29c64
2017-07-18 14:52:04 -07:00
Jerome Jiang 59e461db1f Merge "vp9: Allocate alt-ref in denoiser for SVC." 2017-07-18 21:30:04 +00:00
Jerome Jiang babef23a5f Merge "vp9: Remove isolated skin & non-skin blocks." 2017-07-18 20:48:32 +00:00
Johann Koenig 56d3f1573a Merge changes I62c2e313,Ibd7a0337,I94e1d886
* changes:
  quantize test: test sse2 and avx optimizations
  quantize test: extend arrays
  quantize test: restrict and correct input
2017-07-18 20:42:39 +00:00
Johann 4b9a848bb3 variance: call C comp_avg_pred
Keep optimized code out of the reference implementation. This matches
the style of the other sub calls.

Change-Id: I3da6acd4f2c647b029c420e22ac9410a18259689
2017-07-18 20:22:53 +00:00
Jerome Jiang fd216268ad vp9: Allocate alt-ref in denoiser for SVC.
When SVC is used, allocate alt-ref in denoiser.

Change-Id: I1b17221b55b9444cd23b97d481b54ff8d296d857
2017-07-18 13:22:47 -07:00
Johann 03f5e300d6 quantize ssse3: declare all variables
Copy missing line from avx implementation.

Change-Id: I9755c5b4d4034867de6fa9f741c24bf49dce3a27
2017-07-18 12:32:57 -07:00
Johann 101981b736 quantize test: test sse2 and avx optimizations
ssse3 does not pass either of the tests.

avx 32x32 does not pass.

Change-Id: I62c2e31336fd2327327afaa0da896ad79a3def44
2017-07-18 12:08:16 -07:00
Jerome Jiang adbfc4308a vp9: Remove isolated skin & non-skin blocks.
0.007% regression on rtc and 0.004% gain on rtc_derf.
1 thread on QVGA,VGA and HD has ~0.2% speed regression while 2 threads has
~0.2% speed gain on Google Pixel.

Change-Id: Ia4a6ec904df670d7001e35e070b01e34149d23dc
2017-07-18 11:29:14 -07:00
Johann c7ebe82253 quantize test: extend arrays
Officially the quant structures are 8 elements, with one dc element and
7 repeated ac elements. The low bit depth optimizations take advantage
of this to fill the xmm registers. The high bit depth version manually
duplicates the values.

If all the optimizations were unified, the structure sizes could be
greatly reduced.

Change-Id: Ibd7a0337a7832ce2a1a05ee433c310077e1059ae
2017-07-18 09:55:47 -07:00
Johann cb61ba02f4 quantize test: restrict and correct input
Use only valid values for quantize inputs. These were determined by
looping over vp9_init_quantizer and looking for max and min values.

This allows extending the test to the low bit depth functions which were
not designed to handle all possible inputs but only valid inputs.

Change-Id: I94e1d8863a49ac227845b65c6b50130e10e6319e
2017-07-18 09:40:45 -07:00
Marco 817f68cdcf vp9: Disable usage of sb_use_mv_part for SVC.
To fix valgrind issues with SVC tests.
SVC encoding uses prune_evenmore, which is causing an uninitialized value.

Will re-enable later when issue is resolved.

Change-Id: I257ff878cf78197ddd813db056582a4d5fe94f44
2017-07-18 09:28:56 -07:00
Marco ad56371343 vp9: Fix to setting content_state for real-time mode.
When content_state_sb is set to LowVarHighSumdiff, don't reset
it to VeryHighSad. Visually better on clips with strong lighting changes.

Small/negligible change in RTC metrics and speed.

Change-Id: I20c383e3c4cf8d1149de5f9260449c0b7cf7c6aa
2017-07-17 16:21:25 -07:00
Marco 0c9e2f4c15 vp9: Reuse motion from choose_partitioning in NEWMV search.
When int_pro_motion_estimation is done for superblock in
choose_partitioning, use it to avoid the full_pixel_search
for NEWMV mode, if bsize is >= 32X32.

For speed > 7.
Small/neutral change on RTC metrics.
~1-2% speedup on arm on high motion clip.

Change-Id: I3cfe6833ff4bf75d4afa83eaf058ad45729de85b
2017-07-17 13:15:48 -07:00
James Zern 9223b947ca Merge "fix 'make exampletest' w/CONFIG_REALTIME_ONLY" 2017-07-15 18:37:10 +00:00
Jerome Jiang 682135fa60 vp9: Compute skin only for blocks eligible for noise estimation.
Change-Id: Iddcb83a5968db57cfd312c5bc44b2a226a2a3264
2017-07-14 15:14:30 -07:00
Marco 666e394d41 vp9: Adjust minmax threshold for variance partitioning.
Only affects speed 7. Improvement on high motion clips.

Change-Id: Ibddb68fed9c63207df29ffd790f9205b1cecf687
2017-07-13 21:19:37 -07:00
Johann e3fa4ae8e3 quantize test: use Buffer
Although the low bitdepth functions are identical (excepting the need
for larger intermediate values) they do not pass these tests. This
improves the error output to aid debugging.

Simplify buffer usage with Buffer and remove unnecessarily aligned
variables.

eob is a single element and never written using aligned instructions.

BUG=webm:1426

Change-Id: Ic95789a135cf1e8a3846d85270f2b818f6ec7e35
2017-07-13 15:54:48 -07:00
James Zern 960466939d fix 'make exampletest' w/CONFIG_REALTIME_ONLY
for tests that aren't explicitly testing 2-pass behavior use --passes=1
with this configuration

Change-Id: I6a1520ecc65d0f626486604310af29dacb9f197f
2017-07-13 10:47:20 -07:00
James Zern b578d59623 Merge "remove vp9_firstpass.c w/CONFIG_REALTIME_ONLY" 2017-07-12 23:30:04 +00:00
Johann Koenig e0d79bc7b5 Merge "sad4d neon: 64x[32,64]" 2017-07-12 20:15:00 +00:00
Marco Paniconi f6586b8bf8 Merge "vp9: Fix to SVC and denoising for fixed pattern case." 2017-07-12 19:13:05 +00:00
Johann Koenig 3158752980 Merge changes Ibf5e61dc,I44b48512,I7de2500c,I5081b5ce
* changes:
  sad4d neon: 32x[16,32,64]
  sad4d neon: 16x[8,16,32]
  sad4d neon: 8x[4,8,16]
  sad4d neon: 4x4, 4x8
2017-07-12 15:01:30 +00:00
Johann e381753926 sad4d neon: 64x[32,64]
Rewrite 64x64.

BUG=webm:1425

Change-Id: I336bf5a3aa4b783389c10b16a50f0f559346ecbf
2017-07-12 13:26:39 +00:00
Johann e1bde306c8 sad4d neon: 32x[16,32,64]
Rewrite 32x32. Use half the accumulator registers.

BUG=webm:1425

Change-Id: Ibf5e61dc4ba15056102aef8495f4a02c668c5d13
2017-07-12 13:25:18 +00:00
Johann 807ce8fb1e sad4d neon: 16x[8,16,32]
Rewrite 16x16. Use half the accumulator registers.

BUG=webm:1425

Change-Id: I44b48512b1e3629505d83c2645e800f53878ccc2
2017-07-12 13:25:11 +00:00
Johann 8152b0904d sad4d neon: 8x[4,8,16]
BUG=webm:1425

Change-Id: I7de2500cca4b621f21478c4b0333c56d76dbc9a4
2017-07-12 13:25:03 +00:00
Johann dd4347e9ec sad4d neon: 4x4, 4x8
BUG=webm:1425

Change-Id: I5081b5ce131821d590c53ac1206a94f50cb8b468
2017-07-12 03:38:03 +00:00
Urvang Joshi 1dee320446 Merge "Remove the token state array from greedy optimize_b." 2017-07-12 00:08:56 +00:00
James Zern df18412f32 remove vp9_firstpass.c w/CONFIG_REALTIME_ONLY
BUG=webm:1446

Change-Id: I6e0ea9342c715d354c641109737172afa649b85b
2017-07-11 13:10:16 -07:00
Urvang Joshi 5322a31b18 Remove the token state array from greedy optimize_b.
Reduces memory usage, and speeds up encoding for some difficult clips.
No impact on output or metrics.

Ported from aomedia patch:
https://aomedia-review.googlesource.com/c/14501

Change-Id: I26ec69af8336f9e80da486a1cfbfc89a3596954d
2017-07-11 13:05:29 -07:00
James Bankoski 7d5afa227a Merge "Reintroduce fix for max qindex calculation of a gf interval" 2017-07-11 19:47:16 +00:00
Jerome Jiang 1a4d8f2033 Merge "vp9: Move skinmap computation into multithreading loop." 2017-07-11 19:44:22 +00:00
Jim Bankoski 689ad89e86 Reintroduce fix for max qindex calculation of a gf interval
This reintroduces the fix:
  https://chromium-review.googlesource.com/c/422807/
which was later reverted here:
  https://chromium-review.googlesource.com/c/447843/

BUG=webm:1355

This time it is behind a compile-time flag:

configure --disable-always_adjust_bpm
configure --enable-always_adjust_bpm

This should make side by side testing easier and let users of the
lib pick which way they want to go.

Change-Id: I7d7b37b83015dc001810af84c132cbc1e71ba8d6
2017-07-11 18:40:26 +00:00
Marco 3818a3723b vp9: Fix to SVC and denoising for fixed pattern case.
For fixed pattern SVC: keep track of denoised last_frame buffer
for base temporal layer, and if alt_ref is updated on middle/upper
temporal layers, force an update to denoised last_frame buffer.
This allows for improved denoising on top temporal layers.

Change-Id: Icbd08566027d4d2eabc024d3b7a0d959d2f8c18b
2017-07-11 11:27:04 -07:00
Jerome Jiang 3d6b0cb825 vp9: Move skinmap computation into multithreading loop.
Change-Id: Iebc9dd293d8b1449c0674c0295349297e9b90646
2017-07-10 17:18:15 -07:00
Johann 66a96fd3de avg_neon: fix 4x4, update 8x8
4x4 was failing with a bus error. Most likely due to clang alignment
hints on 32bit loads.

Change-Id: Ib191ce0e6239fc55d85f10e4dbe15876e5052edb
2017-07-10 15:29:34 -07:00
Johann 87610ac45e neon: consolidate horizontal adds
Change-Id: Iaf9e88ff636ccf8f0ef310869c6827f3f205cca8
2017-07-10 15:29:13 -07:00
Johann Koenig 4b78c6e6f7 Merge "remove vp9_full_sad_search" 2017-07-10 20:42:40 +00:00
Jerome Jiang 125a532b34 Merge "vp9: Remove alt-ref from denoiser." 2017-07-10 20:03:51 +00:00
Johann 109faffe9b remove vp9_full_sad_search
This code is unused in vp9. Only vp8 still contains references to
vpx_sad_NxMx[3|8] and only for sizes 16x16, 16x8, 8x16, 8x8 and 4x4.

Remove the remaining sizes and all the highbitdepth versions.

BUG=webm:1425

Change-Id: If6a253977c8e0c04599e25cbeb45f71a94f563e8
2017-07-10 11:20:35 -07:00
Jerome Jiang 2ac7c549e9 vp9: Remove alt-ref from denoiser.
Denoiser is used in real-time mode which does not use alt-ref.
Reduce memory usage when denoiser is enabled.

Change-Id: I54ba3bcaeeb1818bbdf718ef90e97d4897ff793d
2017-07-10 10:56:03 -07:00
Johann Koenig 4e16f70703 Merge changes Id84d9780,Iaa6ea75b,I3362e0dd,I0020a49e,Ia42e4f36, ...
* changes:
  sad neon: avg for 64x[32,64]
  sad neon: macroize 64xN definitions
  sad neon: avg for 32x[16,32,64]
  sad neon: macroize 32xN definitions
  sad neon: avg for 16x[8,16,32]
  sad neon: macroize 16xN definitions
2017-07-07 21:01:23 +00:00
James Zern 5d6060b62f Merge "cosmetics,vp9/: normalize inv/fwd_txfm naming" 2017-07-07 19:15:02 +00:00
Johann Koenig 6c375b9cd0 Merge "fdct neon: 32x32_rd" 2017-07-07 14:05:51 +00:00
Johann e4e08556db sad neon: avg for 64x[32,64]
BUG=webm:1425

Change-Id: Id84d97807a6a0fbcc889c4dfe11929d54f85493d
2017-07-07 07:04:04 -07:00
Johann 6ae8f8dbe8 sad neon: macroize 64xN definitions
Change-Id: Iaa6ea75b10e75784f31b1e08637eecf0dcb5cff9
2017-07-07 07:04:04 -07:00
Johann 67cffc1ef6 sad neon: avg for 32x[16,32,64]
BUG=webm:1425

Change-Id: I3362e0dded3b46ca032caa7f44db42f324bc596d
2017-07-07 07:04:04 -07:00
Johann b0d15713be sad neon: macroize 32xN definitions
Change-Id: I0020a49e77d27514375a03095d5821dc0aa7d128
2017-07-07 07:04:04 -07:00
Johann 527e0c9b1c sad neon: avg for 16x[8,16,32]
BUG=webm:1425

Change-Id: Ia42e4f36547c5fe12114fb58379e34bce82eb2f2
2017-07-07 07:04:04 -07:00
Johann 3c18acf452 sad neon: macroize 16xN definitions
Change-Id: I5aea6ffbfa48eb1970afe3be54f0bba275d7fa58
2017-07-07 07:04:04 -07:00
Johann Koenig 9b253f9f0a Merge changes I7b36a57e,If2ab51e3,Ifc685a96
* changes:
  sad neon: macroize 8xN definitions
  sad neon: avg for 8x[4,8,16]
  sad neon: avg for 4x4 and 4x8
2017-07-07 14:03:13 +00:00
Marco Paniconi 2075af4b16 Merge "vp9: Nonrd mode: use content_state_sb for high motion." 2017-07-07 03:00:59 +00:00
James Zern 80b83c73ba cosmetics,vp9/: normalize inv/fwd_txfm naming
+ vpx_dsp/, test/

itxfm -> inv_txfm, ftxfm -> fwd_txfm

Change-Id: I3aacdb65143576d64cfe5c9b14dd358c17c1fe7e
2017-07-06 18:35:44 -07:00
James Zern 777ca80f0a Merge changes from topic 'rm-dec-frame-parallel'
* changes:
  vp9: remove FrameWorkerData & vp9_dthread.h
  vp9: remove (un)lock_buffer_pool
2017-07-06 23:31:30 +00:00
James Zern a15c6a7ebf vp8cx,cosmetics: correct VP9_SET_TILE_COLUMNS docs
this has been set to max since:
f5c36a5ce VP9: turn on tile-columns and frame-parallel-mode by default
~v1.4.0

Change-Id: Ic796fc05abe73a58700ec50e3f8e72d3462898ec
2017-07-06 16:24:35 -07:00
Marco 8c3f18efa1 vp9: Nonrd mode: use content_state_sb for high motion.
If the content_state for a superblock is set to HighSad,
use that to bias some decisions in variance partition and
nonrd pickmode: use int_pro_motion for sad computation in
choose_partitioning, and set large_block in pickmode based
on the content_state_sb.

Only affects speed >= 7.

Improvement for high motion content.
Small gain (~1%) in RTC metrics.
Speedup of ~5 for high motion clip on android (speed 8, 1 thread).

Change-Id: I5774c4854f012b89c8e969f6129b60988c2ce11c
2017-07-06 15:05:19 -07:00
James Zern 26a9a4cd64 vp8cx,cosmetics: correct VP9_SET_FRAME_PARALLEL_DECODING docs
this has been on by default since:
f5c36a5ce VP9: turn on tile-columns and frame-parallel-mode by default
~v1.4.0

Change-Id: I52017ab0157feaf429dce3d9e1af8a53bb5c1b65
2017-07-06 10:40:18 -07:00
Johann d6423b3166 sad neon: macroize 8xN definitions
Change-Id: I7b36a57e893c1795a37ba7994995bec7ff021409
2017-07-06 07:51:59 -07:00
Johann 63bdc574e5 sad neon: avg for 8x[4,8,16]
BUG=webm:1425

Change-Id: If2ab51e3050e078b0011b174efe41fcb65a15f44
2017-07-06 07:43:09 -07:00
Johann 6bac3f80ee sad neon: avg for 4x4 and 4x8
BUG=webm:1425

Change-Id: Ifc685a96cb34f7fd9243b4c674027480564b84fb
2017-07-06 07:12:47 -07:00
Johann 75b00592c7 fdct neon: 32x32_rd
About 40% faster than the non-rd version.

BUG=webm:1424

Change-Id: Ia99d14eb9532302eeaab8cd3e503395b0374b5a2
2017-07-06 06:30:50 -07:00
James Zern 5227b8200b vp9: remove FrameWorkerData & vp9_dthread.h
the file was empty after the struct removal. the only remaining use was
within vp9_dx_iface, but the wrapper became unnecessary after the
removal of frame_parallel_decode.

BUG=webm:1395

Change-Id: I515ab585d701e77d388d12b2802d844c424f9bcd
2017-07-05 22:32:00 -07:00
James Zern 48c4a038eb vp9: remove (un)lock_buffer_pool
there is no threaded access to this pool after the removal of
frame_parallel_decode

BUG=webm:1395

Change-Id: I710769b87102edc898c59eb9a2e7a91d8c49107f
2017-07-05 21:07:00 -07:00
James Zern af3cab7b24 Merge changes from topic 'rm-dec-frame-parallel'
* changes:
  vp9_onyxc_int,RefCntBuffer: rm unused members
  remove vp9_dthread.c
  vp9: reduce FRAME_BUFFERS by 3
2017-07-06 04:06:30 +00:00
James Zern 4ffd8350be Merge changes from topic 'rm-dec-frame-parallel'
* changes:
  VP9_COMMON: rm frame_parallel_decode
  VP9Decoder: rm frame_parallel_decode
  vp9_dx: rm worker thread creation
2017-07-05 23:53:22 +00:00
James Zern 0d245d42c4 Merge "test_vector_test,vp8: correct thread range" 2017-07-05 22:33:51 +00:00
Yaowu Xu f2b1dc529f Merge "Further refactoring of mod error calculation." 2017-07-05 21:43:50 +00:00
Yaowu Xu e3cafbc8df Merge "Fix incorrect index test in GF group rate assignment." 2017-07-05 21:43:43 +00:00
James Zern 24d0391efb Merge "googletest: suppress unsigned overflow in the LCG" 2017-07-05 21:19:44 +00:00
Johann Koenig 9a05f9771a Merge "test/buffer.h: move range checking to compiler" 2017-07-05 21:15:13 +00:00
James Zern a22bb9809e Merge "dct_partial_test: cover vpx_fdct8x8_1_msa in hbd" 2017-07-05 21:08:46 +00:00
Hui Su 3e08a88854 Merge "level tests: allow level undershoot" 2017-07-05 20:47:20 +00:00
James Zern 23d60be414 dct_partial_test: cover vpx_fdct8x8_1_msa in hbd
this was enabled in:
5ac88162b partial fdct test

Change-Id: Ibae2031ec1308fe3a3b84a1ce6e7bacda3a7cb82
2017-07-05 13:01:41 -07:00
James Zern a6531cbc54 Merge changes from topic 'missing-proto'
* changes:
  fwd_txfm_msa.c: add missing vpx_dsp_rtcd.h
  vpx_convolve_*_msa.c: add missing vpx_dsp_rtcd.h
  loopfilter_*_msa.c: add missing vpx_dsp_rtcd.h
2017-07-05 20:00:25 +00:00
Johann Koenig b6321025cd Merge "partial fdct neon: maintain neon registers" 2017-07-05 19:12:38 +00:00
Johann da2ad47d66 test/buffer.h: move range checking to compiler
Pass low/high values as type T. Out of range values should be caught by
static analysis instead.

Change-Id: I0a3ee8820af05f4c791ab097626174e2206fa6d5
2017-07-05 11:21:18 -07:00
paulwilkins 5b44ef0c50 Respond more rapidly to excessive local overshoot.
This patch attempts to address a bug reported for 4K video.
https://b.corp.google.com/issues/62215394

In this instance a perfect storm of a moderate complexity section
followed by a much easier section where a CGI overlay helped to
suppress film grain noise, followed by a much harder and very grainy
section at the end, caused a massive local rate spike that pushed a chunk
over the upper allowed rate limit.

This patch detects cases where the rate for a frame is much higher than
expected and allows, in this special case, for rapid adjustment of the active
Q range.

For the example chunk in the bug report the target rate was 18Mb/s and the
observed rate was over 37 Mb/s with a surge for the last few frames to over
100Mb/s. This patch brings the overall chunk rate right back down to ~18.2 Mbit/s
and almost completely eliminates the rate spike at the end. (See graphs appended
to bug report)

Also see I108da7ca42f3bc95c5825dd33c9d84583227dac1 which fixes a bug
unearthed during testing of this patch and also has a bearing on high rate
encodes such as 4K.

This patch does have a negative impact on some metrics. Most notably there are
clips in our standard test set where it hurts global psnr (though in many cases it
conversely helps SSIM, FAST SSIM and PSNR-HVS). It is also worth noting that
the clips (and data rates) where there is a big metric impact, are almost all cases
where there is currently a significant overshoot vs the target rate and overall rate
accuracy is greatly improved.

Change-Id: I692311a709ccdb6003e705103de9d05b59bf840a
2017-07-05 16:51:52 +01:00
paulwilkins a1af335f44 Further refactoring of mod error calculation.
Further refactoring to support alternative error distributions.

Change-Id: I0f7fa3fd6f3baa4b0a1e53c6aa3be63966e97b82
2017-07-05 16:49:37 +01:00
paulwilkins b0459ec8ea Fix incorrect index test in GF group rate assignment.
Correct test for middle frame in the group.

Change-Id: I1ee49fa33968eb3c4a01d6a27a60bb1409e3e68c
2017-07-05 16:45:36 +01:00
James Zern 7d526c1654 Merge "buffer.h: incorrect RandRange results" 2017-07-02 03:48:53 +00:00
Johann 6cb3178192 buffer.h: incorrect RandRange results
'low' was promoted to unsigned, triggering a ubsan warning

Change-Id: Id49340079d39c105da93cf13e96cf852a93a94ba
2017-07-01 20:01:22 -07:00
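The promotion described in this commit follows from C's usual arithmetic conversions: when a signed lower bound is combined with an unsigned random value, the bound is converted to unsigned first. A minimal sketch of one way to avoid that, assuming a hypothetical helper `map_to_range` (this is not the actual test/buffer.h code):

```c
#include <stdint.h>

/* Hypothetical helper for illustration only. Assumes low <= high and
 * that the range [low, high] holds fewer than 2^32 values. */
static int map_to_range(uint32_t raw, int low, int high) {
  const uint32_t span = (uint32_t)((int64_t)high - (int64_t)low + 1);
  /* Widen to int64_t before adding so a negative 'low' is never
   * converted to unsigned, as it would be in 'low + raw % span'. */
  return (int)((int64_t)low + (int64_t)(raw % span));
}
```

With `low = -5` and `high = 5`, every result stays inside the signed range regardless of the raw value.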
James Zern fb135ff050 Merge changes I4ed1312f,Id2673eec
* changes:
  ppc: Add vpx_idct8x8_64_add_vsx
  ppc: Add vpx_idct4x4_16_add_vsx
2017-07-02 02:38:39 +00:00
Alexandra Hájková c757d6dde4 ppc: Add vpx_idct8x8_64_add_vsx
Change-Id: I4ed1312f365509e0595dcc09890ecb050f6f2069
2017-07-01 12:55:47 -07:00
Alexandra Hájková d8c277030c ppc: Add vpx_idct4x4_16_add_vsx
Change-Id: Id2673eece32027fb245919c7a5c81994a4a19fd8
2017-07-01 12:32:18 -07:00
Alex Converse f7645138d4 googletest: suppress unsigned overflow in the LCG
Local application of:
https://github.com/google/googletest/pull/1066

Suppress unsigned overflow instrumentation in the LCG

The rest of the (covered) codebase is already integer overflow clean.

TESTED=gtest_shuffle_test goes from fail to pass with -fsanitize=integer

Change-Id: I8a6db02a7c274160adb08b7dfd528b87b5b53050
2017-07-01 12:24:32 -07:00
James Zern 3dd993e4be highbd_idct8x8_add_sse4: make << of neg. val a multiply
left shifting a negative value is undefined; quiets a ubsan warning.
this is applied to a constant, no change in the generated code.

Change-Id: Ia17a7672d4832463decbc4afd6cd42974d02698e
2017-07-01 11:56:56 -07:00
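As a rough illustration of the change described here, a left shift of a negative value can be rewritten as a multiplication by a power of two, which is well defined as long as the product fits. The helper name `scale_pow2` is hypothetical and not taken from the libvpx sources:

```c
/* Hypothetical helper: scale 'value' by 2^shift without left-shifting a
 * negative operand. Assumes 0 <= shift < 31 and that the product fits
 * in an int. */
static int scale_pow2(int value, int shift) {
  /* (1 << shift) shifts a positive constant, which is defined;
   * 'value << shift' would be undefined for negative 'value'. */
  return value * (1 << shift);
}
```

For a constant shift, compilers typically emit the same instruction for both forms, which matches the note that the generated code does not change.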
Matt Oliver 40cd25649f project: Update appveyor to build using latest available windows sdk. 2017-07-02 04:49:09 +10:00
Johann 3ae458f2f3 partial fdct neon: maintain neon registers
Finish the calculations in neon registers. This avoids a potentially
expensive move from neon to gp and allows at least clang to store
directly to memory.

BUG=webm:1424

Change-Id: Idef25eec95f7610947167818e9194bde8b00d282
2017-07-01 09:29:38 -07:00
James Zern a876d04072 fwd_txfm_msa.c: add missing vpx_dsp_rtcd.h
+ only expose compatible functions in high-bitdepth build

quiets -Wmissing-prototypes warnings

Change-Id: I8ef7db08a34c5c54b5cde6e732c0d70f4287c89a
2017-06-30 18:53:30 -07:00
James Zern 8710c6d884 vpx_convolve_*_msa.c: add missing vpx_dsp_rtcd.h
quiets -Wmissing-prototypes warnings

Change-Id: I1ab5b8ae4a62f54e0f9eb3fc81371c9b99972c30
2017-06-30 18:50:56 -07:00
James Zern 329dabf57e loopfilter_*_msa.c: add missing vpx_dsp_rtcd.h
+ make some functions static

quiets -Wmissing-prototypes warnings

Change-Id: I2130e06142e71a004a1eb30e173feba4f6fe68a0
2017-06-30 18:50:52 -07:00
James Zern 27e37e1a8a fwd_txfm_msa.c: correct vpx_fdct8x8_1_msa prototype
this makes the function compatible with high-bitdepth and fixes test
failures since:
5ac88162b partial fdct test

Change-Id: Ib630694608237f0c515948942e05dbea259ba338
2017-06-30 18:50:47 -07:00
James Zern af3ab45867 test_vector_test,vp8: correct thread range
testing::Range does not include the end parameter in the set of values.
also adjust the start to 2 as the single threaded case is already
covered in another instantiation

Change-Id: Iae3bf3ed4363dd434eccfa5ad4e3c5e553fbee60
2017-06-30 16:21:06 -07:00
James Zern 5a8e4110c7 Merge "gen_msvs_sln: fix solution version for 2015/17" 2017-06-30 22:05:32 +00:00
James Zern 37e03b1d13 Merge "cosmetics,vp9/encoder: s/txm/txfm/" 2017-06-30 21:57:16 +00:00
Jerome Jiang f781cc8a46 Merge "vp9: Adjust condition for checking intra mode." 2017-06-30 21:55:00 +00:00
Marco 2290898ac7 vp9: Adjust condition for checking intra mode.
For nonrd_pickmode: add condition for checking
intra mode if the sb content state is VeryHighSad.

Reduces artifacts on sudden changes in content.

Metrics on RTC/RTC_derf neutral (small gain).
No speed loss observed.

Change-Id: I07006d28fd2dc06c1d06b07630102b0fece50c40
2017-06-30 14:52:00 -07:00
Linfeng Zhang 1e3a93e72e Merge changes I5d038b4f,I9d00d1dd,I0722841d,I1f640db7
* changes:
  Add vpx_highbd_idct8x8_{12, 64}_add_sse4_1
  sse2: Add transpose_32bit_4x4x2() and update transpose_32bit_4x4()
  Refactor highbd idct 4x4 sse4.1 code and add highbd_inv_txfm_sse4.h
  Refactor vpx_idct8x8_12_add_ssse3() and add inv_txfm_ssse3.h
2017-06-30 20:49:19 +00:00
James Zern 70b7713059 gen_msvs_sln: fix solution version for 2015/17
these are rewritten to 12; 15 causes the open to fail under vs2017

Change-Id: I9c3fd38b632180fa10f1713d4a5d9d15aefd8569
2017-06-30 12:36:46 -07:00
James Zern 303cb3106b vp9_onyxc_int,RefCntBuffer: rm unused members
the last frame_worker_owner, row and col references were removed in:
131bd06e6 remove vp9_dthread.c

BUG=webm:1395

Change-Id: Ia7fb2e8782b12a58d2a2263849d20a8abf06aef6
2017-06-30 12:03:07 -07:00
James Zern bc837b223b VP9_COMMON: rm frame_parallel_decode
this has been 0 since the removal of frame_parallel_decode in
vp9_dx_iface.

BUG=webm:1395

Change-Id: I3a562b2c6b82050064d2b2ccb18a3e77c700b2da
2017-06-30 12:03:07 -07:00
James Zern 78fe5ca360 remove vp9_dthread.c
and the related prototypes in vp9_dthread.h. the last references were
removed in:
09dabc58d VP9_COMMON: rm frame_parallel_decode

vp9_dx_iface.c still uses FrameWorkerData

BUG=webm:1395

Change-Id: Ica8e98ae776fc0105f1fbbed9e0a729808980810
2017-06-30 12:03:07 -07:00
James Zern ba76b662af VP9Decoder: rm frame_parallel_decode
this has been 0 since the removal of frame_parallel_decode in
vp9_dx_iface.

BUG=webm:1395

Change-Id: I3f579766ecfa4777395b99686738e1c5610f86ef
2017-06-30 12:03:07 -07:00
James Zern fb134e759a vp9: reduce FRAME_BUFFERS by 3
the additional buffers are unneeded with the removal of
frame_parallel_decode

BUG=webm:1395

Change-Id: Id9ec4cb6462af5d07a0d3cf939bd216db27d9d9e
2017-06-30 12:03:07 -07:00
James Zern 86d51dfbf5 vp9_dx: rm worker thread creation
creating a thread associated with the sole worker isn't necessary when
only execute() is being used after the removal of frame_parallel_decode.

BUG=webm:1395

Change-Id: I2255ce72607321e5708bc82a632dc6825d4eff5c
2017-06-30 12:03:07 -07:00
James Zern 469986f963 Merge changes from topic 'rm-dec-frame-parallel'
* changes:
  vp9_dx,vpx_codec_alg_priv: rm *worker_id*
  vp9_dx,vpx_codec_alg_priv: rm *cache*
  vp9_dx,vpx_codec_alg_priv: rm frame_parallel_decode
2017-06-30 19:02:05 +00:00
Johann c2044fda1d buffer.h: use stride_ instead of stride()
Change-Id: Ib51231349bf0ff3e23672762dc7bfa49b5fe4083
2017-06-30 07:37:20 -07:00
Johann ce5b17f9ad testing: ranges for random values
Add a method to acm_random.h to generate ranges of values

Add a way to call that method to buffer.h

Adjust dct_[partial_]test.cc to use it.

Change-Id: I8c23ae9d27612c28f050b0e44c41cb4ad2494086
2017-06-30 07:25:30 -07:00
Johann Koenig 89d3dc043e Merge changes Id5beb35d,I2945fe54,Ib0f3cfd6,I78a2eba8
* changes:
  partial fdct neon: add 32x32_1
  partial fdct neon: add 16x16_1
  partial fdct neon: add 4x4_1
  partial fdct neon: move 8x8_1 and enable hbd tests
2017-06-30 01:00:07 +00:00
Linfeng Zhang c338f3635e Add vpx_highbd_idct8x8_{12, 64}_add_sse4_1
BUG=webm:1412

Change-Id: I5d038b4fa842ce2f6b9bd5c8c44c70647bda9591
2017-06-29 17:19:34 -07:00
Linfeng Zhang ee5cb8d87f sse2: Add transpose_32bit_4x4x2() and update transpose_32bit_4x4()
BUG=webm:1412

Change-Id: I9d00d1ddbd724fd5f825fd974c4cf46a9bca6cb3
2017-06-29 17:18:01 -07:00
Linfeng Zhang 0fa59a4baf Refactor highbd idct 4x4 sse4.1 code and add highbd_inv_txfm_sse4.h
Also clean highbd_inv_txfm_sse2.h

BUG=webm:1412

Change-Id: I0722841d824ce602874019bd9779b10d49d10c0b
2017-06-29 17:17:43 -07:00
Linfeng Zhang 9ac78ae35f Refactor vpx_idct8x8_12_add_ssse3() and add inv_txfm_ssse3.h
BUG=webm:1412

Change-Id: I1f640db71ad4c644b7521305a781f2218eb1ba9d
2017-06-29 17:13:28 -07:00
James Zern e3c8f2f152 vp9_dx,vpx_codec_alg_priv: rm *worker_id*
+ available_threads

these are unused with the removal of frame_parallel_decode

BUG=webm:1395

Change-Id: I59c5075542a5a74d4a539c213682f566b005f5a6
2017-06-29 16:21:50 -07:00
James Zern dcdb013b55 vp9_dx,vpx_codec_alg_priv: rm *cache*
these fields are unused with the removal of frame_parallel_decode

BUG=webm:1395

Change-Id: Ia3821f7fb81d17b20033b094e5265b1030ee4030
2017-06-29 16:21:50 -07:00
James Zern ab76f1f224 vp9_dx,vpx_codec_alg_priv: rm frame_parallel_decode
this field has been 0 since:
01d23109a vp9: make VPX_CODEC_USE_FRAME_THREADING a no-op

BUG=webm:1395

Change-Id: I15448e9401e15329b54c6878dda033b17be5ec6b
2017-06-29 16:21:50 -07:00
James Zern 67d7a6df2d Merge changes from topic 'rm-dec-frame-parallel'
* changes:
  rm vp9_frame_parallel_test.cc
  test_vector_test: rm ref to VPX_CODEC_USE_FRAME_THREADING
2017-06-29 23:21:18 +00:00
James Zern e5bdab98e9 rm vp9_frame_parallel_test.cc
VPX_CODEC_USE_FRAME_THREADING was made a no-op in:
01d23109a vp9: make VPX_CODEC_USE_FRAME_THREADING a no-op

and the tests in this file have been disabled since:
6ab0870d4 disable VP9MultiThreadedFrameParallel tests

BUG=webm:1395

Change-Id: I2c7a250acb65cf9522cf8a7bb724bb92070e41c6
2017-06-29 15:15:56 -07:00
James Zern 508ef2a6e3 test_vector_test: rm ref to VPX_CODEC_USE_FRAME_THREADING
this was made a no-op in:
01d23109a vp9: make VPX_CODEC_USE_FRAME_THREADING a no-op

and the test hitting this branch has been disabled since:
6ab0870d4 disable VP9MultiThreadedFrameParallel tests

rename the test to VP9MultiThreaded to exercise the tile-based threading

BUG=webm:1395

Change-Id: I35564a75eb5a7d7f7ccb923133b1b07295201f4c
2017-06-29 15:15:48 -07:00
James Zern 8d1bda93f4 cosmetics,vp9/encoder: s/txm/txfm/
txfm is more commonly used as an abbreviation through the codebase

Change-Id: I86fd90ef132468f9da270091c05daa1f5a49ece2
2017-06-29 15:08:47 -07:00
James Zern bd77931421 dct_partial_test,fwd_txfm: change << to *
left shift of a negative number is undefined in C; quiets a ubsan
warning

Change-Id: Ib1624ad5326ac8e0eead9348468ef7fe5d4df9a4
2017-06-29 14:42:03 -07:00
Jerome Jiang 74dc640565 vp9: Remove avg2x2 in skin detection and clean up.
Change-Id: I6510e36866138f8ac4cb82c207e58e0b9522e499
2017-06-29 03:06:23 +00:00
Johann 9fe510c12a partial fdct neon: add 32x32_1
Always return an int32_t. Since it needs to be moved to a register for
shifting, this doesn't really penalize the smaller transforms.

The values could potentially be summed and shifted in place.

BUG=webm:1424

Change-Id: Id5beb35d79c7574ebd99285fc4182788cf2bb972
2017-06-28 15:37:44 -07:00
Johann f310ddc470 partial fdct neon: add 16x16_1
For the 8x8_1, the highbd output fit nicely in the existing function. 12
bit input will overflow this implementation of 16x16_1.

BUG=webm:1424

Change-Id: I2945fe5478b18f996f1a5de80110fa30f3f4e7ec
2017-06-28 15:37:44 -07:00
Johann 4959dd3eb3 partial fdct neon: add 4x4_1
BUG=webm:1424

Change-Id: Ib0f3cfd6116fc1f5a99acb8bfd76e25b90177ffc
2017-06-28 15:37:44 -07:00
Johann cf75ab6ccd partial fdct neon: move 8x8_1 and enable hbd tests
The function was originally written with HBD in mind. Enable it and
configure the tests.

BUG=webm:1424

Change-Id: I78a2eba8d4d9d59db98a344ba0840d4a60ebe9a1
2017-06-28 15:37:43 -07:00
Johann Koenig 81e25512c3 Merge changes Ib454762d,I966650df,Ie126553e,I068f06c6,Icb72a94e
* changes:
  sad neon: rewrite 64x64 and add 64x32
  sad neon: rewrite 32x32, add 32x16 and 32x64
  sad neon: rewrite 16x8, 16x16, add 16x32
  sad neon: rewrite 8x8 and 8x16
  sad neon: rewrite 4x4 and add 4x8
2017-06-28 22:37:00 +00:00
Johann Koenig d91af5f905 Merge "buffer.h: Only allow Init() to be called once." 2017-06-28 22:36:05 +00:00
Johann Koenig 35f8515c3f Merge "partial fdct test" 2017-06-28 22:34:53 +00:00
Johann 5ac88162b9 partial fdct test
Test the _1 variant of the fdct, which simply sums the block and applies
a modifying shift based on the block size.

BUG=webm:1424

Change-Id: Ic80d6008abba0c596b575fa0484d5b5855321468
2017-06-28 20:32:20 +00:00
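The "sum the block, then shift" behaviour described above can be sketched roughly as follows. The function names are hypothetical, and the per-size scale is left as a parameter because the commit message does not state the actual values:

```c
#include <stdint.h>

/* Hypothetical helper: sum every sample of a size x size block. */
static int32_t block_sum(const int16_t *input, int stride, int size) {
  int32_t sum = 0;
  int r, c;
  for (r = 0; r < size; ++r) {
    for (c = 0; c < size; ++c) sum += input[r * stride + c];
  }
  return sum;
}

/* Hypothetical helper: apply a power-of-two scale to the sum.
 * log2_scale >= 0 multiplies by 2^log2_scale; a negative value divides
 * by 2^(-log2_scale). Division truncates toward zero, so this is only
 * an approximation of an arithmetic right shift for negative sums. */
static int32_t scale_dc(int32_t sum, int log2_scale) {
  if (log2_scale >= 0) return sum * (1 << log2_scale);
  return sum / (1 << -log2_scale);
}
```

A test for such a function only needs to compare the single DC output against this reference, since the other coefficients are not computed.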
Johann ad011aaab8 sad neon: rewrite 64x64 and add 64x32
BUG=webm:1425

Change-Id: Ib454762d1c61b05a98324fe81ad58c9e09784717
2017-06-28 12:21:34 -07:00
Johann 77a648885c sad neon: rewrite 32x32, add 32x16 and 32x64
BUG=webm:1425

Change-Id: I966650df7e3face93e1e771634d1cc5458a35f85
2017-06-28 12:20:27 -07:00
Johann 469643757f sad neon: rewrite 16x8, 16x16, add 16x32
BUG=webm:1425

Change-Id: Ie126553e5fffcdfaf3d82a85b368ac10ce9ab082
2017-06-28 12:16:00 -07:00
Johann e40e78be24 sad neon: rewrite 8x8 and 8x16
BUG=webm:1425

Change-Id: I068f06c67b841f09ea07c04ada0c2f1706102138
2017-06-28 12:15:57 -07:00
Johann 46d8660ce3 sad neon: rewrite 4x4 and add 4x8
The previous implementation loaded 8 values (discarding half)

BUG=webm:1425

Change-Id: Icb72a94e2557a4ee2db7091266ab58fd92f72158
2017-06-28 11:14:59 -07:00
Jerome Jiang 8582d33a0d Merge "vp9: compute skinmap only once before encoding." 2017-06-28 18:01:46 +00:00
Johann e0330c4810 buffer.h: Only allow Init() to be called once.
Change-Id: I041c8b6f314802833c5287a176dbfeec9461b08e
2017-06-28 10:59:39 -07:00
Marco 88d11f473c vp9: Speed >= 8: Remove logic on reducing subpel.
Existing logic was only affecting resolutions above 720p.
Needs more testing for reducing subpel for speed >= 8.

No change on RTC metrics.

Change-Id: I2f4bf9f25891614aafa9a86aa5a5063a3ccfce4d
2017-06-27 20:27:02 -07:00
James Zern 515fed8f38 Merge "highbd_quantize_fp_32x32: normalize abs_qcoeff type" 2017-06-27 23:30:17 +00:00
Jerome Jiang a220b931f5 vp9: compute skinmap only once before encoding.
This could save some cycles since skin detection is used in multiple
places in vp9.

1~2% speed up on ARM.

Change-Id: I86b731945f85215bbb0976021cd0f2040ff2687c
2017-06-27 16:16:02 -07:00
hui su d4595de5db level tests: allow level undershoot
Obtaining a level that is lower than the target should be tolerated.

Change-Id: I90a55ee6d7142e9f6cc525ebbd1e0501defcbe28
2017-06-26 15:17:04 -07:00
Linfeng Zhang 0bb31a46a4 Update vpx_idct8x8_12_add_ssse3()
Change-Id: I0f38801c391db87ddae168602a786a062cd34b1d
2017-06-26 14:57:41 -07:00
Linfeng Zhang 39972d999d Merge "Update load_input_data() in x86" 2017-06-26 21:48:50 +00:00
Jerome Jiang ddae8f7632 Merge "vp8: Clean up skinmap debugging codes." 2017-06-26 21:32:52 +00:00
Linfeng Zhang a76b6b232c Update load_input_data() in x86
Split to load_input_data4() and load_input_data8().
Use pack with signed saturation instruction for high bitdepth.

Change-Id: Icda3e0129a6fdb4a51d1cafbdc652ae3a65f4e06
2017-06-26 13:38:33 -07:00
James Zern f749905d0a roll libwebm snapshot
git log --no-merges --oneline 9732ae9..a97c484
9096786 mkvparser: fix float conversion warning
84e8257 disable -Wdeprecated-declarations in legacy code
a98f495 AddGenericFrame: fix memory leak on failure
da131dd AddCuePoint: fix memory leak on failure
b0cea9c Add(Audio|Video)Track: fix memory leak on failure
5261a67 webm_info: check vp9 ParseUncompressedHeader return
85f7e2e webm_info,PrintVP9Info: validate alt ref sizes
9b97ca1 vp9_header_parser_tests: check parser return
300d6d8 CuePoint::Find: check Track pointer
50c44bb webm_info,OutputCues: fix indexing of tracks
a0d27f0 mkvparser,Block::Parse: remove incorrect assert
784fc1b vttdemux,CloseFiles: check file pointer before closing
b4522c1 .gitattributes: force mkv/webm to be treated as binary
a118f3d Add test for projection parse failures.
d398479 Add test for primary chromaticity parse failures.
9bbec4c Fix permissions on test file.
2cef4d5 mkvparser:Parse: s/FLT_MIN/-FLT_MAX/
35a3c88 mkvmuxer: Turn off estimate_file_duration_ by default
5a41830 mkvparser: Avoid double free when Chromaticity parse fails.
67e3ffa mkvparser: Avoid casts of values too large for float in
Projection elements.
87bcddf vttdemux::ChapterAtomParser: check for NULL display string
a534a24 Update .gitignore
a0d67d0 mkvmuxer: Fix hard-coded data size in EbmlElementSize
c36112c mkvparser: #include sys/type.h
686664e Fix cmake generation warnings on Windows.
2b2c196 cmake: Fix required flag check.
166e40f Cmake refactor.
9fb774a Add missing include in webm2pes.cc.
4956b2d mkvmuxer: Force new clusters when audio queue gets too long.
54f1559 cmake: Cache results of CXX flag tests.
81c73fc mkvparser: Avoid alloc failures in SeekHead::Parse.

Change-Id: Ib81b1772ec81e7af3852dcfef2d312416f6db53d
2017-06-26 18:38:47 +00:00
Linfeng Zhang ec4afbf74a Merge "Add vpx_highbd_idct4x4_16_add_sse4_1()" 2017-06-24 01:15:14 +00:00
Urvang Joshi 30eeb4dc32 Merge "Enable greedy version of optimize_b() in VP9 by default." 2017-06-24 00:58:24 +00:00
Linfeng Zhang 6ff5de68dd Merge "Cosmetics, 8x8 idct SSE2 optimization" 2017-06-24 00:51:08 +00:00
Urvang Joshi 4bb99ee27e Enable greedy version of optimize_b() in VP9 by default.
Improvements were already mentioned in the previous patch:
https://chromium-review.googlesource.com/#/c/531675/

Change-Id: I4906ab1c61c25a815bdeb986016fad6dcb69eb71
2017-06-23 17:04:58 -07:00
James Zern ee1fcb0e69 Merge "variance_test: move Subpel* from tuples to TestParams" 2017-06-23 22:48:40 +00:00
Marco Paniconi 8ad23a072a Merge "vp9: Use scene detection for CBR mode." 2017-06-23 21:59:32 +00:00
Linfeng Zhang 8253a27904 Add vpx_highbd_idct4x4_16_add_sse4_1()
BUG=webm:1412

Change-Id: Ie33482409351a01be4e89466b0441834eb1e905a
2017-06-23 14:30:12 -07:00
Linfeng Zhang b8a4b5dd8d Cosmetics, 8x8 idct SSE2 optimization
Change-Id: Id21fa94fd323e36cd19a2d890bf4a0cafb7d964d
2017-06-23 14:30:12 -07:00
Jerome Jiang b0c4d87ac7 vp8: Clean up skinmap debugging codes.
Use the computed skinmap.

Change-Id: I8aabb5922ef5190ec85b9e01807cb79b4803b925
2017-06-23 14:05:33 -07:00
James Zern 0d1c782306 Merge "datarate_test: rename thread -> Thread in test name" 2017-06-23 20:00:51 +00:00
James Zern 54bcd98314 variance_test: move Subpel* from tuples to TestParams
this normalizes these tests with the regular variance ones both in
implementation and test list output

Change-Id: I387aea81456f94b8223b8fb2a28cab94bc1aa9d5
2017-06-23 12:54:18 -07:00
Marco 18805eee6c vp9: Use scene detection for CBR mode.
Use the scene detection for CBR mode, and use it to reset the
rate control if large source sad is detected and rate
correction factor/QP is at minimum state.

Avoids large frame sizes after big content change following
low content period.

Only affects CBR mode for 1 pass at speeds 5, 6, 7.
Change-Id: I56dd853478cd5849b32db776e9221e258998d874
2017-06-23 11:44:50 -07:00
Jerome Jiang e187b27438 Merge "vp8: Compute skinmap only once before encoding." 2017-06-23 17:17:02 +00:00
James Zern 88a302e743 Merge changes from topic 'missing-proto'
* changes:
  onyxd_int.h: add missing prototypes
  onyxd.h: add vp8dx_references_buffer prototype
  vp[89],vpx_dsp: add missing includes
  vp8,encodeframe.h: correct prototypes
  vp8: add temporal_filter.h
  add picklpf.h
  add ethreading.h
  vp8,bitstream.h: add missing prototypes
  vp8: remove vp8_fast_quantize_b_mmx
  vp8,loopfilter_filters: make some functions static
  vp9_ratectrl: make adjust_gf_boost_lag_one_pass_vbr static
  vp9_encodeframe: make scale_part_thresh_sumdiff static
  vp9_alt_ref_aq: correct vp9_alt_ref_aq_create proto
  tiny_ssim: make some functions static
2017-06-23 05:44:24 +00:00
James Zern a377f99088 onyxd_int.h: add missing prototypes
vp8cx_init_de_quantizer, vp8_mb_init_dequantizer
quiets -Wmissing-prototypes

Change-Id: Ib63d14caf0144eff31a75b7cdb667b7e1f9d83ae
2017-06-22 20:20:23 -07:00
Johann Koenig 794a5ad713 Merge "fdct32x32 neon implementation" 2017-06-23 01:58:00 +00:00
Jerome Jiang b28fc490aa vp8: Compute skinmap only once before encoding.
Get ready for other uses (i.e. cyclic refresh).
Then use it when needed.

Change-Id: Id0519a9921045e5fb7f3badb54e9f04e903f28f9
2017-06-22 17:12:15 -07:00
Marco Paniconi 4f917912b9 Merge "vp9: Add high source sad to content state." 2017-06-22 22:18:48 +00:00
Linfeng Zhang c5f9de573f Merge changes I783c5f4f,I365f8e53,I5dac0e98
* changes:
  Clean vpx_idct16x16_256_add_sse2()
  Update vpx_idct{8x8,16x16,32x32}_1_add_sse2()
  Clean 32x32 full idct sse2 and ssse3 code
2017-06-22 21:42:23 +00:00
Paul Wilkins 92145006c3 Merge "Fix int overflow in rate control for high bit rates." 2017-06-22 16:30:40 +00:00
Johann e67660cf37 fdct32x32 neon implementation
Almost 3x faster in constrained loop testing. Over 10x faster in HBD
builds.

BUG=webm:1424

Change-Id: I2b7f8453e1d4ada63cde729d8115d684c4a71ff9
2017-06-22 06:40:17 -07:00
paulwilkins efe1982e63 Fix int overflow in rate control for high bit rates.
Fix misplaced cast that caused an overflow and incorrect rate adaptation
behavior for high data rates. This in particular will have affected 4k encodes
but could also have come into play for some higher rate 1080p cases.

In our standard test sets the quality impact is small though several high rate
clips show improved rate accuracy. This can also impact the number of recode
loop hits and on one problem 4k clip the encode time for speeds 0 and 1 was
reduced by >25%

Change-Id: I108da7ca42f3bc95c5825dd33c9d84583227dac1
2017-06-22 10:34:21 +01:00
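The effect of a misplaced cast can be shown with a small sketch. The names below are hypothetical and do not come from the rate control code; the point is only where the cast sits relative to the multiplication:

```c
#include <stdint.h>

/* Hypothetical helper: multiply a bit count by a factor without
 * overflowing 32-bit int arithmetic. */
static int64_t bits_times_factor(int bits, int factor) {
  /* Casting one operand first makes the multiply happen in 64 bits.
   * Writing (int64_t)(bits * factor) instead would multiply in int and
   * overflow before the cast is applied. */
  return (int64_t)bits * factor;
}
```

At high data rates, such as the 4k case mentioned above, the intermediate product can exceed the range of a 32-bit int even when each operand is comfortably in range.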
Marco d7515b1187 vp9: Add high source sad to content state.
Use it to limit NEWMV early exit in nonrd pickmode

Small change in RTC metrics, has some improvement
for high motion clips.
Change-Id: I1d89fd955e1b3486d5fb07f4472eeeecd553f67f
2017-06-21 20:57:17 -07:00
Marco Paniconi 33a9394eb1 Merge "vp9: Adjustments for aq-mode and pickmode for speed >= 8." 2017-06-22 03:27:47 +00:00
James Zern dd88bd87db datarate_test: rename thread -> Thread in test name
this is consistent with other threaded tests and ensures gtest_filters
meant to operate on these pick them up

Change-Id: I99ce53720553a22c4b9905a2882273c2be2c031b
2017-06-21 20:05:31 -07:00
James Zern 828a1fa6de Merge "vp8_dx_iface: clear -Wclobbered warnings" 2017-06-22 02:01:11 +00:00
James Zern 7c0788b07f onyxd.h: add vp8dx_references_buffer prototype
quiets -Wmissing-prototypes

Change-Id: I6bee535f3fb67e54a390266d787a5a92127aeadc
2017-06-21 19:00:15 -07:00
James Zern 44418c659f vp[89],vpx_dsp: add missing includes
quiets -Wmissing-prototypes

Change-Id: I841cfc019d592f2bc6b3fec5818051a31f4c53b5
2017-06-21 19:00:15 -07:00
James Zern b24ed95f44 vp8,encodeframe.h: correct prototypes
+ add missing include
quiets -Wmissing-prototypes

Change-Id: I64af0368ba3d7f1d4de22a5887b631bb2cf15b8a
2017-06-21 19:00:15 -07:00
James Zern b093d998fc vp8: add temporal_filter.h
quiets -Wmissing-prototypes

Change-Id: Iffa77467720affe030de5335e9335232b9e70af1
2017-06-21 19:00:15 -07:00
James Zern eb8226b903 add picklpf.h
quiets -Wmissing-prototypes

Change-Id: Ic24164aa1f86fe99a493a633d64606e6f44ecdc1
2017-06-21 19:00:14 -07:00
James Zern 864bc77e7a add ethreading.h
quiets -Wmissing-prototypes in encodeframe.c

Change-Id: Ic216d0bdd6130eac44f2183639a715b2f1088ebe
2017-06-21 19:00:14 -07:00
James Zern d5d6a609d0 vp8,bitstream.h: add missing prototypes
quiets -Wmissing-prototypes

Change-Id: I835a80eddca2b16280780e18558c321df3272c43
2017-06-21 19:00:14 -07:00
James Zern 1d86383512 vp8: remove vp8_fast_quantize_b_mmx
and vp8_fast_quantize_b_impl_mmx; this was never enabled in rtcd.
an sse2 version exists, so there isn't much reason to keep an mmx
implementation around.

Change-Id: I8b3ee7f46ba194ffa0d0a6225a0f299f2a4dea90
2017-06-21 19:00:14 -07:00
James Zern 18335f193d vp8,loopfilter_filters: make some functions static
quiets -Wmissing-prototypes

Change-Id: Ie5b00537f64a05e68a38dc558463691523988994
2017-06-21 19:00:14 -07:00
James Zern 07f847873b vp9_ratectrl: make adjust_gf_boost_lag_one_pass_vbr static
quiets -Wmissing-prototypes

Change-Id: I72d899c2d8de1ddc52d90ac081f2629374b3a6e9
2017-06-21 19:00:14 -07:00
James Zern 9a329b5285 vp9_encodeframe: make scale_part_thresh_sumdiff static
quiets -Wmissing-prototypes

Change-Id: I696223d75860edba13c6b6f38c1f8db353a6f812
2017-06-21 19:00:14 -07:00
James Zern 3f296533f6 vp9_alt_ref_aq: correct vp9_alt_ref_aq_create proto
quiets -Wmissing-prototypes

Change-Id: Ib2d4f294f1982739bb2ac98155e789e040d309a1
2017-06-21 19:00:04 -07:00
James Zern 9e1d2de67c highbd_quantize_fp_32x32: normalize abs_qcoeff type
use an int to quiet an unsigned rollover warning similar to:
25110f283 Fix an ubsan warning: vp9_quantizer.c

Change-Id: Iedecb79a17249bc18f10c0920f88cf704920f12b
2017-06-21 18:56:10 -07:00
Marco 21afafa31a vp9: Put skin detection usage around cpi flag.
Skin detection usage in choose_partitioning should be
around the cpi->use_skin_detection.

Change-Id: I6986179af9ce94c60c0974d66c311fc07cc04cfe
2017-06-21 17:32:56 -07:00
Marco 8cf6f78fce vp9: Adjustments for aq-mode and pickmode for speed >= 8.
Adjust the threshold for turning off cyclic refresh for high motion,
and avoid testing golden in nonrd pickmode for speed >= 8 if
golden refresh was long ago.

No change/neutral on RTC metrics.
Change-Id: I40959b8d9637f3553e7458bbabd8c6024c2c09c0
2017-06-21 16:01:24 -07:00
James Zern fbba31e241 vp8_dx_iface: clear -Wclobbered warnings
with gcc 6.x

Change-Id: Ib2070421603a6777892d4ea01f4b0921696f38b3
2017-06-21 15:09:58 -07:00
Johann Koenig 355432b0d2 Merge "dct tests: align InvAccuracyCheck buffers" 2017-06-21 21:16:23 +00:00
Linfeng Zhang 466b667ff3 Clean vpx_idct16x16_256_add_sse2()
Remove macro IDCT16 which is redundant with idct16_8col().

Change-Id: I783c5f4fda038a22d5ee5c2b22e8c2cdfb38432c
2017-06-21 13:47:15 -07:00
Linfeng Zhang 42522ce0b7 Update vpx_idct{8x8,16x16,32x32}_1_add_sse2()
Change-Id: I365f8e53d9ccd028cef0f561d4de9e5916278609
2017-06-21 13:47:05 -07:00
Linfeng Zhang 2b43a1ee18 Clean 32x32 full idct sse2 and ssse3 code
vpx_idct32x32_1024_add_ssse3() is actually a sse2 function and faster
than vpx_idct32x32_1024_add_sse2(). Replace the slow one. All are
code relocations, no new code.

Change-Id: I5dac0e98cc411a4ce05660406921118986638d19
2017-06-21 13:46:49 -07:00
Hui Su 96ec8a425b Merge "VP9 level targeting: properly handle max_gf_interval" 2017-06-21 20:38:45 +00:00
Johann 1c48915233 dct tests: align InvAccuracyCheck buffers
'in' is used for the reference fdct. 'coeff' is input to the idct being
tested and 'dst[16]' is output

Fixes a segfault on unaligned memory access on x86.

Change-Id: I3691b1380ed49986897dd89a63ce63a80a0e0962
2017-06-21 11:47:00 -07:00
James Zern 0aa3677d9d fix build, rm ref to vpx_idct8x8_64_add_ssse3
this was deleted in:
98967645a Remove vpx_idct8x8_64_add_ssse3()

but this was merged in:
9e03eedf6 Merge changes Ib26dd515,Ie60dabc3

after:
a92991133 Merge "dct tests: run all possible sizes in one test"

which added a new reference

Change-Id: I8da4a6c80d27b237a378ff15eead1daab89e7e25
2017-06-20 19:46:45 -07:00
Linfeng Zhang 9e03eedf62 Merge changes Ib26dd515,Ie60dabc3
* changes:
  Clean 8x8 idct x86 optimization
  Remove vpx_idct8x8_64_add_ssse3()
2017-06-21 00:38:25 +00:00
hui su d96ed96c0f VP9 level targeting: properly handle max_gf_interval
Don't override max_gf_interval if it's not specified. It will
be assigned with a default value in vp9_rc_set_gf_interval_range().

BUG=b/62803416

Change-Id: Ide46ce00279ed076865fc54ce98c55a994f0c798
2017-06-20 16:29:04 -07:00
Marco 492d52b9cc vp9: Adjust key-frame pars in vpx_temporal_svc_encoder.
Sample encoder change: reduce max-intra-rate to 1000 and
buf-initial to 600. Parameters affect target size of key frame.

Change-Id: I2be6bc2927f5fa74e19e1efa3fb574d23a503300
2017-06-20 12:22:03 -07:00
Marco ae3a173352 vp9: Adjust key-frame pars in vpx_temporal_svc_encoder.
Sample encoder change: reduce max-intra-rate to 1500 and
buf-initial to 700. Parameters affect target size of key frame.

Change-Id: I01e238378b63eeef28dfc2178baadffcd3cc7561
2017-06-20 09:08:13 -07:00
Johann Koenig a929911339 Merge "dct tests: run all possible sizes in one test" 2017-06-20 15:04:25 +00:00
Marco Paniconi 737aa5c9e4 Merge "vp9: SVC: Rework the usage of base_mv for SVC." 2017-06-20 03:08:32 +00:00
Marco b55240057f vp9: Adjust key-frame pars in vpx_temporal_svc_encoder.
Adjust some parameters in sample encoder: vpx_temporal_svc_encoder.
Parameters adjusted to set lower QP for initial key frame,
and allow for larger target size on subsequent key frames.

Change-Id: I092ad968e5b51b9f495dadb6ee96e810663c910e
2017-06-19 18:29:39 -07:00
Marco Paniconi 782aacc3d3 Merge "vp9: Speed >= 8: Adjust resolution threshold for subpel." 2017-06-19 23:45:35 +00:00
Johann 4ebb9a36f1 dct tests: run all possible sizes in one test
Modify fdct4x4_test.cc to support all size combinations. This does not
add any new tests and in fact fails a few. There were minimal changes
made to the tests so it's not entirely surprising that some of the
larger 12 bit transforms are failing since it was initially only used
for 4x4.

In follow up patches the tests in fdct8x8_test.cc, dct16x16_test.cc and
dct32x32_test.cc will be evaluated and moved to dct_test.cc.

BUG=webm:1424

Change-Id: I72a23430f457d7fae8c91e706adc0e77c25abc8f
2017-06-19 15:39:35 -07:00
James Zern ed56ddfef8 Merge "libs.mk: retry partial testdata download" 2017-06-19 22:15:06 +00:00
James Zern ada640a508 libs.mk: retry partial testdata download
attempt retry on transient failures uncaught by --retry

Change-Id: I7cd8846ff88daf0f521af9ee182e30bfd79f51f3
2017-06-19 14:40:39 -07:00
Marco ff7fb4b280 vp9: Speed >= 8: Adjust resolution threshold for subpel.
Get some quality gain on RTC metrics (~7%), with
~5-8% speed slowdown.

Change-Id: I0d02942a77074424ee0326b6e110ddff09f2df5e
2017-06-19 13:58:08 -07:00
Jerome Jiang bf41a982b4 Merge "Enable 8x8 skin detection for vp8." 2017-06-19 16:42:14 +00:00
Marco 112cd95507 vp9: SVC: Rework the usage of base_mv for SVC.
Set the base_mv_aggressive for temporal enhancement layers (TL > 0).
Under the aggressive mode, skip the NEWMV depending on the
SSE of the base_mv. Also reduce the subpel motion to 1/2 under
aggressive mode if base_mv is good.

Speedup ~3% with small/negligible loss in quality on RTC.
Affects speed >= 6.

Change-Id: I89341b279cad6da2a04b76d5e726016191dacdb8
2017-06-18 22:35:46 -07:00
James Zern 31d6ba9a54 tiny_ssim: make some functions static
quiets -Wmissing-prototypes

Change-Id: If2e77c921b2fba456ed8d94119773e360d90b878
2017-06-16 15:36:32 -07:00
James Zern 038522e4a0 Merge "configure: test for -Wparentheses-equality" 2017-06-16 20:07:52 +00:00
Jerome Jiang a36017e007 Enable 8x8 skin detection for vp8.
If 2 or more 8x8 blocks are identified as skin, the macroblock will be
labeled as skin.

Change-Id: I596542c81a2df9e96270cab39d920bbfeb02bc6e
2017-06-15 20:53:03 -07:00
James Zern 27c2185954 configure: test for -Wparentheses-equality
Change-Id: I36de79c58461907deaea70d6131da9119bc0bc69
2017-06-15 16:05:20 -07:00
Linfeng Zhang c7e4917e97 Clean 8x8 idct x86 optimization
Create load_buffer_8x8() and write_buffer_8x8().

Change-Id: Ib26dd515d734a5402971c91de336ab481b213fdf
2017-06-15 14:30:00 -07:00
Linfeng Zhang 98967645a1 Remove vpx_idct8x8_64_add_ssse3()
It's almost identical to vpx_idct8x8_64_add_sse2(), except for a small
difference in instruction order.

Change-Id: Ie60dabc35eaa6ebae7c755e6cff00a710aad284f
2017-06-15 14:09:33 -07:00
Urvang Joshi a4ea7e131b VP9: Add greedy version of av1_optimize_b().
This was ported from the greedy version in AV1, written by Dake He
(dkhe@google.com).
See:
https://aomedia.googlesource.com/aom/+/master/av1/encoder/encodemb.c#137

Greedy version is disabled by default, but can be picked by setting
USE_GREEDY_OPTIMIZE_B to 1.
To be enabled by default later.

This is both faster and better in terms of compression.

Compression Improvement:
------------------------
lowres: -0.119
midres: -0.064
hdres:  -0.405

Speed Improvement:
------------------
(Based on encode time of 3 videos of different difficulties at
3 different target bitrates)
With --cpu-used=0: 0.38% to 5.55% faster
With --cpu-used=1: 0.24% to 2.79% faster
With --cpu-used=2: 0.29% to 1.46% faster

Change-Id: Ia7a23b3b244ad8eb253ac9e43cd03c5e021d2635
2017-06-15 11:19:08 -07:00
Linfeng Zhang 8d391a111a Merge changes Ibf9d120b,I341399ec,Iaa5dd63b,Id59865fd
* changes:
  Update high bitdepth load_input_data() in x86
  Clean array_transpose_{4X8,16x16,16x16_2} in x86
  Remove array_transpose_8x8() in x86
  Convert 8x8 idct x86 macros to inline functions
2017-06-15 17:57:50 +00:00
Marco Paniconi 8b48f68c0d Merge "vp8: Adjust the pred_err threshold for drop on overshoot." 2017-06-14 15:59:55 +00:00
Linfeng Zhang 6da6a23291 Update high bitdepth load_input_data() in x86
BUG=webm:1412

Change-Id: Ibf9d120b80c7d3a7637e79e123cf2f0aae6dd78c
2017-06-13 16:53:53 -07:00
Linfeng Zhang d6eeef9ee6 Clean array_transpose_{4X8,16x16,16x16_2} in x86
Change-Id: I341399ecbde37065375ea7e63511a26bfc285ea0
2017-06-13 16:50:44 -07:00
Linfeng Zhang 9c72e85e4c Remove array_transpose_8x8() in x86
Duplicate of transpose_16bit_8x8()

Change-Id: Iaa5dd63b5cccb044974a65af22c90e13418e311f
2017-06-13 16:50:44 -07:00
Linfeng Zhang cbb991b6b8 Convert 8x8 idct x86 macros to inline functions
Change-Id: Id59865fd6c453a24121ce7160048d67875fc67ce
2017-06-13 16:50:43 -07:00
James Zern 4f9d852759 vp8_skin_detection: add 'vp8_' prefix to public fns
BUG=webm:1438

Change-Id: I5feb31c254d02e116e624cfe702e73ba5a1f7aca
2017-06-12 20:13:28 -07:00
James Zern 98666368ee rename vp8/common/skin_detection.[hc] -> vp8_*
some build systems have trouble with duplicate basenames.
vpx_dsp/skin_detection.[hc] were added in:
658e85425 Merge skin detection code in vp8/9.

BUG=webm:1438

Change-Id: Ieaa70b40bda409ec23e6d179b47a930ac6243b05
2017-06-12 20:13:23 -07:00
Marco b6e1bdfc76 vp8: Adjust the pred_err threshold for drop on overshoot.
Change-Id: Ica2a09ac87160936b6f7bd01f167f464ea3ac41c
2017-06-12 09:54:16 -07:00
Hui Su 21e1661b54 Merge "vp9 level targeting: more strict constraint on min_gf_interval" 2017-06-12 16:38:02 +00:00
Jerome Jiang a46bc0268b Merge "Remove duplication on vp8/9_write_yuv_frame." 2017-06-10 04:50:19 +00:00
Marco e540ca7155 vp9: SVC: Use prune_evenmore only for non_reference.
Set subpel prune_evenmore only for non_reference frames,
instead of all TL > 0 frames. Gain some quality back at
cost of small speed loss (~1-2%).

Change only affects SVC encoding at speed >= 7.

Change-Id: I5b9f51e51dccfd7050521a66996176b0415ca3f9
2017-06-09 17:52:20 -07:00
Jerome Jiang ff2d220d21 Remove duplication on vp8/9_write_yuv_frame.
Change-Id: Ib3546032a27c715bf509c0e24d26a189bc829da8
2017-06-09 17:08:26 -07:00
Johann Koenig 6dcd9b37ea Merge "idct_test: don't use std::nothrow anymore" 2017-06-09 20:42:39 +00:00
Johann Koenig 8aa4ee1f10 Merge "buffer.h: allow declaring an alignment" 2017-06-09 20:42:21 +00:00
Johann Koenig 65f4299d65 Merge "Remove some dead code. Coverity CID 1310058" 2017-06-09 20:41:57 +00:00
Johann 92373a5bb2 idct_test: don't use std::nothrow anymore
But still check for NULL before calling Init()

Change-Id: I2bf2887e1064c9103d29c542d20365c0aea75d76
2017-06-09 11:09:06 -07:00
Johann 5aee8ea752 buffer.h: allow declaring an alignment
x86 simd register operations generally prefer and may require 16 byte
alignment.

Change-Id: I73ce577a90dc66af60743c5727c36f23200950ba
2017-06-09 11:03:15 -07:00
Sylvestre Ledru c12d1d9b98 Remove some dead code. Coverity CID 1310058
Change-Id: I1186cf1dd8cde42f5970928f43edfc852298289d
2017-06-09 17:56:38 +00:00
James Zern b3a262dff3 Merge "vp8_decode_frame: fix oob read on truncated key frame" 2017-06-08 23:17:50 +00:00
James Zern 45daecb4f7 vp8_decode_frame: fix oob read on truncated key frame
the check for error correction being disabled was overriding the data
length checks. this avoids returning incorrect information (width /
height) for the decoded frame which could result in inconsistent sizes
returned to an application causing it to read beyond the bounds of
the frame allocation.

BUG=webm:1443
BUG=b/62458770

Change-Id: I063459674e01b57c0990cb29372e0eb9a1fbf342
2017-06-08 23:16:04 +00:00
Johann e50ea014c3 Revert "buffer.h: use size_t"
This reverts commit f08581c1d0.

type conversion warnings abound.

Change-Id: I41d4c0e7a388e1008bdbc55fefda4bbca3f89f00
2017-06-08 10:20:21 -07:00
Jerome Jiang 943f9ee25c Merge "Merge skin detection code in vp8/9." 2017-06-08 16:36:00 +00:00
Johann Koenig 903375a48a Merge "fdct16x16 neon optimization" 2017-06-08 15:19:36 +00:00
Jerome Jiang 658e854252 Merge skin detection code in vp8/9.
BUG=webm:1438

Change-Id: Ie3dc034c7dbb498a0b088a767b1936ddeed4df14
2017-06-07 21:20:34 -07:00
hui su 21d2273efa vp9 level targeting: more strict constraint on min_gf_interval
min_gf_interval should be no less than min_altref_distance + 1,
as the encoder may produce bitstream with alt-ref distance being
min_gf_interval - 1.

BUG=b/38450599

Change-Id: Ifb733daa643ebc668d1b23e1ce92db94b66dabe8
2017-06-07 17:40:25 -07:00
Johann eae7cf2368 fdct16x16 neon optimization
Roughly 2x speedup. Since the only change for HBD is to store(), the
improvement appears to hold there as well.

BUG=webm:1424

Change-Id: I15b813d50deb2e47b49a6b0705945de748e83c19
2017-06-07 14:59:55 -07:00
Marco Paniconi 9cea3a3c4e Merge "vp9: SVC: Enable simple_block_yrd for temporal layers." 2017-06-07 21:12:14 +00:00
Johann Koenig 0c4f74d129 Merge changes Iade45f69,I18d90658,Ieca3f1ef
* changes:
  buffer.h: add num_elements_
  buffer.h: zero-init all values
  buffer.h: use size_t
2017-06-07 19:20:16 +00:00
Marco 14d4718043 vp9: SVC: Enable simple_block_yrd for temporal layers.
Enable simple_block_yrd for temporal enhancement layers (TL > 0).
And remove block size condition for SVC mode.
Only affects speed >= 7 SVC.

Speedup ~3-4%.
avgPSNR regression on RTC for (3 spatial, 3 temporal) layers: ~1%.

Change-Id: Iff4fc191623b71c69cd373e7c0823385e7ac67ed
2017-06-07 11:41:50 -07:00
Johann 902d63759e buffer.h: add num_elements_
raw_size_ was being incorrectly computed and used

Change-Id: Iade45f69964c567ffb258880f26006a96ae5a30d
2017-06-07 11:31:20 -07:00
Johann 4a37e3e2a0 buffer.h: zero-init all values
Change-Id: I18d90658bcd4365d49adcadd6954090b3b399aa8
2017-06-07 11:27:26 -07:00
Johann f08581c1d0 buffer.h: use size_t
Change-Id: Ieca3f1ef23cd1d7b844ea3ecb054007ed280b04f
2017-06-07 11:24:27 -07:00
Marco 13b02a8efe vp9: SVC: Enable row-mt in sample encoder.
Change-Id: I4b51043cb3f5955efe947fe4685aed4a21adb8bd
2017-06-07 10:32:44 -07:00
James Zern ff42e04f9c Merge "ppc: Add vpx_sadnxmx4d_vsx for n,m = {8, 16, 32 ,64}" 2017-06-06 23:52:39 +00:00
Marco Paniconi 27b34a109d Merge "vp9: SVC: Adjust some speed settings for SVC speed >= 7." 2017-06-06 23:07:45 +00:00
Marco 7d2f5f8e9d vp9: SVC: Adjust some speed settings for SVC speed >= 7.
Keep the 1/4subpel for all frames, use SUBPEL_TREE_PRUNED_EVENMORE
for all temporal enhancement layer frames.

Change-Id: Ibc681acbb6fc75b7b3c57fc483fcb11d591dfc9a
2017-06-06 15:30:24 -07:00
Johann de4cb716ee buffer.h: split out init
Change-Id: Idfbd2e01714ca9d00525c5aeba78678b43fb0287
2017-06-06 15:02:50 -07:00
Johann 8659764a07 buffer.h: Use T for values
Change-Id: I2da4110e843b6e361028b921c24b6ca2ea9077d9
2017-06-06 12:05:14 -07:00
Jerome Jiang cf07d85809 Initialize cost_list all to INT_MAX.
It is initialized to be { INT_MAX, 0, ... } in ffe0f9b.
No effect on encoders.
Make it consistent with other initializations.

BUG=webm:1440

Change-Id: Ie2a180d93626b55914c8c4255e466a1986d2b922
2017-06-06 10:42:37 -07:00
James Zern 6df142e2ab vp9_mcomp,get_cost_surf_min: quiet conversion warning
visual studio will warn if a 32-bit shift is implicitly converted to 64.
in this case integer storage is enough for the result.
since:
f3a9ae5ba Fix ubsan failure in vp9_mcomp.c.

Change-Id: I7e0e199ef8d3c64e07b780c8905da8c53c1d09fc
2017-06-05 22:52:58 -07:00
Jerome Jiang 968a5d6bc2 Merge "Fix valgrind failure on uninitialized variables." 2017-06-06 03:47:31 +00:00
James Zern 4753c23983 Merge "ppc: Add vpx_sad64/32/16x64/32/16_avg_vsx" 2017-06-06 02:19:41 +00:00
Jerome Jiang ffe0f9b7fb Fix valgrind failure on uninitialized variables.
BUG=webm:1440

Change-Id: I7074e42bdfa8dd25f11bbb3f2ab1b41d6f4c12e4
2017-06-05 13:09:29 -07:00
Jerome Jiang f3a9ae5baa Fix ubsan failure in vp9_mcomp.c.
Change-Id: Iff1dea1fe9d4ea1d3fc95ea736ddf12f30e6f48d
2017-06-02 21:37:13 -07:00
Marco e30781ff80 vp9: SVC: Force subpel search off under certain conditions.
For SVC 1 pass non-rd mode:
Force subpel search off for SVC for non-reference frames
under motion threshold.

Add flag to svc context to indicate if the frame is not used
as a reference.

Little/no quality loss, ~2% speedup.

Change-Id: Ic433c44b514d19d08b28f80ff05231dc943b28e9
2017-06-01 20:48:52 -07:00
Marco Paniconi ff637d1903 Merge "vp9: Speed >8: Set subpel_search_method for low motion." 2017-06-01 23:57:19 +00:00
Marco 8c6fa5c5e3 vp9: Speed >8: Set subpel_search_method for low motion.
Speed >=8: for resolutions above CIF, and for low motion content,
set subpel_search_method to SUBPEL_TREE_PRUNED_EVENMORE.

Small speed gain (~2%) on vga clips,
RTC metrics up by ~2-3% on average.

Change-Id: Ie26ba0264589652f92dfe74308740debf94cf0cc
2017-06-01 16:16:13 -07:00
Jerome Jiang 68f035026f vp8 skin detection: Fix visual studio build failure.
Change-Id: I510b755550ebbfa2aaf9b974920d7f1c6454a845
2017-06-01 13:46:46 -07:00
Jerome Jiang e254969df2 Fix corruption in skin map debugging output yuv.
For both vp8 and vp9.

BUG=webm:1437

Change-Id: Ifd06f68a876ade91cc2cc27c574c4641b77cce28
2017-06-01 16:59:43 +00:00
Jerome Jiang f1a300acc4 vp8: Clean up skin detection.
Use only the average of center 2x2 pixels in vp8.

Change-Id: I2b23ff19a90827226273e0fca49e90c734eda59b
2017-05-31 14:57:10 -07:00
Johann Koenig 755b3daf90 Merge "comp_avg_pred neon: used by sub pixel avg variance" 2017-05-31 18:17:28 +00:00
Jerome Jiang 32d8992147 Merge "Write skin map of vp8 skin detection for debug." 2017-05-31 16:37:07 +00:00
Linfeng Zhang 30ea3ef283 Merge "Update vpx_highbd_idct4x4_16_add_sse2()" 2017-05-31 15:56:20 +00:00
Johann f695b30ac2 comp_avg_pred neon: used by sub pixel avg variance
BUG=webm:1423

Change-Id: I33de537f238f58f89b7a6c1c2d6e8110de4b8804
2017-05-30 22:47:34 +00:00
Jerome Jiang c39526da8a Write skin map of vp8 skin detection for debug.
Change-Id: Ica1b4e918aa759cd0ce65920f9d88452bbf9e3b4
2017-05-30 10:30:05 -07:00
Linfeng Zhang 45048dc9dc Update vpx_highbd_idct4x4_16_add_sse2()
BUG=webm:1412

Change-Id: I26e4b34ae9bc1ae80c24f56d740d737a95f1ab84
2017-05-30 09:25:30 -07:00
Johann Koenig b9649d2407 Merge "comp_avg_pred: alignment" 2017-05-30 16:21:05 +00:00
Johann Koenig 48c0e13286 Merge "remove DECLARE_ALIGNED from neon code" 2017-05-30 15:58:17 +00:00
Johann ea8b4a450d comp_avg_pred: alignment
x86 requires 16 byte alignment for some vector loads/stores.

arm does not have the same requirement.

The asserts are still in avg_pred_sse2.c. This just removes them from
the common code.

Change-Id: Ic5175c607a94d2abf0b80d431c4e30c8a6f731b6
2017-05-30 07:46:43 -07:00
Jerome Jiang a5ab38093f Merge "Fix vp8 race when build --enable-vp9-highbitdepth." 2017-05-30 05:47:44 +00:00
Johann 42ce25821d remove DECLARE_ALIGNED from neon code
Unlike x86, neon only requires type alignment when loading into vectors.

Change-Id: I7bbbe4d51f78776e499ce137578d8c0effdbc02f
2017-05-26 10:41:57 -07:00
Johann Koenig 2693b89c19 Merge "subpel variance neon: reduce stack usage" 2017-05-26 17:25:47 +00:00
Johann Koenig 47174d60c8 Merge "Use vdup instead of vmov" 2017-05-26 17:25:24 +00:00
Jerome Jiang 0afa2dad76 Fix vp8 race when build --enable-vp9-highbitdepth.
Split vp8/vp9 implementations on yv12_copy_frame_c.
Remove high-bitdepth codes from vp8_yv12_extend_frame_borders_c.
Clean up vp8 codes usage in vp9.

BUG=webm:1435

Change-Id: Ic68e79e9d71e1b20ddfc451fb8dcf2447861236d
2017-05-26 09:45:01 -07:00
Marco 146005a911 vp9: SVC: Fix to condition on using source_sad.
Fix the condition on usage of source_sad for temporal layers.
Fix allows it to be used for the case of 1 temporal layer.

Change-Id: I02b1b0ade67a7889d1b93cee66d27c0951131fc3
2017-05-26 08:46:50 -07:00
Marco Paniconi 9ec9415fd9 Merge "vp9: Use source_sad only on top temporal enhancement layer." 2017-05-26 05:24:06 +00:00
Marco Paniconi 4be18ab295 Merge "vp9: SVC: Enable copy partition for SVC speed >= 7." 2017-05-26 05:23:47 +00:00
Marco ea914456af vp9: Use source_sad only on top temporal enhancement layer.
For 1 pass CBR SVC mode.

Change-Id: Ic026740f9d0ec5eee7c5845be9c5b15884fec48d
2017-05-25 16:32:05 -07:00
Jerome Jiang 327c9bb1da Refactor: Move vp8 skin detection to new files.
Change-Id: If760f28cbbf22beac1cc9bd1546f13831e9dd3f0
2017-05-25 16:12:27 -07:00
Marco 747cf7a505 vp9: SVC: Enable copy partition for SVC speed >= 7.
Adjust the max_copied_frame setting for temporal layers.
Keep the same setting for non-SVC at speed 8.
This change also enables copy_partition for non-SVC at speed 7,
but with smaller value of max_copied_frame (=2).

~2% speedup for SVC speed 7, 3 layers, with little/no quality loss.

Change-Id: Ic65ac9aad764ec65a35770d263424b2393ec6780
2017-05-25 12:21:46 -07:00
Johann f3c97ed32e subpel variance neon: reduce stack usage
Unlike x86, arm does not impose additional alignment restrictions on
vector loads. For incoming values to the first pass, it uses vld1_u32()
which typically does impose a 4 byte alignment. However, as the first
pass operates on user-supplied values we must prepare for unaligned
values anyway (and have, see mem_neon.h).

But for the local temporary values there is no stride and the load will
use vld1_u8 which does not require 4 byte alignment.

There are 3 temporary structures. In the C, one is uint16_t. The arm
saturates between passes but still passes tests. If this becomes an
issue new functions will be needed.

Change-Id: I3c9d4701bfeb14b77c783d0164608e621bfecfb1
2017-05-24 13:28:13 -07:00
Johann d204c4bf01 Use vdup instead of vmov
Change-Id: Idb6248c1429b55176bb3e9f4e8365ea0ed2be62a
2017-05-24 11:38:15 -07:00
Johann Koenig de1a9c77a7 Merge changes Iaab2b9a1,Idfb458d3
* changes:
  sub pel avg variance neon: 4x block sizes
  sub pel variance neon: 4x block sizes
2017-05-24 18:33:53 +00:00
Johann Koenig b11a37f540 Merge changes I31fa6ef8,I228c6f29
* changes:
  sub pel avg variance neon: add neon optimizations
  sub pel variance neon: normalize variable names
2017-05-24 18:32:02 +00:00
James Zern f0279ceb92 Merge "partial_idct_test,InitInput: fix rollover in mult" 2017-05-24 16:27:21 +00:00
James Zern 566f6d75bd partial_idct_test,InitInput: fix rollover in mult
promote coeff to signed 64-bit to avoid exceeding integer bounds when
squaring the value

Change-Id: If77bef6bc0a6a4c39ca3013e5e2ddb426a1c6e1f
2017-05-24 15:27:38 +02:00
Alexandra Hájková 8bf6eaf433 ppc: Add vpx_sadnxmx4d_vsx for n,m = {8, 16, 32 ,64}
Change-Id: I547d0099e15591655eae954e3ce65fdf3b003123
2017-05-24 13:27:09 +00:00
Linfeng Zhang 6444958f62 Update inv_txfm_sse2.h and inv_txfm_sse2.c
Extract shared code into inline functions.

Change-Id: Iee1e5a4bc6396aeed0d301163095c9b21aa66b2f
2017-05-23 14:54:46 -07:00
Linfeng Zhang 36f1b183e4 Update InitInput() in test/partial_idct_test.cc
Make it work in high bit depth.

BUG=webm:1412

Change-Id: Ic5cfd410a69709f01e2924774356a108a349d273
2017-05-23 14:24:23 -07:00
Gregor Jasny bcfd9c9750 Add support for Visual Studio 2017
BUG=webm:1428

Change-Id: Iba98aef1159724d106cf39b94d7b69843d76cd48
2017-05-23 11:32:27 +02:00
Johann f6fcd3410d sub pel avg variance neon: 4x block sizes
BUG=webm:1423

Change-Id: Iaab2b9a183fdb54aae5f717aba95d90dc36a9e3b
2017-05-22 14:40:05 -07:00
Johann 188d58eaa9 sub pel variance neon: 4x block sizes
Add optimizations for blocks of width 4

BUG=webm:1423

Change-Id: Idfb458d36db3014d48fbfbe7f5462aa6eb249938
2017-05-22 14:40:01 -07:00
Johann 9b0d306a2f sub pel avg variance neon: add neon optimizations
These are missing an optimized version of vpx_comp_avg_pred

BUG=webm:1423

Change-Id: I31fa6ef842e98f7ff3ea079ffed51ae33178e2ed
2017-05-22 13:58:43 -07:00
Johann e0d294c3af sub pel variance neon: normalize variable names
match vpx_dsp/variance.c variable names

Change-Id: I228c6f296c183af147b079b7c8bcdf97bd09cf3a
2017-05-22 13:58:43 -07:00
Linfeng Zhang 27beada6d0 Merge "Add vpx_highbd_idct{4x4,8x8,16x16}_1_add_sse2" 2017-05-22 20:58:18 +00:00
Johann 67ac68e399 variance neon: assert overflow conditions
Change-Id: I12faca82d062eb33dc48dfeb39739b25112316cd
2017-05-22 11:25:06 -07:00
Linfeng Zhang c167345ffb Add vpx_highbd_idct{4x4,8x8,16x16}_1_add_sse2
BUG=webm:1412

Change-Id: Ia338a6057d36f9ed7eaa9cbd4dfbf0c3cbdc6468
2017-05-22 11:24:21 -07:00
Johann d217c87139 neon variance: special case 4x
The sub pixel variance uses a temp buffer which guarantees width ==
stride. Take advantage of this with the 4x and avoid the very costly
lane loads.

Change-Id: Ia0c97eb8c29dc8dfa6e51a29dff9b75b3c6726f1
2017-05-22 10:51:31 -07:00
Johann Koenig e7cac13016 Merge changes Ib8dd96f7,Ie9854b77
* changes:
  neon variance: process 4x blocks
  use memcpy for unaligned neon stores
2017-05-22 17:48:33 +00:00
Marco Paniconi b3bf91bdc6 Merge "vp9: Adjustments to cyclic refresh for high motion." 2017-05-22 06:27:30 +00:00
Marco 2adc0443dd vp9: Adjustments to cyclic refresh for high motion.
For aq-mode=3: refactor the condition for turning off
the refresh. Add some adjustments for high motion content.

No/little change in RTC metrics, only affects high motion case.

Change-Id: I7da8eabfb0e61db014be4562806f72ee5ef4a43b
2017-05-21 22:21:44 -07:00
Marco ff9395eb3b vp9: Speed >= 8: Modify condition for low-resoln.
No change on RTC metrics.

Change-Id: I5abc573cb56572188d900645d13ba479f55a1ea0
2017-05-21 22:14:38 -07:00
Johann Koenig b5055002d7 Merge "neon 4 byte helper functions" 2017-05-19 17:11:30 +00:00
Johann Koenig 3c603eadb4 Merge "neon fdct: 4x4 implementation" 2017-05-19 17:08:58 +00:00
Paul Wilkins a7977ece93 Merge "Changes to modified error." 2017-05-19 12:24:32 +00:00
Marco 1205e3207e vp9: SVC: Modify condition to allow for copy partition.
When temporal layers are used, only allow for copy partition
on the top temporal enhancement layer frames.

Change-Id: I5472abdc0f9f6c8dafa75a7a84c615e08ae22af8
2017-05-18 14:19:31 -07:00
Jerome Jiang 6b6ff9c969 Merge "vp9: Make copy partition work for SVC and dynamic resize." 2017-05-18 19:37:30 +00:00
Marco 2ba4729ef8 vp9: Make copy partition work for SVC and dynamic resize.
Only affects speed 8.

Make changes to copy partition to fix a bug in setting microblock
offset. Avg PSNR shows 0.02% gain on rtc_derf and 0.08% loss on rtc.

Change-Id: I61c3e5914dde645331344388e7437e5638acd4f3
2017-05-18 11:33:56 -07:00
paulwilkins 5680b4517f Changes to modified error.
The modified error was a derivative of the "coded_error"
that was used to allocate bits between different frames on the
assumption that the allocation should be linear in terms of this
modified error.  I.e. a frame with double the modified error score
should all things being equal get double the number of bits. The
code also included upper and lower caps derived from input
VBR parameters.

This patch improves the initial calculation of the clip mean error
(now called "mean_mod_score" as it is no longer a prediction error)
used as the midpoint for the rate distribution function and normalizes
the output "modified scores" such that 1.0 indicates a frame
in the middle of the distribution.  The VBR upper and lower caps are
then applied directly to a frame's normalized score.

This refactoring is intended to make it easier to drop in alternative
distribution functions or to base the rate allocation on a corpus wide
midpoint (rather than the clip mean).

Change-Id: I4fb09de637e93566bfc4e022b2e7d04660817195
2017-05-18 12:56:02 +01:00
Johann 7b742da63e neon variance: process 4x blocks
Continue processing sets of 16 values. Plenty of improvement for 4x8
(doubles the speed) but only about 30% for 4x4.

BUG=webm:1422

Change-Id: Ib8dd96f75d474f0348800271d11e58356b620905
2017-05-17 17:35:01 -07:00
Johann 2057d3ef75 use memcpy for unaligned neon stores
Advise the compiler that the store is eventually going to a uint8_t
buffer. This helps avoid getting alignment hints which would cause the
memory access to fail.

Originally added as a workaround for clang:
https://bugs.llvm.org//show_bug.cgi?id=24421

Change-Id: Ie9854b777cfb2f4baaee66764f0e51dcb094d51e
2017-05-17 12:11:31 -07:00
Marco Paniconi a2dfbbd7d6 Merge "vp9: Modify ChangingDropFrameThresh unittest." 2017-05-17 18:42:51 +00:00
Linfeng Zhang 13918a9ccc Merge "Update partial idct testing code" 2017-05-17 17:53:03 +00:00
Yaowu Xu bde2c04fb7 Merge "Experiment. Store first pass errors as per MB values." 2017-05-17 17:38:15 +00:00
Marco 4733df333f vp9: Modify ChangingDropFrameThresh unittest.
Add another (lower) bitrate to the test, to cover
frame drop behavior at low bitrate range.

Change-Id: Iaad003974159daf3d2d65ef3a6575a3e72e498d6
2017-05-17 09:38:21 -07:00
Linfeng Zhang 3210ca6d60 Update partial idct testing code
Add PartialIDctTest::PrintDiff() to help debugging.
In RunQuantCheck, try all combinations of +/-mask_ input for 4x4 idct.
Update PartialIDctTest::InitInput().

Change-Id: I13fd163954a4c1a3a6cfeb5e4a4d3d0e7ff901f4
2017-05-17 09:28:32 -07:00
Johann 105503b839 neon fdct: 4x4 implementation
Approximately twice as fast as C implementation.

BUG=webm:1424

Change-Id: I3c0307fb08ddc23df42545cd089a78e2ed5c9d3f
2017-05-17 07:38:18 -07:00
paulwilkins 42e5073f94 Experiment. Store first pass errors as per MB values.
Most existing first pass stats are stored in a form normalized to a
macro-block scale. However the error scores for intra / inter etc were
stored as frame level values but mainly used as MB level values.

This change fixes that. Normalized per MB values make comparisons
between different formats easier and in any case this is usually what is
wanted.

Any change in results should be limited to slight differences in rounding.

*** Change after patch 8 +2 requiring new approval.

Final pre-submit testing showed one 4K clip with above expected change.
Investigation showed this was due to a value used to test for ultra low intra
complexity in key frame detection. This was a per frame not per MB value but
also did not scale with frame size. Replacement with a small per MB value
(based on original per frame value and cif frame size) resolved the KF detection
problem.

Also converted kf_group_error_left to a double in line with other error values
to reduce rounding problems in KF group bit allocation.

All clips and sets now show nominal (or 0) change as expected.

Change-Id: Ic2d57980398c99ade2b7380e3e6ca6b32186901f
2017-05-17 12:00:18 +01:00
Linfeng Zhang 18e8baa5c0 Add transpose_32bit_4x4() and rename transpose_4x4() for vpx_dsp/x86
Change-Id: Ib57377f6cf6573c04720d3cc5dea4285362b4220
2017-05-16 17:46:37 -07:00
Johann Koenig 31cb852a90 Merge "Revert "Add visibility="protected" attribute for global variables referenced in asm files."" 2017-05-16 23:39:37 +00:00
Johann Koenig 2300e16675 Revert "Add visibility="protected" attribute for global variables referenced in asm files."
This reverts commit 0d88e15454.

Reason for revert: chromium builds are failing to locate vpx_rv during dlopen()

dlopen failed: cannot locate symbol "vpx_rv" referenced by "libstandalonelibwebviewchromium.so"

Original change's description:
> Add visibility="protected" attribute for global variables referenced in asm files.
>
> During aosp builds with binutils-2.27, we're seeing linker error
> messages of this form:
> libvpx.a(subpixel_mmx.o): relocation R_386_GOTOFF against preemptible
> symbol vp8_bilinear_filters_x86_8 cannot be used when making a shared
> object
>
> subpixel_mmx.o is assembled from "vp8/common/x86/subpixel_mmx.asm".
> Other messages refer to symbol references from deblock_sse2.o and
> subpixel_sse2.o, also assembled from asm files.
>
> This change marks such symbols as having "protected" visibility. This
> satisfies the linker as the symbols are not preemptible from outside
> the shared library now, which I think is the original intent anyway.
>
> Change-Id: I2817f7a5f43041533d65ebf41aefd63f8581a452
>

TBR=jzern@google.com,johannkoenig@google.com,rahulchaudhry@chromium.org,builds@webmproject.org

Change-Id: I0c2ea375aa7ef5fda15b9d9e23e654bb315c941b
2017-05-16 15:54:33 -07:00
Marco Paniconi baef5486bf Merge "Revert "Revert "vp8: Real-time mode: reduce mode_check_freq thresh for speed 10.""" 2017-05-16 22:50:29 +00:00
Marco Paniconi 13d4a0d011 Revert "Revert "vp8: Real-time mode: reduce mode_check_freq thresh for speed 10.""
This reverts commit 3704807805.

Reason for revert: <INSERT REASONING HERE>
Does not look to be the cause of the test failures.

Original change's description:
> Revert "vp8: Real-time mode: reduce mode_check_freq thresh for speed 10."
> 
> This reverts commit 4a7424adba.
> 
> Reason for revert: <INSERT REASONING HERE>
> Possibly causing test failures in roll into chromium.
> 
> Original change's description:
> > vp8: Real-time mode: reduce mode_check_freq thresh for speed 10.
> > 
> > Reduces quality regression at speed 10 for real-time mode.
> > 
> > Change-Id: I9f624bea9ca262dab32ce9de7d6d91175d6becc8
> > 
> 
> TBR=marpan@google.com,builds@webmproject.org,jianj@google.com
> # Not skipping CQ checks because original CL landed > 1 day ago.
> 
> Change-Id: I1defcb74e78a5a3bd29b7d1b21a96a79fa26a457
> 

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true

Change-Id: I13d86a2a68b8aa8c0c7465e6e58cff0e00bc7862
2017-05-16 22:50:19 +00:00
Marco Paniconi b9987a7c25 Merge "Revert "vp8: Real-time mode: reduce mode_check_freq thresh for speed 10."" 2017-05-16 22:48:39 +00:00
Marco Paniconi 3704807805 Revert "vp8: Real-time mode: reduce mode_check_freq thresh for speed 10."
This reverts commit 4a7424adba.

Reason for revert: <INSERT REASONING HERE>
Possibly causing test failures in roll into chromium.

Original change's description:
> vp8: Real-time mode: reduce mode_check_freq thresh for speed 10.
> 
> Reduces quality regression at speed 10 for real-time mode.
> 
> Change-Id: I9f624bea9ca262dab32ce9de7d6d91175d6becc8
> 

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com
# Not skipping CQ checks because original CL landed > 1 day ago.

Change-Id: I1defcb74e78a5a3bd29b7d1b21a96a79fa26a457
2017-05-16 22:48:13 +00:00
Johann Koenig dac3b59721 Merge "'protected' visibility unsupported on macho" 2017-05-15 21:21:45 +00:00
Johann 7498fe2e54 neon 4 byte helper functions
When data is guaranteed to be aligned, use helper functions which
assert that requirement.

Change-Id: Ic4b188593aea0799d5bd8eda64f9858a1592a2a3
2017-05-15 13:42:31 -07:00
Johann 3fbc371e99 'protected' visibility unsupported on macho
Mac builds must not specify 'protected' visibility. They only support
'default' and 'hidden'.

https://developer.apple.com/library/content/documentation/DeveloperTools/Conceptual/CppRuntimeEnv/Articles/SymbolVisibility.html

Change-Id: I94eccfaa29af0ddcc4a5c1c0e14cf63ef7146462
2017-05-15 11:29:22 -07:00
Johann Koenig 8739a182c8 Merge "move neon load/stores to a new file" 2017-05-15 18:15:27 +00:00
Johann 1088b4f87c move neon load/stores to a new file
Move the tran_low_t helper functions to a new file. Additional
load/store functions will be added here.

Change-Id: I52bf652c344c585ea2f3e1230886be93f5caefc3
2017-05-15 08:29:43 -07:00
Marco 4a7424adba vp8: Real-time mode: reduce mode_check_freq thresh for speed 10.
Reduces quality regression at speed 10 for real-time mode.

Change-Id: I9f624bea9ca262dab32ce9de7d6d91175d6becc8
2017-05-14 18:19:06 -07:00
Alexandra Hájková bcbc3929ae ppc: Add vpx_sad64/32/16x64/32/16_avg_vsx
Change-Id: Ic9639b1331d8c5cbc207c2a036891ff0137fc56f
2017-05-13 13:13:15 +00:00
Jerome Jiang 6b9d130214 Merge "vp9: speed 8: Fix seg fault in partition copy when drop frames." 2017-05-13 03:20:49 +00:00
Cheng Chen 4c0655f26b Merge "Speed up encoding by skipping altref recode" 2017-05-13 01:29:59 +00:00
Jerome Jiang 1fcd5cca3c vp9: speed 8: Fix seg fault in partition copy when drop frames.
BUG=webm:1433

Change-Id: I4f3984ef28660d3218d48007d7c977bdbdaf8af6
2017-05-12 15:57:23 -07:00
Rahul Chaudhry 0d88e15454 Add visibility="protected" attribute for global variables referenced in asm files.
During aosp builds with binutils-2.27, we're seeing linker error
messages of this form:
libvpx.a(subpixel_mmx.o): relocation R_386_GOTOFF against preemptible
symbol vp8_bilinear_filters_x86_8 cannot be used when making a shared
object

subpixel_mmx.o is assembled from "vp8/common/x86/subpixel_mmx.asm".
Other messages refer to symbol references from deblock_sse2.o and
subpixel_sse2.o, also assembled from asm files.

This change marks such symbols as having "protected" visibility. This
satisfies the linker as the symbols are not preemptible from outside
the shared library now, which I think is the original intent anyway.

Change-Id: I2817f7a5f43041533d65ebf41aefd63f8581a452
2017-05-12 11:11:16 -07:00
Marco Paniconi 9a66582604 Merge "vp9: Use INTERP_FILTER for filter_type in vp9_rtcd_defs.pl" 2017-05-12 17:02:50 +00:00
James Zern ac8f58f6ab Merge changes I1b54a7a5,I3028bdad,I59788cd9
* changes:
  ppc: Add get_mb_ss_vsx
  ppc: Add get4x4sse_cs_vsx
  ppc: Add comp_avg_pred_vsx
2017-05-12 15:24:59 +00:00
Luca Barbato 143b21e362 ppc: Add get_mb_ss_vsx
Change-Id: I1b54a7a5bb642e4b836d786ea1ae506eed025e3f
2017-05-12 17:23:00 +02:00
Luca Barbato 6d225eb5f9 ppc: Add get4x4sse_cs_vsx
Change-Id: I3028bdadf653665d18e781d28e9625f62804b3d8
2017-05-12 17:23:00 +02:00
Luca Barbato a7f8bd451b ppc: Add comp_avg_pred_vsx
Change-Id: I59788cd98231e707239c2ad95ae54f67cfe24e10
2017-05-12 17:22:55 +02:00
Alexandra Hájková f48532e271 ppc: Add vpx_sad64x32/64_vsx
Change-Id: I84e3705fa52f75cb91b2bab4abf5cc77585ee3e2
2017-05-12 16:10:16 +02:00
Alexandra Hájková 0b15bf1e54 ppc Add vpx_sad32x16/32/64_vsx
Change-Id: I3c4f9d595275669580413a71b3c3c810e7ddcacd
2017-05-12 16:10:11 +02:00
James Zern a12ea1d5e9 Merge "ppc: Add vpx_sad16x8/16/32_vsx" 2017-05-12 13:33:51 +00:00
Marco Paniconi 629279a45c Merge "vp9: Adjust speed features for speed 8 at low resoln." 2017-05-12 00:35:40 +00:00
Marco Paniconi c64667c338 Merge "vp9: SVC: Increase the partiiton and acskip thresholds" 2017-05-11 23:37:32 +00:00
Marco Paniconi 37cdd3bfc2 Merge "vp9; Adjust noise estimation thresholds." 2017-05-11 21:58:40 +00:00
Marco c5c31b9eb6 vp9: SVC: Increase the partiiton and acskip thresholds
Increase the partition and acskip thresholds for temporal
enhancement layers.

~1-2% speedup, with negligible loss in quality.

Change-Id: Id527398a05855298ad9ddac10ada972482415627
2017-05-11 12:28:19 -07:00
Marco c5a4376aed vp9: SVC: allow for setting the interp_filter in non-rd pickmode.
For SVC 1 pass non-rd pickmode, the interpolation filter for the
upsampling of the golden (spatial) reference was not being explicitly
set and instead was taking whatever value was set in the previous
mode/block (which would be either EIGHTTAP or EIGHTTAP_SMOOTH).

Fix it to the default EIGHTTAP for now, to be updated/selected
adaptively in a later change.

Minor adjustment to rate targeting thresholds in datarate unittests.

Change-Id: I52085048674072c6cfb7163e11e9a2658d773826
2017-05-11 11:45:09 -07:00
Paul Wilkins 3caaf21c5b Merge "Tuning of factor used to calculate Q range in two pass." 2017-05-11 18:25:45 +00:00
Jerome Jiang d35541fe29 Merge "vp9: Fix ubsan failure in denoiser." 2017-05-11 16:38:59 +00:00
paulwilkins 9a7625652c Tuning of factor used to calculate Q range in two pass.
A more detailed explanation of the experimentation
leading to this change can be found in:-

https://docs.google.com/a/google.com/document/d/13lsYhxgPyxUHvEess6wg9nikaonIZKY9Ak_Lpafv5Mo/edit?usp=sharing

This change gives gains across all our standard test sets for
overall psnr, ssim, fast ssim and psnr-HVS.

Values expressed as % reduction in bitrate.

Low res set     -0.257, -0.192, -0.173, -0.101
Mid res set     -0.233, -0.336, -0.367, -0.139
High res set    -0.999, -1.039, -1.111, -0.567
Netflix 2K set  -0.734, -0.174, -0.389, -0.820
Netflix 4K set  -0.814, -0.485, -0.796, -0.839

Change-Id: Ie981fb3c895c9dfcfc8682640d201a86375db5c8
2017-05-11 16:19:59 +01:00
Cheng Chen 76567d84ce Speed up encoding by skipping altref recode
Speed up for speed 0.
Reduce 10+% of encoding time for hdres in speed 0,
with less than 0.1% PSNR loss.
Compute total difference of previous and current frame context probability
model. If the diff is less than the threshold, skip recoding the frame.
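
As a rough illustration of the check described above, the sketch below sums the absolute differences between two probability tables and compares the total against a threshold. The function names, the flat uint8_t layout and the threshold value are assumptions made for the example; they are not taken from the libvpx frame context code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences between two probability tables.
 * Names and layout are illustrative, not the libvpx frame context. */
static int total_prob_diff(const uint8_t *prev, const uint8_t *cur, size_t n) {
  int diff = 0;
  assert(prev != NULL && cur != NULL);
  for (size_t i = 0; i < n; ++i) diff += abs((int)cur[i] - (int)prev[i]);
  return diff;
}

/* Returns 1 when the two models are close enough to skip the recode. */
static int skip_recode(const uint8_t *prev, const uint8_t *cur, size_t n,
                       int thresh) {
  return total_prob_diff(prev, cur, n) < thresh;
}
```

A caller would run skip_recode() once per frame, after the new context has been computed and before deciding whether to encode the frame a second time.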

Borg test (positive number means performance loss):
		lowres    midres    hdres
PSNR:		0.030     0.032     0.065

Local speed test: bitrate set at 1200
		blue_sky  pedestrian  rush_hour
Encoding time:	 -10.0%     -16.5%      -16.5%

Change-Id: I4e2d200ea3115d48b2c3e890143596b31b8ef9e9
2017-05-10 22:15:01 -07:00
Marco Paniconi f7e767d8ee Merge "vp9: SVC: Fix setting in sample encoder." 2017-05-10 23:51:44 +00:00
Marco 2f11a65c99 vp9; Adjust noise estimation thresholds.
Change-Id: Ia41a11df18e5a58d2b8bbecd11c249d357de2a8f
2017-05-10 16:48:10 -07:00
Marco eaa6715b02 vp9: SVC: Fix setting in sample encoder.
For 1 spatial layer case, scaling_num/den was not set properly.

Change-Id: I139bf70c6dffde89eed24e435bcb5d98d2029bcd
2017-05-10 16:19:23 -07:00
Jerome Jiang 597d1f4c03 vp9: Fix ubsan failure in denoiser.
Fix the overflow for subtraction between two unsigned integers.
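
The message does not say how the subtraction was rewritten. A minimal sketch of two common ways to avoid unsigned wraparound is shown below; both helper names are hypothetical and do not come from the denoiser source.

```c
#include <assert.h>
#include <stdint.h>

/* With 32-bit unsigned operands, 3u - 5u wraps to 4294967294 instead of
 * producing -2. Both helpers below avoid that wraparound. */

/* Magnitude of the difference, computed without ever going below zero. */
static uint32_t abs_diff_u32(uint32_t a, uint32_t b) {
  return a > b ? a - b : b - a;
}

/* Signed difference, computed in a wider signed type. */
static int64_t signed_diff_u32(uint32_t a, uint32_t b) {
  return (int64_t)a - (int64_t)b;
}
```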

BUG=webm:1432

Change-Id: I7b665e93ba5850548810eff23258782c4f5ee15a
2017-05-10 13:43:17 -07:00
Linfeng Zhang 8477a66fc8 Merge "Update specializations of idct functions" 2017-05-10 20:31:13 +00:00
Alexandra Hájková cc7f0c0f3e ppc: Add vpx_sad16x8/16/32_vsx
Change-Id: I60619d28fffd9809f93b1af510a50e1aa02519a9
2017-05-10 19:57:30 +00:00
Linfeng Zhang 764b3b8090 Update specializations of idct functions
Commit 0178d97 introduced an append situation which could be
confusing. Clean it up a little and add some comments.

Change-Id: I69ad336f805aca7ce9d45515b8cd237423fadbb2
2017-05-10 12:51:18 -07:00
Jerome Jiang 2574573fea vp9: Wrap threshold tuning for HD only when denoiser is enabled.
Fixes a speed regression.

Change-Id: I23d942e4af17fa81fe4a366c7369b3ad537e59b0
2017-05-10 12:15:41 -07:00
Marco d3aebeee4e vp9: Use INTERP_FILTER for filter_type in vp9_rtcd_defs.pl
Change-Id: I259d152c62864b365490368051f3c3b7d7f2f1c5
2017-05-10 12:06:44 -07:00
Johann Koenig d713ec3c46 Merge changes I92eb4312,Ibb2afe4e
* changes:
  subpel variance neon: add mixed sizes
  sub pixel variance neon: use generic variance
2017-05-10 18:19:52 +00:00
Marco Paniconi db2fad7516 Merge "vp9: Adjustment to noise estimation." 2017-05-10 17:11:18 +00:00
Marco Paniconi fcd6e4a1c2 Merge "vp9: SVC: Add option to set downsampling filter type." 2017-05-10 17:10:52 +00:00
Marco 1b59964162 vp9: Adjustment to noise estimation.
When the noise estimate is forced off due to large motion,
reset the counter and set smaller window for next estimate.

Change-Id: Ifa4ec95396134173a00d48353ad52f1b6a40c217
2017-05-10 09:39:08 -07:00
Marco 4e23998fb4 vp9: SVC: Add option to set downsampling filter type.
Add option in SVC to set the filter type and phase for
the frame level downsampling filters.

For 3 spatial layers: set downsampling filter type to bilinear
and set phase to 8, for lowest spatial layer.

Change-Id: Id81f4b1ba93db19c1cd37b6a46d1281a2c61bc43
2017-05-09 17:22:44 -07:00
Linfeng Zhang 870cf4356c Update test/partial_idct_test.cc
Makes more sense to call the corresponding partial idct C function
instead of the full idct C function as the reference.

Change-Id: Ibb7681dd063edd6307ba582c10c26c4c6a4b78c6
2017-05-09 13:07:47 -07:00
Linfeng Zhang f532504864 Clean 32x32 idct C code
Change-Id: I73b8104a9e7a70ffe827c1b7ff43618f24f5d7bd
2017-05-09 11:05:51 -07:00
Linfeng Zhang ecd1eb2162 Update 4x4 idct sse2 functions
It's a bit faster to call idct4_sse2() in vpx_idct4x4_16_add_sse2()

Change-Id: I1513be7a895cd2fc190f4a8297c240b17de0f876
2017-05-08 16:16:52 -07:00
Marco Paniconi 8053dba5a1 Merge "vp9: SVC: Modify conditon for setting downsample filter type." 2017-05-08 21:45:58 +00:00
Marco 9586d5e682 vp9: SVC: Modify conditon for setting downsample filter type.
Base the condition on the resolution of the spatial layer.
And remove restriction on scaling factor.

Change-Id: Iad00177ce364279d85661654bff00ce7f48a672e
2017-05-08 14:13:49 -07:00
Johann f7d1486f48 neon variance: process 16 values at a time
Read in a Q register. Works on blocks of 16 and larger.

Improvement of about 20% for 64x64. The smaller blocks are faster, but
don't have quite the same level of improvement. 16x32 is only about 5%

BUG=webm:1422

Change-Id: Ie11a877c7b839e66690a48117a46657b2ac82d4b
2017-05-08 18:48:55 +00:00
Johann Koenig 1814463864 Merge changes Id602909a,Ib0e85608
* changes:
  neon variance: process two rows of 8 at a time
  neon variance: add small missing sizes
2017-05-08 17:34:20 +00:00
Linfeng Zhang 2c3a2ad6f1 Merge changes I0cfe4117,I3581d80d,Ida62c941
* changes:
  Split dsp/x86/inv_txfm_sse2.c
  Update highbd idct functions arguments to use uint16_t dst
  Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct
2017-05-08 16:15:57 +00:00
Marco Paniconi f4653c1efc Merge "vp9: SVC: Set downsample filtertype for lowest spatial layer." 2017-05-06 02:31:00 +00:00
Marco 9b729748ac vp9: SVC: Set downsample filtertype for lowest spatial layer.
For lowest spatial layer, in 3 layer SVC, set the
downsampling filtertype to get averaging filter.
Needed for reducing aliasing on low-res layer,
small increase in overall encoder time.

Change-Id: Ia31460123bd91b72eca49b46dd924b9f226d4563
2017-05-05 19:29:09 -07:00
Jerome Jiang 3453c8d6c4 Merge "vp9: Neon optimization for denoiser. Add unit tests." 2017-05-06 01:28:32 +00:00
Jerome Jiang 83a2bfd7dc Merge "Change target bitrate thresh in denoiser test." 2017-05-06 01:28:15 +00:00
Jerome Jiang fff358fb06 Change target bitrate thresh in denoiser test.
An intended behavior change disabling exhaustive searches in speed
feature causes VP9/DatarateTestVP9LargeDenoiser.4threads test failure.
Change the threshold to make it pass.

BUG=webm:1429

Change-Id: Ibcbe2314c6b2525799894f5d7204fc8eb4ec2a1e
2017-05-05 16:50:19 -07:00
Jerome Jiang 069eedb3a0 vp9: Neon optimization for denoiser. Add unit tests.
Denoiser on Neon is 5x faster than C code.

BUG=webm:1420

Change-Id: I805ab64f809ff2137354116be6213e7ec29c1dcb
2017-05-05 16:40:52 -07:00
Marco Paniconi a082c562d0 Merge "vp9: Adjust some thresholds for noise estimation." 2017-05-05 20:02:41 +00:00
Marco 34cce144d8 vp9: Adjust some thresholds for noise estimation.
Adjust thresholds for noise estimation, for resolutions above VGA.
Tends to push cleaner/low noise clips to LowLow state.

No change in RTC metrics.

Change-Id: I739ca6b797d0a60ccd1c6c6a2775269b1f007e5e
2017-05-05 12:00:12 -07:00
Johann Koenig 38f440120c Merge "fdct 8x8 neon: minor comment cleanup" 2017-05-05 18:22:45 +00:00
Jerome Jiang af69ed20c4 vp9: Enable noise estimation on low res.
Set noise level to kLowLow for high motion low res clips.
Change the normalization in noise metric for low res.
Reduce the initial time-window for all resolutions.

Change-Id: Iaed39dbb50b205cd9c735dc5b84822304fb01987
2017-05-04 15:38:23 -07:00
Johann 2346a6da4a subpel variance neon: add mixed sizes
Add support for everything except block sizes of 4.

Performance is better but numbers will improve again when the variance
optimizations land.

BUG=webm:1423

Change-Id: I92eb4312b20be423fa2fe6fdb18167a604ff4d80
2017-05-04 15:30:01 -07:00
Johann 19e1ec8359 sub pixel variance neon: use generic variance
When a neon version is available it will be called. This allows
decoupling the variance implementations and has no real downside. For
most configurations, the call will be #define'd to the neon
implementation.

Change-Id: Ibb2afe4e156c5610e89488504d366b3e6d1ba712
2017-05-04 15:30:01 -07:00
Johann 462e29703c fdct 8x8 neon: minor comment cleanup
Simplify HBD/non distinction in test.

Document why transpose_neon.h is not used

Change-Id: I17659414206ddbb8c2f1ef0d9f4a17f1745d5a52
2017-05-04 15:14:23 -07:00
Johann d6a7489dd5 neon variance: process two rows of 8 at a time
When the width is equal to 8, process two rows at a time. This doubles
the speed of 8x4 and improves 8x8 by about 20%.

8x16 was using this technique already, but still improved a little bit
with the rewrite.

Also use this for vpx_get8x8var_neon

BUG=webm:1422

Change-Id: Id602909afcec683665536d11298b7387ac0a1207
2017-05-04 08:59:46 -07:00
Johann cb9133c72f neon variance: add small missing sizes
Some of the mixed sizes were missing. They can be implemented trivially
using the existing helper function.

When comparing the previous 16x8 and 8x16 implementations, the helper
function is about 10% faster than the 16x8 version. The 8x16 is very
close, but the existing version appears to be faster.

BUG=webm:1422

Change-Id: Ib0e856083c1893e1bd399373c5fbcd6271a7f004
2017-05-04 08:59:42 -07:00
Yi Luo a24e1e8027 Merge "High bit depth inter prediction horizontal/vertical filters AVX2" 2017-05-04 15:43:21 +00:00
Linfeng Zhang 2231669a83 Split dsp/x86/inv_txfm_sse2.c
Spin out highbd idct functions.

BUG=webm:1412

Change-Id: I0cfe4117c00039b6778c59c022eee79ad089a2af
2017-05-03 15:43:02 -07:00
Linfeng Zhang d5de63d2be Update highbd idct functions arguments to use uint16_t dst
BUG=webm:1388

Change-Id: I3581d80d0389b99166e70987d38aba2db6c469d5
2017-05-03 13:59:16 -07:00
Linfeng Zhang 081b39f2b7 Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct
BUG=webm:1388

Change-Id: Ida62c941f2b836d6c9e27b427a7d5008ab6dc112
2017-05-03 13:58:31 -07:00
Hui Su 5048d6e7ee Merge "vp9 level: add tentative max cpb values for high levels" 2017-05-03 20:51:03 +00:00
Hui Su f701a44305 Merge "Adjust alt-ref selection in define_gf_group()" 2017-05-03 20:50:29 +00:00
Yi Luo a3452996a1 High bit depth inter prediction horizontal/vertical filters AVX2
User level speed improvement on i7-6700, cpu-used=1,
  x86_64 Linux, bitrate, 1080p, 8Mbps, 4K, 16Mbps:
- Decoder:
  1080p: ~4%
  4K: ~5%
- Encoder:
  1080p: ~1%
  4K: ~3%

Change-Id: I51b48f9c5de0d62487d5a11aa579c97bd03dd640
2017-05-03 12:18:01 -07:00
Linfeng Zhang a10a5cb356 Merge changes I8bb660de,Ica51d780,I6037525d
* changes:
  Clean specializes of idct functions
  Clean add_protos of highbd idct functions
  Clean add_protos of idct functions
2017-05-03 19:17:55 +00:00
James Zern 5599e4275a Merge changes Ia5293d94,I90d481d3,Ia509d622,I54549b03,I89b635d6
* changes:
  ppc: Add convolve8_vsx and convolve8_avg_vsx
  ppc: Add convolve8_avg_vert_vsx
  ppc: Add convolve8_vert
  ppc: Add convolve8_horiz_avg
  ppc: Add convolve8_horiz
2017-05-03 03:31:19 +00:00
Luca Barbato e2ad89092d ppc: Add convolve8_vsx and convolve8_avg_vsx
Change-Id: Ia5293d948003a7fff5a7cbad6e83d8a72717c857
2017-05-02 20:27:47 -07:00
Luca Barbato e6ca81ee67 ppc: Add convolve8_avg_vert_vsx
Only the generic one again, speedups for 8x8 and larger blocks to
come later.

Change-Id: I90d481d3a602d1e277ead8f3934eca126b86b72d
2017-05-02 20:27:42 -07:00
Luca Barbato a65f1771ad ppc: Add convolve8_vert
Only the generic one again, speedups for 8x8 and larger blocks
to come later.

Change-Id: Ia509d6225984b4930ec03928c9bcbf51486da99f
2017-05-02 20:27:33 -07:00
Luca Barbato 77772350f3 ppc: Add convolve8_horiz_avg
The 8x8 and larger blocks cases can be sped up further.

Change-Id: I54549b03ac6c7a4e3f485738b100c3cac7ac2e15
2017-05-02 20:27:28 -07:00
Luca Barbato 08edb85bd0 ppc: Add convolve8_horiz
The 8x8 and larger blocks cases can be sped up further.

Change-Id: I89b635d6b01c59f523f2d54b1284ed32916c5046
2017-05-02 20:27:16 -07:00
Linfeng Zhang 0178d974e5 Clean specializes of idct functions
Change-Id: I8bb660de47b5f97263ec381dc428db96e9c9a4b2
2017-05-02 18:01:19 -07:00
Linfeng Zhang 4412996d59 Clean add_protos of highbd idct functions
Change-Id: Ica51d780b92b316ce9112740c56cdf7670816371
2017-05-02 17:59:38 -07:00
Linfeng Zhang a7a57d9756 Clean add_protos of idct functions
Change-Id: I6037525d92ec172810edab720389eb1865ed3b1a
2017-05-02 17:58:40 -07:00
Johann Koenig 240a5a15ef Merge "block error sse2: sum in 32 bits when possible" 2017-05-02 14:16:47 +00:00
Johann cd94d5f68e block error avx2: rename variables
Change-Id: I2b8a9253f2c3d1fd85304c2970ebe70213870fe9
2017-05-01 17:54:29 -07:00
Johann Koenig b1a31f8066 Merge "block error avx2: sum in 32 bits when possible" 2017-05-02 00:52:59 +00:00
Marco Paniconi 1e112bce37 Merge "vp9: SVC: Early exit on golden ref in non-rd pickmode." 2017-05-01 21:04:52 +00:00
Linfeng Zhang e8655d49f5 Merge "Clean vp9_highbd_build_inter_predictor() and highbd_inter_predictor()" 2017-05-01 19:54:40 +00:00
Johann Koenig 3d33a462b3 Merge "move vp9_error_intrin_avx2.c" 2017-05-01 19:52:36 +00:00
Kyle Siefring 760c214519 block error avx2: sum in 32 bits when possible
Add 31bit pairs before unpacking in x86 block error code

AVX2 code provides a very minor performance improvement.

BUG=webm:1210

Change-Id: I4c82308eaf65741dca2f5c6db9be9c85f905073a
2017-05-01 12:51:33 -07:00
James Zern ee3df31d74 Merge "vpx_scale_test: fix segfault on alloc failure" 2017-05-01 19:22:22 +00:00
Marco ae0215f945 vp9: SVC: Early exit on golden ref in non-rd pickmode.
For SVC 1 pass real-time: add condition to skip the
golden (spatial) reference mode in non-rd pickmode.
Condition is to skip golden if the sse of zeromv-last mode
is below threshold. And change order in ref_mode_set_svc
to make sure golden zeromv is tested after last-nearest.

Speedup ~3-4% with little/negligible quality loss.

Change-Id: I6cbe314a93210454ba2997945f714015f1b2fca3
2017-05-01 10:36:54 -07:00
Kyle Siefring 8394990b27 block error sse2: sum in 32 bits when possible
Add 31bit pairs before unpacking in x86 block error code

BUG=webm:1210

Change-Id: I5ca8c7f7775585a17fe09d6bbfc25e1f2955eb0a
2017-05-01 09:59:18 -07:00
Johann 2ff01aa1e4 move vp9_error_intrin_avx2.c
There is only one avx2 implementation. Drop '_intrin'

Change-Id: I887a0d27d58567eaad49f749f127eca61313f312
2017-05-01 09:13:01 -07:00
James Zern 2930903d51 vpx_scale_test: fix segfault on alloc failure
check the return of ResetImage() before continuing

Change-Id: Iff0b038f7b9761113b8cf33a511a5306640d1273
2017-04-29 13:12:53 -07:00
Luca Barbato d51d3934f5 ppc: Add convolve_avg
Change-Id: Ib203c444c708f42072e38301ee3db97b5b53d014
2017-04-29 15:47:25 +02:00
Luca Barbato 63860ba7b8 ppc: Add convolve_copy
Change-Id: Ie26d6dbe090e711d84bac01ba7da270db983f405
2017-04-29 15:47:25 +02:00
Johann Koenig ef5918098d Merge "Use uint32_t for accumulator" 2017-04-28 18:32:09 +00:00
Jerome Jiang ce2e278059 Merge "vp9: Fix condition for disabling adaptive_rd_thresh." 2017-04-28 18:10:36 +00:00
Jerome Jiang 04de501229 vp9: Fix condition for disabling adaptive_rd_thresh.
Add speed constraints for disabling adaptive_rd_thresh when
row_mt_bit_exact is set.

Change-Id: I2445115c2f9a2e46b8a0966031a0fea488d4964e
2017-04-28 10:26:20 -07:00
Jerome Jiang bea27a5809 Merge "Generalize vp9 sse2 denoiser test for other platforms." 2017-04-28 15:45:52 +00:00
Johann 657f3e9f14 Use uint32_t for accumulator
Be specific about the data type size.

Use convenience macro vp9_zero_array.

Change-Id: I5fadf7dbd408befb73820d85db0be4832e8cfcbd
2017-04-28 06:36:59 -07:00
Johann Koenig 94ebdba71d Merge "vp9 temporal filter: sse4 implementation" 2017-04-28 13:22:41 +00:00
Jerome Jiang 26aebd77b8 Generalize vp9 sse2 denoiser test for other platforms.
Renamed to vp9_denoiser_test.

Change-Id: I0d8f4c94bcb81a60949a13d9fe839cee95d03f77
2017-04-27 22:47:41 -07:00
Yaowu Xu 0e8fea6c13 Merge "VP9: enable trellis for high bitdepth intra" 2017-04-28 00:16:56 +00:00
James Zern ef15d38df0 Merge "webm_read_frame: avoid NULL dereference" 2017-04-27 21:47:10 +00:00
Johann 6dfeea6592 vp9 temporal filter: sse4 implementation
Approximates division using multiply and shift.
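
The general idea of replacing a division with a multiply and a shift can be sketched as follows. This is not the SSE4 code from the change: the 32-bit scaling, the operand ranges and the function names are assumptions chosen so that the scalar example gives the same result as integer division for 16-bit inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Precompute floor(2^32 / d) + 1 for a divisor d in [1, 65535]. */
static uint64_t reciprocal_u16(uint32_t d) {
  assert(d >= 1 && d <= 65535);
  return ((uint64_t)1 << 32) / d + 1;
}

/* Computes x / d for x in [0, 65535] with one multiply and one shift,
 * using the reciprocal precomputed for d. */
static uint32_t div_by_reciprocal(uint32_t x, uint64_t recip) {
  assert(x <= 65535);
  return (uint32_t)((x * recip) >> 32);
}
```

The reciprocal is computed once per divisor, so the per-sample cost is a multiply and a shift, which map well onto SIMD instructions that have no integer divide.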

Speeds up both sizes (8x8 and 16x16) by 30 times.

Fix the call sites to use the RTCD function.

Delete sse2 and mips implementation. They were based on a previous
implementation of the filter. It was changed in Dec 2015:
ece4fd5d22

BUG=webm:1378

Change-Id: I0818e767a802966520b5c6e7999584ad13159276
2017-04-26 22:03:05 -07:00
Jerome Jiang 43e0e082d1 vp9: Don't force disabling of adaptive_rd_thresh for realtime.
Don't force disabling of adaptive_rd_thresh for realtime when
row_mt_bit_exact is set.

Row based adaptive rd is made usable in CL
454882(https://chromium-review.googlesource.com/c/454882) for REALTIME.

Change-Id: Ief023414f0fd6eb86f299dd46ae58f4436875af5
2017-04-26 13:17:57 -07:00
Yunqing Wang b68f14d0ed Merge "Make the row based multi-threaded encoder deterministic" 2017-04-26 16:12:14 +00:00
Linfeng Zhang 54c4e0f7a5 Merge "Update highbd convolve functions arguments to use uint16_t src/dst" 2017-04-26 15:50:46 +00:00
Marco Paniconi 004fab120a Merge "vp9: SVC: Adjust some speed settings for temporal layers." 2017-04-26 15:45:06 +00:00
Peter de Rivaz 66117b97c5 VP9: enable trellis for high bitdepth intra
BUG=webm:1409

Change-Id: I5236595aac1c09386c60ffe8ad621e01422ed5a7
2017-04-26 11:43:01 +01:00
hui su d01c9febe9 vp9 level: add tentative max cpb values for high levels
Add tentative max cpb size values for levels 5.2 and up. Otherwise
encoding will fail when targeting these levels.

Change-Id: Ib7e0ba4b9836ea1ac900b6822543812843d48463
2017-04-25 18:03:55 -07:00
hui su 8069f31076 Adjust alt-ref selection in define_gf_group()
107de19698 changes the encoder alt-ref selection behavior. Assuming
min_gf_interval = max_gf_interval = 4, the frame order would be
frm_1  arf_1  frm_2  frm_3  frm_4  frm_5  arf_2 before 107de19698;
frm_1  arf_1  frm_2  frm_3  frm_4  arf_2  frm_5 after 107de19698.

This patch reverts such alt-ref placement change.

Change-Id: I93a4a65036575151286f004d455d4fcea88a1550
2017-04-25 18:03:47 -07:00
Jerome Jiang 15ee8a8c45 Merge "Fix the decoder seg fault when frame is corrupted." 2017-04-26 00:09:29 +00:00
Jerome Jiang 997e54ea43 Merge "vp9: speed >= 8: Skip uv variance in model_rd_sb_y_large" 2017-04-26 00:09:22 +00:00
Marco c614164cb6 vp9: SVC: Adjust some speed settings for temporal layers.
Make some speed setting changes for temporal enhancement layers,
and remove the switch in subpel_force_stop for the aggressive_base_mv
in non-rd pickmode.

Gain some 2-3% speed with little/negligible quality loss.

Change-Id: I3e2a7f80ff45f38c0a6ceb01b34dbca2f53edbf0
2017-04-25 16:27:01 -07:00
Jerome Jiang 69b0242e9a vp9: speed >= 8: Skip uv variance in model_rd_sb_y_large
For speed >= 8 and color_sensitivity not set, skip the transform
skipping test in UV planes.
Add a new condition to check noise level to skip chroma check
for speed >= 8 if y_sad is high.

1~2% speedup on ARM for speed 8.

Borg tests show neutral results in both rtc and rtc_derf.

Change-Id: Idecd3ff6e28c97757a43bb6f3a7082c85f72109c
2017-04-25 16:21:36 -07:00
Linfeng Zhang 4758d20227 Clean vp9_highbd_build_inter_predictor() and highbd_inter_predictor()
BUG=webm:1388

Change-Id: I7ee32e0c08f0fb41712a8cc640b2c5bba872421d
2017-04-25 14:32:20 -07:00
Linfeng Zhang 51dc998f3a Update highbd convolve functions arguments to use uint16_t src/dst
BUG=webm:1388

Change-Id: I6912de2639895d817ce850da8ea9f6c8fe21da42
2017-04-25 14:22:19 -07:00
James Zern 0be513e8e8 webm_read_frame: avoid NULL dereference
block may be NULL with block_entry_eos or from return of GetBlock()

Change-Id: Ia0dd3ffa46305ee70efcdc55c05c2ad24efc993b
2017-04-25 12:34:23 -07:00
Marco 92ec0674fd vp9; Reduce artifact in non-rd pickmode for lighting changes.
Add a low-variance high-sumdiff to the superblock content state
and use it to limit the mv and bias some decisions in non-rd pickmode.
Only affects speed >= 6.

Reduces artifact for lighting changes.
Small/no difference in metrics on RTC set.

Change-Id: Ic84b2379fe0ae3fa71ae826ee6bae3eaf551a25b
2017-04-24 17:08:43 -07:00
Yunqing Wang 10a497bd38 Make the row based multi-threaded encoder deterministic
This patch followed allow_exhaustive_searches feature modification and
continued to modify the encoder to achieve the determinism in the row
based multi-threaded encoding. While row-mt = 1 and using multiple
threads, the adaptive feature in encoder was disabled, which gave
BDRate gain (at speed 1, -0.6% ~ -0.7%; at speed 2, -0.46% ~ -0.59%),
but some encoder speed losses (7% ~ 10% at speed 1 and 3% ~ 6% at
speed 2). These speed losses were acceptable considering the speed
gains obtained from row-mt.

Change-Id: I60d87a25346ebc487a864b57d559f560b7e398bb
2017-04-24 16:28:27 -07:00
Yunqing Wang c530208ae3 Merge "Make allow_exhaustive_searches feature no longer adaptive" 2017-04-24 17:41:10 +00:00
Marco Paniconi b35f64241f Merge "vp9: SVC: fix condition for partition/skip threshold when denoising." 2017-04-21 21:28:17 +00:00
Yunqing Wang bca4564683 Make allow_exhaustive_searches feature no longer adaptive
A previous patch turned on allow_exhaustive_searches feature only for
FC_GRAPHICS_ANIMATION content. This patch further modified the feature
by removing the exhaustive search limit, and made it no longer adaptive.
As a result, the 2 counts that recorded the number of motion searches
were removed, which helped achieve the determinism in the row based
multi-threading encoding. Tests showed that this patch didn't make
the encoder much slower.

Used exhaustive_searches_thresh for this speed feature, and removed
allow_exhaustive_searches. Also, refactored the speed feature code
to follow the general speed feature setting style.

Change-Id: Ib96b182c4c8dfff4c1ab91d2497cc42bb9e5a4aa
2017-04-21 11:14:02 -07:00
Jerome Jiang 58fe1bde59 Merge "vp9: Non-rd pickmode: Avoid computation duplication." 2017-04-21 00:51:47 +00:00
Marco 5de0e9ed08 vp9: SVC: fix condition for partition/skip threshold when denoising.
The more aggressive settings should only be used when denoise_svc
condition is satisfied (which means top spatial layer).

Change-Id: Ia8e3515b27f31bf21b1976ca80a2fa826daece3a
2017-04-20 16:36:55 -07:00
Jerome Jiang 7ae1e321a1 vp9: Non-rd pickmode: Avoid computation duplication.
In non-rd pickmode (speed >= 5), avoid duplication of computations in
model_rd_for_sb_y when the speed feature use_simple_block_yrd is
enabled (or for high bitdepth build under certain conditions).

QVGA, VGA and HD have 1.23%, 2.68% and 1.7% speedup on ARM for speed 8,
respectively.

Encoding results are bitexact for speed >= 5.

Change-Id: I3f9130810c21439f5ad7e159e21cb2243dcd05f1
2017-04-20 16:20:59 -07:00
Jerome Jiang 25c1bada72 Fix the decoder seg fault when frame is corrupted.
BUG=webm:1399

Change-Id: I1e006e0260d9b56a4d2273659ca19b86c69c474b
2017-04-20 14:55:42 -07:00
Marco 29938b3a5a vp9: 1 pass SVC: Fix comment and condition for up-sampling reference.
No change in behavior.

Change-Id: I218fb30289091da623acb23324027435b8510d0e
2017-04-20 14:21:05 -07:00
Yunqing Wang 30ef50b522 Merge "Only allow allow_exhaustive_searches for FC_GRAPHICS_ANIMATION content" 2017-04-20 19:57:46 +00:00
Marco Paniconi 17559cd8b5 Merge "vp9: Re-enable SVC datarate tests." 2017-04-20 19:53:20 +00:00
Marco 85ca2e8a8b vp9: Re-enable SVC datarate tests.
Re-enable the SVC tests, wrap the non-zero expectation
in GetMismatchFrames around #if CONFIG_VP9_DECODER.

Change-Id: I0e8a2d78b868c32f18fe597540f397d3a1b303b5
2017-04-20 12:08:08 -07:00
Marco 3134a52d26 vp9: SVC: Redefine the source downsample filter choice.
Rename the source downsampling filter, and define it
per spatial layer. Used for 1 pass CBR SVC.

Change-Id: I8135f2ab89c535c53429b9c58b586f746bb668c7
2017-04-20 10:17:13 -07:00
Luca Barbato 8975436466 ppc: Add the intra predictor tests
Change-Id: Idea15b916044ab3d8e74519337880a484ecfd87e
2017-04-19 20:21:40 -07:00
Luca Barbato 914b160fb5 ppc: h predictor 8x8
Slightly faster with the current compiler.

Change-Id: Iae225fac08395eb430c97a2abec69c60f5cf5c47
2017-04-19 19:57:51 -07:00
Luca Barbato 0b9be93205 ppc: d63 predictor 8x8
10x faster.

Change-Id: I7cedbf4df2ce7df5b6f1108b11815d088fdb9ba8
2017-04-19 19:57:51 -07:00
Luca Barbato ee9325b0bd ppc: tm predictor 4x4
Slightly faster.

Change-Id: I0ca43f309b3d9b50435d69bd5be64b53a99bd191
2017-04-19 19:57:51 -07:00
Luca Barbato 2904eb5800 ppc: h predictor 4x4
2x faster.

Change-Id: I0583dec353299c6797401b646099f18db4e0420d
2017-04-19 19:57:51 -07:00
Luca Barbato 58245d7050 ppc: dc predictor 8x8
Slightly faster, the other dc predictors cannot be faster since
the computation speedup is overwhelmed by the time spent reading
dst to write just the 8x8 part.

Change-Id: I94a0b50500adf8b7b6bb919dbf5c7adf5b9fba66
2017-04-19 19:57:51 -07:00
Luca Barbato 6b4a65e8b1 ppc: d45 predictor 8x8
11x faster.

Change-Id: I5b8f39213ee1f5260724fc254e3fb5c462435798
2017-04-19 19:57:51 -07:00
Luca Barbato 92e33c7b31 ppc: d63 predictor 32x32
About 10x faster.

Change-Id: If7d0645f75c5d7deb9751edd0bf47e2f9068e9e7
2017-04-19 19:57:51 -07:00
Luca Barbato a5469a00a8 ppc: d63 predictor 16x16
About 18x faster.

Change-Id: Id043bf76c011e03e992085bb5e20f330d3e98cd4
2017-04-19 19:57:51 -07:00
Luca Barbato cc868da526 ppc: d45 predictor 32x32
About 12x faster.

Change-Id: I22c150256aefb4941861ab1f6c17d554fb694bed
2017-04-19 19:57:51 -07:00
Luca Barbato 7a7dc9e624 ppc: d45 predictor 16x16
About 16x faster.

Change-Id: Ie5469fb32d5fd11bb6cb06318cea475d8a5b00b9
2017-04-19 19:57:51 -07:00
Luca Barbato c08baa2900 ppc: dc predictor 32x32
10x and 5x faster.

Change-Id: I7913c58c768334d818f541a5e219f1035791eeaf
2017-04-19 19:57:47 -07:00
Luca Barbato 22ca468c7c ppc: dc top and left predictor 32x32
6x faster.

Change-Id: I717995b4056e5579c68191d11b495372971fe1ae
2017-04-19 19:49:31 -07:00
Luca Barbato ad9dea1f6d ppc: dc top and left predictor 16x16
13x faster.

Change-Id: I1771ac39fda599153f933cb3f0506c9f97a6cbe6
2017-04-19 19:49:31 -07:00
Luca Barbato d68d37872c ppc: dc_128 predictor 32x32
6x faster.

Change-Id: I1da8f51b4262871cb98f0aa03ccda41b0ac2b08b
2017-04-19 19:49:31 -07:00
Luca Barbato f9d20e6df2 ppc: dc_128 predictor 16x16
20x faster.

Change-Id: I05f0deb2d38ae7966eae6b71fbc0aa51880e5709
2017-04-19 19:49:31 -07:00
Luca Barbato 0d9417de4a ppc: tm predictor 32x32
About 8x faster.

Change-Id: I9bad827ccbdf47ec95406e961c74ac2ff45f80cf
2017-04-19 19:49:26 -07:00
James Zern a81f037f15 Merge changes I1f5a3752,I95123051,I3bb724e0,Ie81077fa,Ic80f3c05, ...
* changes:
  ppc: tm predictor 16x16
  ppc: tm predictor 8x8
  ppc: horizontal predictor 32x32
  ppc: horizontal predictor 16x16
  ppc: vertical intrapred 16x16 and 32x32
  configure: Workaround clang not enabling altivec on -mvsx
  configure: Match power*64* as ppc64
2017-04-20 02:45:45 +00:00
Yunqing Wang e96e49c2f9 Only allow allow_exhaustive_searches for FC_GRAPHICS_ANIMATION content
The allow_exhaustive_searches feature improves the encoding quality
of FC_GRAPHICS_ANIMATION content a lot. For non-FC_GRAPHICS_ANIMATION
content, the quality test result is almost neutral. This patch
restricts the feature to FC_GRAPHICS_ANIMATION content.

The motivation of doing that is to make this feature no longer adaptive,
which will be implemented in the following patch.

Change-Id: Ic911df6dd757402b6480789cc247801e99840369
2017-04-20 00:03:27 +00:00
Linfeng Zhang fbbdba3b04 Merge changes I9e18a73b,Ie47c8cd4
* changes:
  Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolve
  Create CAST_TO_BYTEPTR/SHORTPTR
2017-04-19 23:55:58 +00:00
Linfeng Zhang bf8a49abbd Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolve
Replace by CAST_TO_BYTEPTR/SHORTPTR.
The rule is: if a short ptr is cast to a byte ptr, any offset
operation on the byte ptr must be doubled. We do this by casting to
short ptr first, adding offset, then casting back to byte ptr.
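
The rule can be sketched with stand-in macros as below. The SKETCH_ macro definitions and the helper name are assumptions for illustration; the actual CAST_TO_BYTEPTR/SHORTPTR definitions live in the libvpx headers and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the real macros: plain pointer casts, no address scaling. */
#define SKETCH_CAST_TO_SHORTPTR(p) ((uint16_t *)(void *)(p))
#define SKETCH_CAST_TO_BYTEPTR(p) ((uint8_t *)(void *)(p))

/* Advance a high bit depth buffer, carried as a byte pointer, by `offset`
 * 16-bit samples: cast to short ptr, add the offset, cast back. */
static uint8_t *advance_highbd(uint8_t *buf8, int offset) {
  uint16_t *buf16 = SKETCH_CAST_TO_SHORTPTR(buf8);
  return SKETCH_CAST_TO_BYTEPTR(buf16 + offset);
}
```

Adding the offset on the short pointer moves the address by twice as many bytes, which is the doubling the rule refers to.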

BUG=webm:1388

Change-Id: I9e18a73ba45ddae58fc9dae470c0ff34951fe248
2017-04-19 12:13:49 -07:00
Marco Paniconi 977356a72b Merge "vp9: Add phase to get averaging filter for 1:2 downsampling." 2017-04-19 15:27:55 +00:00
Marco f34be01190 vp9: Fix the disabling of a SVC 3TL datarate test.
Change-Id: Ib42d23ab5ee39ab3c85e1d9a84e36249e59fe74e
2017-04-19 08:01:44 -07:00
Marco 348bdc0195 vp9: Add phase to get averaging filter for 1:2 downsampling.
The scaling filter with zero shift will give sub-sampling for
2x downsampling. Allow for a phase shift to get an averaging filter.
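
The difference between the two behaviours can be sketched with a 2-tap bilinear kernel. The example assumes the phase is expressed in 1/16 sample units and uses a hypothetical function name; the encoder's own scaling filters have more taps and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 2:1 horizontal downsample with a 2-tap bilinear kernel.
 * phase is in 1/16 sample units (assumed convention): 0 copies every
 * other sample (sub-sampling), 8 averages each pair of samples. */
static void downsample2_bilinear(const uint8_t *src, size_t src_len, int phase,
                                 uint8_t *dst) {
  assert(phase >= 0 && phase < 16);
  assert(src_len % 2 == 0);
  for (size_t i = 0; i < src_len / 2; ++i) {
    const int a = src[2 * i];
    const int b = src[2 * i + 1];
    dst[i] = (uint8_t)((a * (16 - phase) + b * phase + 8) >> 4);
  }
}
```

With phase 0 the odd input samples are ignored entirely, which is where the aliasing comes from; with phase 8 each output is the mean of two neighbouring inputs.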

Usage is for source scaling in 1 pass SVC mode for 1:2 downscale.
Reduces aliasing in downsampled image.

Keep the phase to 0/off for now.

Change-Id: Ic547ea0748d151b675f877527e656407fcf4d51e
2017-04-18 16:56:15 -07:00
Luca Barbato 479443a570 ppc: tm predictor 16x16
About 10x faster.

Change-Id: I1f5a3752d346459df3b45f92963208bf3e520f06
2017-04-19 01:48:10 +02:00
Luca Barbato c8f5a55df4 ppc: tm predictor 8x8
About 5x faster.

Change-Id: I951230517f49c0dca9ac9eac2efa8916a303b85a
2017-04-19 01:48:09 +02:00
Luca Barbato 7b0e12934e ppc: horizontal predictor 32x32
About 5x faster.

Change-Id: I3bb724e07baffd901aa2d0f65060ba48882cc9b8
2017-04-19 01:48:09 +02:00
Luca Barbato a7a2d1653b ppc: horizontal predictor 16x16
About 10x faster.

Change-Id: Ie81077fa32ad214cdb46bdcb0be4e9e2c7df47c2
2017-04-19 01:48:09 +02:00
Luca Barbato 7ad1faa6f8 ppc: vertical intrapred 16x16 and 32x32
Change-Id: Ic80f3c050cfbe7697e81a311b4edaaa597b85cab
2017-04-19 01:48:09 +02:00
Luca Barbato a39b723eb3 configure: Workaround clang not enabling altivec on -mvsx
The flag `-mvsx` implies `-maltivec`.

Change-Id: I7544553eba131a533467b387f8bf329d57f5af5c
2017-04-19 01:48:04 +02:00
Luca Barbato 3252e6b63d configure: Match power*64* as ppc64
Change-Id: Ie640dff50a5db935bb57c5a2570b423ce8946f2c
2017-04-19 01:47:56 +02:00
Linfeng Zhang a02f391cbe Create CAST_TO_BYTEPTR/SHORTPTR
They will replace CONVERT_TO_BYTEPTR/SHORTPTR module by module.

BUG=webm:1388

Change-Id: Ie47c8cd4897696481b9cbbf9e2d439dc22dc85ec
2017-04-18 14:48:11 -07:00
Marco 15afee1938 vp9: Disable some SVC tests for now.
Disable the 1 pass CBR SVC tests with temporal_layers > 1.
Issue with the commit 863f860, which will cause encoder/decoder
mismatch due to skipping encoder loopfilter for non-reference frames.

Will re-enable the tests when fixed.

Change-Id: I74918a0045a17976b069c4be63fbeb921974df0d
2017-04-18 09:51:42 -07:00
Marco ad2e3598d2 vp9: Add key_frame condition to is_reference check for loopfilter.
This condition is not needed as key_frame should set the refresh
of the reference frames, but good to have for clarity in condition.

Change-Id: Icf9838e7e4f0ff5cf0a9562ae3b5d6c7e6f78702
2017-04-17 15:18:46 -07:00
Johann Koenig a6095333a7 Merge "re-enable vpx_comp_avg_pred_sse2" 2017-04-17 22:07:34 +00:00
Marco Paniconi 9aa429a66d Revert "Revert "vp9: Avoid encoder loopfilter for non-reference frames.""
This reverts commit e9b7f98c56.

Reason for revert:
Commit d578bdad fixes the issue (encoder/decoder mismatch
in 3TL datarate test) that caused the original revert.

Original change's description:
> Revert "vp9: Avoid encoder loopfilter for non-reference frames."
>
> This reverts commit 863f860bfc.
>
> This causes encoder / decoder mismatches in various
> VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayers tests
>
> BUG=webm:1408
>
> Change-Id: Ic200c39d7ed9c0b0247ef562f5d6f7b2625f7e14
>

TBR=jzern@google.com,marpan@google.com,builds@webmproject.org,jianj@google.com
BUG=webm:1408

Change-Id: Ifeb81460856d1d56482d4e0477a70ee98f8bfaa6
2017-04-17 11:02:02 -07:00
Marco d578bdad02 vp9: Datarate test: modify frame flags for 3 TL.
Modify the frame flags to update the ARF on top layer,
for the tests:
VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayers
VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayersFrameDropping

This is needed to fix the encoder/decoder mismatches caused by 863f860,
and removed in the revert e9b7f98.

Change-Id: I6b9fecfdd17315fc0179e29949338c77636026c0
2017-04-17 09:33:20 -07:00
Johann 9fa24f03b5 re-enable vpx_comp_avg_pred_sse2
Buffers on 32 bit x86 builds are only guaranteed 8 byte alignment. Fixed
with "AvgPred test: use aligned buffers" and "sad avg: align
intermediate buffer"
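
A minimal sketch of the 16-byte alignment requirement, assuming a caller that over-allocates and rounds the pointer up by hand. is_aligned16() and align_up16() are hypothetical helpers, not libvpx functions; the library has its own alignment macros.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when p sits on a 16-byte boundary, the property the SSE2
 * version needs for aligned loads and stores. */
static int is_aligned16(const void *p) { return ((uintptr_t)p & 0xf) == 0; }

/* Round p up to the next 16-byte boundary. The caller must have
 * allocated at least 15 spare bytes beyond what it intends to use. */
static uint8_t *align_up16(uint8_t *p) {
  uint8_t *q = p + ((16 - ((uintptr_t)p & 0xf)) & 0xf);
  assert(is_aligned16(q));
  return q;
}
```

A buffer that is only 8-byte aligned fails is_aligned16() half of the time, which matches the intermittent failures seen on 32 bit builds.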

Also re-enable asserts on the C version.

BUG=webm:1390

Change-Id: I93081f1b0002a352bb0a3371ac35452417fa8514
2017-04-17 08:40:43 -07:00
Johann Koenig 9e19102972 Merge "AvgPred test: use aligned buffers" 2017-04-17 15:36:41 +00:00
Johann 069b772915 sad avg: align intermediate buffer
comp_avg_pred has started declaring a requirement for aligned buffers.

BUG=webm:1390

Change-Id: Idaf6667498ea343e8d49b32bc9d8b9d0aa43ef5c
2017-04-17 14:26:33 +00:00
James Zern 4ba20da8b1 Merge "Add AVX2 optimization to copy/avg functions" 2017-04-15 00:26:08 +00:00
Yi Luo aa5a941992 Add AVX2 optimization to copy/avg functions
Change-Id: Ibcef70e4fead74e2c2909330a7044a29381a8074
2017-04-14 16:50:10 -07:00
Johann Koenig 7178e68bbe Merge "Disable vpx_comp_avg_pred_sse2" 2017-04-14 22:01:39 +00:00
Johann e3b2710b04 AvgPred test: use aligned buffers
BUG=webm:1390

Change-Id: Idb6d1ce119a09c5e7c9f3c58bbbae3de63463d1d
2017-04-14 12:49:56 -07:00
James Zern e9b7f98c56 Revert "vp9: Avoid encoder loopfilter for non-reference frames."
This reverts commit 863f860bfc.

This causes encoder / decoder mismatches in various
VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayers tests

BUG=webm:1408

Change-Id: Ic200c39d7ed9c0b0247ef562f5d6f7b2625f7e14
2017-04-14 11:50:06 -07:00
Marco 5f39262dcc vp9: Adjust speed features for speed 8 at low resoln.
For low resolutions (<= CIF): use quarter-pixel and simple_block_yrd.

~5% gain on RTC_derf.
~6-7% slowdown on ARM.

Change-Id: I4439ebd1116b9decac04786503f978840b68a60c
2017-04-14 11:35:47 -07:00
Marco Paniconi b937f1c839 Merge "vp9: SVC: fix to allow use_base_mv to be used for 3 layers." 2017-04-14 17:12:58 +00:00
Johann eaa7cdf05d Disable vpx_comp_avg_pred_sse2
Failures on windows:
unknown file: error: SEH exception with code 0xc0000005 thrown in the
test body.

Alignment check errors on linux:
test_libvpx: ../libvpx/vpx_dsp/variance.c:230: void
vpx_comp_avg_pred_c(uint8_t *, const uint8_t *, int, int, const uint8_t
*, int): Assertion `((intptr_t)comp_pred & 0xf) == 0' failed.

BUG=webm:1390

Change-Id: I5eed5381c0f1a8fe594a128eb415e77232f544ea
2017-04-14 08:43:06 -07:00
Johann Koenig bdb593ab20 Merge "vpx_comp_avg_pred: sse2 optimization" 2017-04-14 04:10:56 +00:00
Marco adb9b4eddf vp9: SVC: fix to allow use_base_mv to be used for 3 layers.
Allow use_base_mv to be used for 3 spatial layers where
base is 4x4 scale from the top layer.

Change-Id: If6641baf8b8e4d0fd5dc67619d873c6d75065f43
2017-04-13 20:43:43 -07:00
Marco Paniconi f0ccaff553 Merge "vp9: Avoid encoder loopfilter for non-reference frames." 2017-04-14 00:45:42 +00:00
Marco 6bff6cb5a9 vp9: 1 pass VBR: Fix to rate control at low min-q.
Fix to avoid getting stuck at very low Q even
though content is changing, which can happen for --min-q=0.

Fix is to more aggressively increase active_worst_quality
when detecting significant rate_deviation at very low Q.

Change will only affect 1 pass VBR for --min-q < 4, so no
change in ytlive metrics for --min-q >= 4.

Change-Id: I4dd77dd7c08a30a4390da0ff2c8bda6fccfa76d7
2017-04-13 11:44:35 -07:00
Marco 863f860bfc vp9: Avoid encoder loopfilter for non-reference frames.
Useful for SVC, where the top layer enhancement frames may
not update any reference buffers, as is the case for the
patterns in the 1 pass CBR SVC when #temporal_layers > 1.

~3% encoder speedup for SVC patterns with temporal layers
in 1 pass CBR mode.

Updated the SVC datarate tests for the mismatch frames.
Set the frame-dropper off in some tests with #temporal_layers > 1
so we can correctly set #mismatch frames. Adjusted rate target
threshold for tests where frame-dropper was turned off.

Change-Id: Ia0c142f02100be0fed61cd2049691be9c59d6793
2017-04-13 09:51:55 -07:00
Johann 28a8622143 vpx_comp_avg_pred: sse2 optimization
Provides over 15x speedup for width > 8.

Due to smaller loads and shifting for width == 8 it gets about 8x
speedup.

For width == 4 it's only about 4x speedup because there is a lot of
shuffling and shifting to get the data properly situated.

BUG=webm:1390

Change-Id: Ice0b3dbbf007be3d9509786a61e7f35e94bdffa8
2017-04-13 08:44:52 -07:00
Matt Oliver 77559e1120 Add repo readme.markdown. 2017-04-14 00:43:50 +10:00
Yunqing Wang f22b828d68 Fix an integer overflow in vp9_mcomp.c
The MV unit test revealed an integer overflow issue in vp9_mcomp.c.
This was caused if the MV was very large. In mv_err_cost(), when
mv->row = 8184, mv->col = 8184 and ref_mv is 0, mv_cost = 34363
and error_per_bit = 132412, causing the overflow.
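
The numbers quoted above can be checked with a small sketch. It assumes the overflow comes from multiplying the two values in 32-bit arithmetic; the helper names are hypothetical and this is not the actual mv_err_cost() code.

```c
#include <assert.h>
#include <stdint.h>

/* Widen before multiplying so the product is computed in 64 bits. */
static int64_t cost_product_64(int mv_cost, int error_per_bit) {
  return (int64_t)mv_cost * error_per_bit;
}

/* Nonzero when v can be stored in a 32-bit signed integer. */
static int fits_in_int32(int64_t v) {
  return v >= INT32_MIN && v <= INT32_MAX;
}

/* Narrow back to 32 bits only after checking the range. */
static int32_t narrow_to_int32(int64_t v) {
  assert(fits_in_int32(v));
  return (int32_t)v;
}
```

For mv_cost = 34363 and error_per_bit = 132412 the product is 4550073556, which is more than twice INT32_MAX, so the multiplication has to be done in a wider type before any rounding shift is applied.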

BUG=webm:1406

Change-Id: I35f8299f22f9bee39cd9153d7b00d0993838845e
2017-04-10 18:09:50 -07:00
Jerome Jiang 2420f44342 Merge "vp9: speed >= 8: Adjust speed settings on ARM." 2017-04-11 00:45:21 +00:00
Jerome Jiang f16f08e55b vp9: speed >= 8: Adjust speed settings on ARM.
Set adaptive_rd_thresh to 2 when simple block yrd is not used.

Fix regression caused by computing y sad without
int_pro_motion_estimation on low res motion clips.

Overall 0.07% quality loss on rtc_derf.

Change only affects low res on speed 8.

Change-Id: Ic6a188a56529f1034d6431005fb4b0e24e8a7e27
2017-04-11 00:26:56 +00:00
Marco 6557baf336 vp9: 1 pass CBR: avoid nonrd_pick_partition on segment.
For speed 5, 1 pass CBR: Don't use the nonrd_pick_partition
on the segment, rather use choose_partitioning followed by
nonrd_select_partition (as is done on base segment).

Little/no quality loss on RTC and RTC_derf (< 0.3%),
speedup of at least 5%.

Change-Id: I5273d5f950e60adf5e437b4ca8c4f63964641e83
2017-04-10 15:02:49 -07:00
Marco Paniconi ff1fef9607 Merge "vp9: Fix to noise estimation for temporal denoising." 2017-04-07 17:13:22 +00:00
Yunqing Wang f496032686 Merge "VP9 motion vector unit test" 2017-04-07 16:46:22 +00:00
Marco 349c3118bd vp9: Fix to noise estimation for temporal denoising.
If the noise estimation is avoided due to large motion,
the last_source for denoising should still be updated.

Change-Id: I67155ea7dbe9ac2785978e64a27bdafd7d57aac0
2017-04-07 09:23:30 -07:00
Marco 18b54ef468 vp9: Adjust consec_zeromv threshold for aq-mode=3.
To reduce refresh on partial super-blocks on boundary,
for noisy input. Reduces some artifacts on noisy input.

Change-Id: I10b5808a296874e08c7f378b3df58466591d8dbe
2017-04-07 08:54:09 -07:00
James Zern 04e9456567 Merge changes from topic 'Wshorten'
* changes:
  configure: enable -Wshorten-64-to-32 for hbd
  vp9_encodeframe: resolve -Wshorten-64-to-32 in hbd
  Resolve -Wshorten-64-to-32 in highbd variance.
2017-04-07 07:32:14 +00:00
Jerome Jiang 6af42f5102 Merge "Fix compile warnings with enable-internal-stats flag." 2017-04-07 03:34:55 +00:00
Jerome Jiang b82b574e76 Fix compile warnings with enable-internal-stats flag.
BUG=webm:1402

Change-Id: Ibe9ecb1b559a4b989f6ccedbd097e369f6edde1e
2017-04-06 14:00:01 -07:00
Marco 3227a9be5f vp9: Move the denoising condition for speed 5.
Move the condition for effectively disabling the denoising
for speed 5 into the vp9_denoiser_denoise().

This is cleaner, and also moving the condition into vp9_denoiser_denoise
will keep the denoiser buffer updated with the current source.
This allows for more consistent behavior if speed is changed midstream.

Change-Id: Ia001f591c56e454bf724c3ae73c024badb183ef8
2017-04-06 11:03:04 -07:00
Jerome Jiang c9fbb1881a Merge "vp9: speed 8: Compute y sad without int_pro_motion_estimation." 2017-04-06 02:57:16 +00:00
Jerome Jiang 705fc9f107 Merge "Refactor: Clean memory allocation for copy partition." 2017-04-06 02:57:08 +00:00
Yunqing Wang 1aa46abbdf VP9 motion vector unit test
To prevent the motion vector out of range bug, added a motion vector unit
test in VP9. In the 4k video encoding, always forced to use extreme motion
vectors and also encouraged to use INTER modes. In the decoding, checked if
the motion vector was valid, and also checked the encoder/decoder mismatch.

The tests showed that this unit test could reveal the issue we saw before.

Change-Id: I0a880bd847dad8a13f7fd2012faf6868b02fa3b4
2017-04-06 00:50:56 +00:00
James Zern 7ac3acf762 configure: enable -Wshorten-64-to-32 for hbd
Change-Id: I4041f3fc4aabc7a2251c44c75a477b659284c3cf
2017-04-05 17:34:06 -07:00
James Zern b3e2eb14c5 vp9_encodeframe: resolve -Wshorten-64-to-32 in hbd
In vp9_high_get_sby_perpixel_variance the variance operated on is
already in 32 bits.

Change-Id: I97006eb9c08dbd0f88ee35e1a1ca205737508296
2017-04-05 17:34:06 -07:00
James Zern 47b9a09120 Resolve -Wshorten-64-to-32 in highbd variance.
For 8-bit the subtrahend is small enough to fit into uint32_t.

This is the same that was done for:
c0241664a Resolve -Wshorten-64-to-32 in variance.

For 10/12-bit apply:
63a37d16f Prevent negative variance

Change-Id: Iab35e3f3f269035e17c711bd6cc01272c3137e1d
2017-04-05 17:34:02 -07:00
Jerome Jiang 288d73c861 vp9: speed 8: Compute y sad without int_pro_motion_estimation.
Little change in overall PSNR in rtc. 2-4% speedup on VGA on ARM.

Change-Id: I3395806d7afd456deacd4077c330adca13ab0645
2017-04-05 17:25:47 -07:00
Marco Paniconi 511a207444 Merge "vp9: Temporal denoising: avoid denoising for speed <= 5." 2017-04-06 00:25:45 +00:00
Marco 2136de9374 vp9: Temporal denoising: avoid denoising for speed <= 5.
Temporal denoiser runs in non-rd pickmode, so it is only used
for speed >= 5. Regression exists for speed 5, due to use of
reference_partition (which uses non-rd pickmode for partitioning).
Avoid denoising for now at speed 5.

Change-Id: I74a74d2e1404d7cfd33dcf4ec06dd2e503256cf0
2017-04-05 16:43:39 -07:00
Jerome Jiang 58ba880b94 Refactor: Clean memory allocation for copy partition.
Move the memory allocation from setting speed features.

Change-Id: I2e89dfaeb46daee63effe5a5df62feed732aa990
2017-04-05 15:33:24 -07:00
Linfeng Zhang 6fc2e57c2c Update 32x32 high bitdepth idct NEON optimization
Preparation of CONVERT_TO_BYTEPTR/SHORTPTR clean up.

BUG=webm:1388

Change-Id: I928d30a5698023bb90888d783cf81c51ec183760
2017-04-05 15:28:11 -07:00
Jerome Jiang fb60204d4c vp9: Remove legacy comments for avg_source_sad.
Change-Id: Ia6e8614535a097f17f37fc382cef8e22e03b70f6
2017-04-04 16:28:27 -07:00
Marco 8097b49997 vp9: Adjust condition of golden update with cyclic refresh.
Base the low_content_frame metric on the motion vectors,
and adjust the logic for preventing golden update.

Small change in behavior: small positive gain (~0.2-1%) on clips
with high activity.

Change-Id: I0b861c8e9666cd82b45cde5ee57ee8a1e5ab453c
2017-04-04 09:55:24 -07:00
Marco 6b3f4bc794 vp9: 1 pass CBR: cleanup to cyclic refresh.
Code cleanup: merged two functions that were doing postencode
update for cyclic refresh, removed some unused code and fixed comments.

No change in behavior.

Change-Id: I9be0d7e346d34dec29bf4e5bb380a7bf81c8480a
2017-04-03 16:37:45 -07:00
Matt Oliver 86db6f4c66 project: Convert to using custom yasm build customization. 2017-04-04 04:30:22 +10:00
Yunqing Wang 41fac44707 Merge "Fix for out of range motion vector bug in sub-pel motion estimation" 2017-04-03 18:27:57 +00:00
Marco Paniconi 9d403d6f48 Merge "vp9: SVC: Fix issue with artifact for svc-denoising." 2017-04-03 16:23:25 +00:00
Ranjit Kumar Tulabandu bf15ca1091 Fix for out of range motion vector bug in sub-pel motion estimation
BUG=webm:1397

(yunqingwang)
To verify that this patch wouldn't cause much performance change,
the Borg tests were run. Here was the result:
       avg_psnr   overall_psnr  ssim
hdres: -0.002     0.006         0.013
midres:   0         0             0
lowres:   0         0             0

Change-Id: Iae395ae7b741e0513cf5bab9dcace110b792a67d
2017-04-03 16:16:49 +00:00
Yunqing Wang 002cf38837 Merge "Enhance the row mt sync read to accept the sync_range greater than 1" 2017-04-03 15:59:51 +00:00
Matt Oliver 83ca90e070 asm: Update to support compilation with nasm. 2017-04-03 18:53:36 +10:00
Yunqing Wang f1600db3e4 Enhance the row mt sync read to accept the sync_range greater than 1
The row mt sync read uses sync_range = 1, and wouldn't work if we want
to use a sync_range that is greater than 1. To make it work, this sync
read code is modified. Pass in col instead of col - 1 to make it
consistent with other row mt code in VP9, and then add 1 in "while"
condition.

Change-Id: I4a0e487190ac5d47b8216368da12d80fec779c1a
2017-03-31 10:48:38 -07:00
Marco c824eda6cc vp9: SVC: Fix issue with artifact for svc-denoising.
Issue/bug happens for denoising with spatial layers, where
the golden (spatial) reference is used in pickmode, but
denoising is only done wrt last (temporal).

Fix is to make sure set_ref_ptrs is set before build predictors
in denoiser.

Change-Id: I793cf441341edf7c4a88b8ab1e1b22b3cb0eb508
2017-03-31 10:05:32 -07:00
James Zern 8e7c5a3c8d Merge changes from topic 'rm-dec-frame-parallel'
* changes:
  vpxdec: silently ignore -frame-parallel
  vp9: make VPX_CODEC_USE_FRAME_THREADING a no-op
2017-03-30 19:07:44 +00:00
James Zern 6fed5692d2 vpxdec: silently ignore -frame-parallel
BUG=webm:1395

Change-Id: Ibf47cc931e51b71e49067c6d7b7a39ab57c11c96
2017-03-29 23:39:12 -07:00
James Zern 01d23109ab vp9: make VPX_CODEC_USE_FRAME_THREADING a no-op
this is unmaintained and due to be removed

BUG=webm:1395

Change-Id: Iaffa6aa057c820fd1a182b93ebb45d4286e1306e
2017-03-29 23:38:49 -07:00
James Zern 7fde349b25 rtcd,unit tests: fix ppc64 build
match ppc* in rtcd to ensure vsx is included

Change-Id: I331a5d35e7160eeb69ebd14b98ba03ec5be6c600
2017-03-29 22:51:22 -07:00
Marco fc83fcb7c4 vp9: SVC: fix to allow output of denoised result.
Change-Id: Iaf55cfb5e9621d074eb33d6a32f184e4777968f8
2017-03-29 14:02:54 -07:00
Marco 32b3d2f174 vp9: 1 pass SVC: Modify condition for intra-mode search.
Temporary override to condition for disallowing intra-search in SVC,
since golden (spatial) reference is currently suppressed due to
artifact issue.

Change-Id: I28ed7fdddc9fcdbcc0a4175a247a3ecc94c11767
2017-03-29 09:24:50 -07:00
James Zern fe432cacf8 Merge "rate_hist: add parameter validation" 2017-03-29 02:41:07 +00:00
James Zern ddd75573a7 Merge changes from topic 'sync-highbd-intrapred'
* changes:
  intrapred: sync highbd_d135_predictor w/d135_
  intrapred: specialize highbd 4x4 predictors
  intrapred: rename d63f to d63e
  remove CONFIG_MISC_FIXES
2017-03-29 02:39:21 +00:00
James Zern 3cc57e67a9 Merge ".mailmap: add an additional entry for Yaowu Xu" 2017-03-29 01:56:57 +00:00
Johann Koenig eec92e8a5b Merge "vpx_comp_avg_pred: add test" 2017-03-28 21:50:01 +00:00
Johann 6e99ed72a5 vpx_comp_avg_pred: add test
BUG=webm:1389

Change-Id: I23cd65f1939db026958ccb5d70b8c5cc9aa5bc51
2017-03-28 14:11:14 -07:00
Marco 0169a985d9 vp9: Speed >= 8: avoid chroma check under some condition.
For non-rd variance partition, avoid the chroma check
unless y_sad is below some threshold.

Small decrease in avgPSNR (~0.3) on RTC set.
Small/negligible decrease on RTC_derf.

Change-Id: I7af44235af514058ccf9a4f10bb737da9d720866
2017-03-27 13:18:21 -07:00
Marco 66c6b4d6fc vp9: 1 pass: Move source sad computation into encodeframe loop.
Refactor to split the 1 pass source sad computation into scene
detection (currently used for VBR and screen-content mode), and
superblock based source sad computation (used in non-rd CBR mode).

This allows the source sad computation for CBR mode to be
multi-threaded.

No change in compression.

Change-Id: I112f2918613ccbd37c1771d852606d3af18c1388
2017-03-27 11:11:05 -07:00
James Zern aefc1088a2 intrapred: sync highbd_d135_predictor w/d135_
previously:
05437805f intrapred/d135: flatten border results before storing

BUG=webm:1316

Change-Id: I3b8bd89117ad7f2f4560b57f7c148da781e86f85
2017-03-24 20:45:44 -07:00
James Zern 67cde46dd7 intrapred: specialize highbd 4x4 predictors
d207/d63/d45/d117/d135/d153

~9-45% better depending on the predictor on 32-bit ARM, similar range on
x86-64

this matches the non-highbitdepth implementation

BUG=webm:1316

Change-Id: Iddebdf7c58c6f31c47cae04da95c6e5318200e4c
2017-03-24 20:45:36 -07:00
James Zern e05f4cf8f4 intrapred: rename d63f to d63e
this is consistent with he/ve/d45e

Change-Id: I75641ae5667430b0ecd370db86fff6e666cb577d
2017-03-24 20:41:39 -07:00
James Zern d45617c702 remove CONFIG_MISC_FIXES
this belonged to vp10 with the changes now migrated to av1.

Change-Id: Ie30ead3e7b71f465bc14136e1b6f156ea978c43f
2017-03-24 20:41:39 -07:00
Marco 07ad5a15c2 vp9: Fix to condition on using source_sad for 1 pass real-time.
Make the source_sad feature work properly for cases of VBR or
screen_content with SVC.

Added unittest for SVC with screen-content on.

Change-Id: Iba5254fd8833fb11da521e00cc1317ec81d3f89b
2017-03-24 10:21:47 -07:00
James Zern 4ffdf60b85 rate_hist: add parameter validation
tolerate a NULL hist being passed as a result of invalid parameters
passed to init_rate_histogram(). this fixes a divide by zero in
init_rate_histogram() with an invalid fps.

BUG=webm:1383

Change-Id: Id203e0f3b18d67a4a09aaf206abcce4708f966ec
2017-03-23 14:50:00 -07:00
Johann Koenig 10164407fb Merge "vp9 temporal filter: additional test" 2017-03-23 18:38:36 +00:00
Alex Converse d7b220b467 Merge changes Ie989e60c,Ifc110b12
* changes:
  Backport "Optimize the use case of token_cost table" to VP9
  Drop vp9_get_token_extracost
2017-03-23 18:05:13 +00:00
Marco Paniconi d856157388 Merge "vp9: Non-rd partition: avoid unneeded call to chroma_check" 2017-03-23 14:17:48 +00:00
Kaustubh Raste 8ee9b855a0 Merge "Fix mips msa fwd xform mismatch" 2017-03-23 07:44:16 +00:00
Marco 4863e07c01 vp9: Non-rd partition: avoid unneeded call to chroma_check
Since y_sad is not computed yet (on the early exit due to source_sad),
no need to check for setting color_sensitivity.

Only affects speed >=8. No change in behavior.

Change-Id: I3a6f2d20fed38d8b8ec51b75bcacf9a21f2db916
2017-03-22 22:40:28 -07:00
James Zern f16ea6a6eb Merge "vp9_rdopt: correct size to vpx_sum_squares_2d_i16" 2017-03-23 00:53:22 +00:00
Marco Paniconi ff0e0a76e8 Merge "vp9: Adjust some speed settings for speed 8." 2017-03-22 22:56:17 +00:00
Marco 4d50991320 vp9: Adjust some speed settings for speed 8.
Allow for simple_block_rd for VGA resoln, and reduce
adaptive_rd_thresh to 1.

On average no loss on RTC set, ~4% speedup on mac.

Change-Id: Ib549c4061c853776062b5e34040f839d470fbebc
2017-03-22 15:16:15 -07:00
Jerome Jiang dcd6c87b80 Merge "vp9: Enable adaptive_rd_threshold for row mt for realtime speed 8." 2017-03-22 22:02:24 +00:00
Johann 83dd9b36f4 vp9 temporal filter: additional test
Change tests to reflect use. Input sizes will be 8 or 16 (but not
necessarily square).

filter_weight is capped at 2 and filter_strength at 6

Speed test, disabled by default.

Change-Id: Idfde9d6c4b7d93aaf0e641b0f4862c15e2a2af7a
2017-03-22 19:37:04 +00:00
Johann Koenig c099d6be1c Merge "vp9 temporal filter: add const to function prototype" 2017-03-22 19:36:40 +00:00
James Zern e097bb1d39 Merge "idct_neon: prefix non-static functions w/'vpx_'" 2017-03-22 19:30:11 +00:00
James Zern 5661cd8ff4 vp9_rdopt: correct size to vpx_sum_squares_2d_i16
the current implementations expect pixel size, not the block type

BUG=webm:1392

Change-Id: Ib91e9f30a1f56e13566b1fb76f089dae9bb50cdc
2017-03-22 12:04:33 -07:00
James Zern f91c3bb3ab idct_neon: prefix non-static functions w/'vpx_'
Change-Id: I94fcdeae18468e6ef0cb7119b8142d982a048031
2017-03-22 11:49:23 -07:00
Johann 36d732c22b vp9 temporal filter: add const to function prototype
The input frames are not modified.

Change-Id: Ideb810e3c5afeb4dbdc4c7d54024c43a8129ad39
2017-03-22 18:14:21 +00:00
Kaustubh Raste e45c1f55b4 Fix mips msa fwd xform mismatch
Change-Id: I32a6df11463144aa1a562256ee7d57a41fd678d6
2017-03-22 14:01:03 +05:30
Jerome Jiang 20c2892693 vp9: Enable adaptive_rd_threshold for row mt for realtime speed 8.
Change it to row based array to avoid the slowdown caused by sync.
row-mt on, speed 8, 2 threads: ~4% speedup for VGA on ARM benefited
from adaptive_rd_threshold.

Change-Id: I887e65a53af20a6c4f48d293daaee09dab3512cf
2017-03-21 18:49:47 -07:00
Marco Paniconi 2fac50fa0e Merge "vp9: Modify datarate tests to cover denoising with multi-threading." 2017-03-21 23:44:05 +00:00
Jerome Jiang 74f4c5cd12 Merge "Fix the data race caused by vp9 denoiser." 2017-03-21 23:27:48 +00:00
Marco 4ddde47d8c vp9: Modify datarate tests to cover denoising with multi-threading.
Change-Id: I6ed48a630edf9923c25a05deaca50e0afec43918
2017-03-21 15:57:33 -07:00
Jerome Jiang dbed479d79 Fix the data race caused by vp9 denoiser.
BUG=webm:1391

Change-Id: I9669ae62fe9c695d4c6f9973094cb0f39bed51c7
2017-03-21 15:46:25 -07:00
Yi Luo cb9b277b2f Merge "Make butterfly_self() signature consistent with butterfly()" 2017-03-21 22:32:20 +00:00
Yunqing Wang 1935dfb294 Code refactoring in the partition search
Computed the partition search early termination score in a separate
function.

Change-Id: I1894b517ff179a38b1c05e054d373ac4b7f4cbb4
2017-03-21 10:00:44 -07:00
Yi Luo 266868a40b Make butterfly_self() signature consistent with butterfly()
- Refer to patch: 48fca113d inv_txfm_ssse3,butterfly: fix win32 abi
  compatibility.
- Change four butterfly() calls to butterfly_self(), to simplify the
  operations.

Change-Id: Ib2a8cfe6cddcaf0a59e6e6270d8380055ea42ef3
2017-03-21 09:36:35 -07:00
James Zern 6f13620761 .mailmap: add an additional entry for Yaowu Xu
Change-Id: I26b928848a7e72ff5ca25001ed6991c95f5992a5
2017-03-20 22:39:40 -07:00
James Zern e0b4c4d1ae Merge "Add vpx_highbd_idct32x32_1024_add_neon()" 2017-03-21 03:27:35 +00:00
James Zern 6d71d33d55 Merge "Add vpx_highbd_idct32x32_34_add_neon()" 2017-03-21 03:02:51 +00:00
Marco Paniconi 05c7259525 Merge "vp9: Nonrd variance partition: improve split to 16x16." 2017-03-21 00:17:35 +00:00
Yunqing Wang bf43b4c4b4 Merge "Record the sum of tx block eobs in the partition block" 2017-03-20 23:20:12 +00:00
Marco 3135b85423 vp9: Nonrd variance partition: improve split to 16x16.
Add additional condition to split to 16x16, for resolutions <= 360p,
reduces dragging artifact near moving boundary.

Small/no change on RTC metrics.

Change-Id: I314694f2166435d918f74e7ab42f002b07f40dae
2017-03-20 15:44:46 -07:00
Marco Paniconi 8dc1523057 Merge "vp9: Use sb content measure to bias against golden." 2017-03-20 21:35:12 +00:00
Marco 06c8713e89 vp9: Use sb content measure to bias against golden.
For each superblock, keep track of how far from current frame
was the last significant content change, and use that (along
with GF distance), to turnoff GF search in non-rd pickmode.

Only enabled for speed >= 8.

avgPSNR on RTC/RTC_derf down by ~0.9/1.2.
Speedup on mac: ~3-5%.
Speedup on arm: 3.6% for VGA and 4.4% for HD.

Change-Id: Ic3f3d6a2af650aca6ba0064d2b1db8d48c035ac7
2017-03-20 12:42:26 -07:00
Johann Koenig d642dd4311 Merge "temporal filter test: update types" 2017-03-20 19:05:55 +00:00
Yunqing Wang 9c2552a1c1 Record the sum of tx block eobs in the partition block
The sum of tx block eobs is needed in the machine learning based partition
early termination. The eobs are first accumulated during tx search, and
then the value associated with the best tx_size is copied to ctx for later
use.

After the sum of eobs is calculated correctly, re-enabled
ml_partition_search_early_termination speed feature.

Re-did the quality/speed test to check the impact of the fix.

1. Borg test BDRATE result:
4k set:     PSNR: +0.183%; SSIM: +0.100%;
hdres set:  PSNR: +0.168%; SSIM: +0.256%;
midres set: PSNR: +0.186%; SSIM: +0.326%;

2.Average speed gain result:
4k clips: 21%;
hd clips: 26%;
midres clips: 15%.

The result is in line with the original result.

Change-Id: I4209a95c89be03b4cbfb6a95b16885f89feddbda
2017-03-20 17:12:15 +00:00
Jingning Han ca9bedd538 Backport "Optimize the use case of token_cost table" to VP9
cherry picked from nextgenv2 90ea281f29df747282e56d3068a3ddbdde30cdd0

Change-Id: Ie989e60c6479ac3251cadaac9c7e795ccba52f4e
2017-03-17 16:54:22 -07:00
Alex Converse ab71181545 Drop vp9_get_token_extracost
vp9_get_token_cost does the same thing with one fewer lookup.

Change-Id: Ifc110b12403cb1a04a3f91357ab435c67b4815d6
2017-03-17 16:53:09 -07:00
James Zern 36533e8c5a Merge "inv_txfm_sse2: clear conversion warning in hbd build" 2017-03-17 21:48:20 +00:00
Johann 775569473d temporal filter test: update types
Use 'int' for w/h since it is that way everywhere else.

Pass Buffer pointers

Change-Id: I9eef6890af657baba171c6bcfcc85fc976173399
2017-03-17 13:22:28 -07:00
Johann Koenig 9675affae0 Merge "test: add vp9_temporal_filter_apply test" 2017-03-17 18:18:06 +00:00
Alex Converse 0842daa24e Merge "vp9_optimize_b: Combine extrabits cost with token lookup" 2017-03-17 16:18:21 +00:00
James Zern 5da2e500d7 inv_txfm_sse2: clear conversion warning in hbd build
tran_high -> tran_low in return from dct_const_round_shift()

Change-Id: I2fe06c4b604823b1d1fe40a487017c3c2819a440
2017-03-17 01:16:38 -07:00
Linfeng Zhang 27530d484e Add vpx_highbd_idct32x32_1024_add_neon()
BUG=webm:1301

Change-Id: Ib90af0c1712e56b301d0e981dbe9a641e15e36ca
2017-03-17 00:27:46 -07:00
Linfeng Zhang 50b13f75b8 Add vpx_highbd_idct32x32_34_add_neon()
BUG=webm:1301

Change-Id: I74dd16c6c64e7bb71aa991cedccddf0663ef5e06
2017-03-17 00:27:46 -07:00
James Zern 2882778310 Merge "Add vpx_highbd_idct32x32_135_add_neon()" 2017-03-17 07:26:52 +00:00
Linfeng Zhang 65e9fb65e8 Add vpx_highbd_idct32x32_135_add_neon()
BUG=webm:1301

Change-Id: I58c2d65d385080711c3666d6d8f9d241dac7b21a
2017-03-16 22:37:55 -07:00
James Zern 68efc64b72 Merge "Clean vpx_idct32x32_1024_add_neon()" 2017-03-17 05:24:58 +00:00
Marco 02975a604c vp9: Fix speed 8 condition for enabling copy_partition.
Change-Id: I2c090e6ba853a30fef1957b620853315f9471753
2017-03-16 17:08:37 -07:00
Alex Converse 3a6ec9ea72 vp9_optimize_b: Combine extrabits cost with token lookup
About 0.6% fewer cycles spent in vp9_optimize_b.

Change-Id: I2ae62a78374c594ed81d4e3100a5848e2f6f2c4e
2017-03-16 17:03:22 -07:00
Gabriel Marin 976ddb61d3 Add a vector form of routine vp9_model_rd_from_var_lapndz
Add routine vp9_model_rd_from_var_lapndz_vec and call it from model_rd_for_sb
to model the rate and distortion for MAX_MB_PLANE Laplacian sources in
parallel. The caller ensures that all sources have non-zero variance.

Measured an 18% to 25% reduction in retired instructions, and 17% to 24%
reduction in instruction execution cost with different compilers for the
Laplacian modeling.

No change in behavior.

TEST=Verified that encoded files match bit for bit, with and without this
change.
BUG=b/33678225

Change-Id: I6b76947f21c659a349adb896e13e99f6e3f951e6
2017-03-16 22:19:44 +00:00
Marco Paniconi 83ba1880bf Merge "vp9: Fixes in non-rd pickmode for denoising with SVC." 2017-03-16 21:53:38 +00:00
Johann Koenig eeeb71ed97 Merge "Remove ppc-linux-gcc target" 2017-03-16 21:53:17 +00:00
Johann Koenig cd3d7cf4ac Merge "Add Hadamard for Power8" 2017-03-16 21:52:15 +00:00
Marco bc7d4935bb vp9: Fixes in non-rd pickmode for denoising with SVC.
Don't denoise spatial layer frames whose base layer is a key frame.

Disallow golden reference for SVC with denoising on frames
that will be denoised (highest layer), as this removes bad artifact.
Will re-enable when issue is resolved.

Change-Id: I87a6597812330500966458172acfce54af65f70f
2017-03-16 12:59:41 -07:00
Marco ba8bfaafa7 vpx_codec.h: include vpx/*.h -> ./*.h
This matches the other includes and also fixes a compile issue in
chromium.

Change-Id: I45e00a1454f7ed948aa3b96b04cc5946b1d02985
2017-03-16 16:55:56 +00:00
Jerome Jiang bf40776aa4 Merge "Refactor: Change cpi->resize_state to enum values." 2017-03-16 16:43:42 +00:00
Marco Paniconi ec73bf53a5 Merge "vp8: Fix compiler warning in vp8 pickinter.c" 2017-03-16 05:13:38 +00:00
Rafael de Lucena Valle 405b94c661 Add Hadamard for Power8
Change-Id: I3b4b043c1402b4100653ace4869847e030861b18
Signed-off-by: Rafael de Lucena Valle <rafaeldelucena@gmail.com>
2017-03-15 23:46:18 -03:00
Marco Paniconi cd47c1942e Merge "vp9: Fix some issues with denoiser and SVC." 2017-03-16 02:42:55 +00:00
Marco a340c64a79 vp9: Fix some issues with denoiser and SVC.
Fix the update of the denoiser buffer when the base
spatial layer is a key frame. And allow for better/lower
QP on high spatial layers when their base layer is key frame.

Change-Id: I96b2426f1eaa43b8b8d4c31a68b0c6d68c3024a2
2017-03-15 17:19:17 -07:00
Jerome Jiang b5f7f7737a Refactor: Change cpi->resize_state to enum values.
Change-Id: Iab1409b0fc1175bc5a14afc4749a08c536c98c41
2017-03-15 17:16:17 -07:00
Marco 2c8430e223 vp9: Turn off ml_partition_search_early_termination.
Fails on nightly ubsan, valgrind tests.
Enabled on commit:6701014

Change-Id: Ied3f5cb38e39cba54ac134f4514107cdfdfce159
2017-03-15 15:00:38 -07:00
Marco deea4ede59 vp8: Fix compiler warning in vp8 pickinter.c
Change-Id: I0e5714538fe53d885a2201d808846901ae8fc288
2017-03-15 11:50:14 -07:00
Linfeng Zhang e54231d613 Clean vpx_idct32x32_1024_add_neon()
Change-Id: I05921e16d6a3e4e7e5b00a90624735050a186636
2017-03-15 11:24:31 -07:00
Yi Luo 8440cc4817 Merge "Improve idct32x32_1024_add SSSE3 intrinsics performance" 2017-03-15 02:32:52 +00:00
Linfeng Zhang d9a9a4ffea Merge "Fix overflow issue in 32x32 idct NEON intrinsics" 2017-03-15 00:38:17 +00:00
Jerome Jiang 27d5a57072 Merge "vp9: Using source sad for speedup for dynamic resizing." 2017-03-15 00:03:52 +00:00
Linfeng Zhang c756eb01c8 Fix overflow issue in 32x32 idct NEON intrinsics
Similar issue as Change bc1c18e.

The PartialIDctTest.ResultsMatch test on vpx_idct32x32_135_add_neon()
in high bit-depth mode exposes 16-bit overflow in final stage of pass
2, when changing the test number from 1,000 to 1,000,000.

Change to use saturating add/sub for vpx_idct32x32_34_add_neon(),
vpx_idct32x32_135_add_neon and vpx_idct32x32_1024_add_neon() in high
bit-depth mode.

Change-Id: Iaec0e9aeab41a3fdb4e170d7e9b3ad1fda922f6f
2017-03-14 16:59:14 -07:00
Jerome Jiang 2fa7092808 Merge "vp9: Enable row multithreading for SVC in real-time mode." 2017-03-14 23:29:46 +00:00
Jerome Jiang 02463273c9 vp9: Using source sad for speedup for dynamic resizing.
Only for speed >= 7.

Change-Id: I3ac85fbb4023cf7e6f8333806b345b0174382a09
2017-03-14 15:47:19 -07:00
Yi Luo fedcf83f33 Improve idct32x32_1024_add SSSE3 intrinsics performance
- Function level speed improves ~12%.

Change-Id: I9b7dbddabf08c7d0f6b25264e6074d5ccbe39290
2017-03-14 14:04:08 -07:00
James Zern 1b91f41935 Merge "vp9/encoder: fix segfault on win32 using vs < 2015" 2017-03-14 19:21:42 +00:00
Yunqing Wang c3e290963d Merge "Apply machine learning-based early termination in VP9 partition search" 2017-03-14 18:07:05 +00:00
Marco Paniconi 78a6946904 Merge "vp9: Speed >= 8: Enable simple_block_yrd speed feature." 2017-03-14 17:50:17 +00:00
Marco c0c789ab50 vp9: Adjust copy partition threshold, for speed 8.
Reduce it from 5 to 4, small/no change in metrics or speed.
Small reduction in dragging artifact near moving head.

Change-Id: Ic3bc5ca67c70bf0c89fc2ed14454840a28ae5b6a
2017-03-14 09:18:53 -07:00
Marco c216c8d6f2 vp9: Speed >= 8: Enable simple_block_yrd speed feature.
Enable speed feature for resolutions > VGA.
avgPSNR on RTC down by ~1.7%.
Speedup on ARM: ~5%.

Change-Id: I7a3fe5f7425aa8df3f4a2eced1afa355bc0d4c95
2017-03-14 09:10:28 -07:00
Johann a14a987c82 test: add vp9_temporal_filter_apply test
Add an independent implementation of the filter.

BUG=webm:1379

Change-Id: I309c459b493c3011273b78b127a786bb23c59f9c
2017-03-13 15:26:26 -07:00
Marco Paniconi 507204316a Merge "vp9: Fix to source_sad feature for SVC." 2017-03-13 19:18:31 +00:00
Linfeng Zhang b0bfcc368c Merge "Add vpx_highbd_idct32x32_135_add_c()" 2017-03-13 18:49:01 +00:00
Marco f0a22b23fe vp9: Fix to source_sad feature for SVC.
Allow speed feature sf->use_source_sad to be used
on highest spatial layer for SVC.

Change-Id: I260eb0478902764f49f83e43b17024fe86ff3b22
2017-03-13 11:00:40 -07:00
Yunqing Wang 670101439f Apply machine learning-based early termination in VP9 partition search
This patch was based on Yang Xian's intern project code. Further modifications
were done.
1. Moved machine-learning related parameters into the context structure.
2. Corrected the calculation of sum_eobs.
3. Removed unused parameters and calculations.
4. Made it work with multiple tiles.
5. Added a speed feature for the machine-learning based partition search
early termination.
6. Re-organized the code.

The patch was rebased to the top-of-tree.

Borg test BDRATE result:
4k set:     PSNR: +0.144%; SSIM: +0.043%;
hdres set:  PSNR: +0.149%; SSIM: +0.269%;
midres set: PSNR: +0.127%; SSIM: +0.257%;

Average speed gain result:
4k clips: 22%;
hd clips: 23%;
midres clips: 15%.

Change-Id: I0220e93a8277e6a7ea4b2c34b605966e3b1584ac
2017-03-13 09:54:18 -07:00
Marco Paniconi b39f7c3364 Merge "vp9: Fix condition for intra search in non-rd pickmode." 2017-03-13 06:11:13 +00:00
Marco 8c18df7fcd vp9: Fix condition for intra search in non-rd pickmode.
Fixes an issue when LAST and golden are not used as references,
in which case it's possible that no encoding mode is set (since intra may be
skipped under certain conditions). The fix is to make sure intra is searched
if no inter mode is checked.

Issue can happen for temporal layer pattern#7 in vpx_temporal_svc_encoder.c

Change-Id: I5ab4999b2f9dbd739044888e0916b5ec491d966b
2017-03-12 22:30:39 -07:00
James Zern 48fca113d1 inv_txfm_ssse3,butterfly: fix win32 abi compatibility
only the first 3 parameters can be aligned to 16 as required by __m128i,
make them all pointers for consistency.

since:
07c48ccfe Improve idct32x32_34_add SSSE3 intrinsics performance

BUG=webm:1384

Change-Id: I0324f701e723a27cb470036a180693ba8829d01d
2017-03-10 19:57:17 -08:00
James Zern c09b290cea vp9/encoder: fix segfault on win32 using vs < 2015
shift the bsse[] member of the macroblock struct to the front to avoid
an incorrect offset (0) to the upper half of bsse[0] which leads to a
negative resulting in a crash. restrict this to visual studio versions
before 2015 (the bug was observed with 2013, fixed in 2015) to avoid any
potential cache impact on other platforms.

https://connect.microsoft.com/VisualStudio/feedback/details/2396360/bad-structure-offset-in-32-bit-code

BUG=webm:1054

Change-Id: I40f68a1d421ccc503cc712192263bab4f7dde076
2017-03-10 17:37:17 -08:00
Marco Paniconi 0af189c00d Merge "vp9: Sample encoder vpx_temporal_svc_encoder: enable row-mt" 2017-03-10 18:26:06 +00:00
Marco 169c846575 vp9: Sample encoder vpx_temporal_svc_encoder: enable row-mt
Enable row-mt in the sample encoder vpx_temporal_svc_encoder.c,
under certain conditions.

Change-Id: Ic103ee81a9d80be5bf6e5778cc21fc3199db909d
2017-03-10 10:11:39 -08:00
Yi Luo 018290a344 Merge "Improve idct32x32_135_add SSSE3 intrinsics performance" 2017-03-10 17:14:30 +00:00
Marco ffb3c50da1 vp9: Enable row multithreading for SVC in real-time mode.
Enable row-mt for SVC for real-time mode, speed >=5.

Add the controls to the sample encoders, but keep it off for now.
Add the control and enable it for the 1 pass CBR unittests.

For speed 7, 3 layer SVC, 2 threads, row-mt enabled gives about ~5% speedup.

Change-Id: Ie8e77323c17263e3e7a7b9858aec12a3a93ec0c1
2017-03-10 01:01:07 +00:00
Yi Luo 327add990f Improve idct32x32_135_add SSSE3 intrinsics performance
- Split the inv txfm into three parts to avoid stack spillover.
- Function level speed improves ~12%.
- Use function and macro to remove some repeated code.

Change-Id: I14f5f072334fd766808cb52bf648df792e7379ee
2017-03-09 16:17:54 -08:00
Johann Koenig f951881e8c Merge "ppc: include ppc.h for ppc_simd_caps()" 2017-03-09 23:12:37 +00:00
James Zern cb60e66085 Merge "move vp9_scale_and_extend_frame_c to vp9_frame_scale.c" 2017-03-09 22:51:08 +00:00
Johann 94655569fe Remove ppc-linux-gcc target
Change-Id: Iec2430966f54e2e5ba79f6bb703f47adde46479f
2017-03-09 11:33:33 -08:00
Johann ccd23215ed ppc: include ppc.h for ppc_simd_caps()
Change-Id: Idc829eb066cf4e905d062cb9c08424e0f1b7e1a7
2017-03-09 09:26:45 -08:00
James Zern 2f31a16445 move vp9_scale_and_extend_frame_c to vp9_frame_scale.c
this is similar to the x86 configuration and helps mitigate an issue
with a circular dependency between this function and the ssse3 variant
causing an outsized increase in binary size (~300K for chrome)
chrome.dll:
.text 255B000 -> 252B000
.data 7B000 -> 75000
-221184 bytes

BUG=chromium:697956

Change-Id: Ic95b142ecd62dd4f1795788aa27dd8fab59b708c
2017-03-08 21:13:50 -08:00
Marco Paniconi 04aa9e28d5 Merge "vp9: Enable two speed features for SVC real-time mode." 2017-03-09 03:58:14 +00:00
Marco ea3c817ac2 vp9: Enable two speed features for SVC real-time mode.
Enable short_circuit_low_temp_var and limit_newmv_early_exit
for SVC, 1 pass CBR mode.

Change-Id: I77df2b2c6cc40657bb8ea76e19dfc2fdaad6389e
2017-03-08 16:13:59 -08:00
Marco 97b6a6f037 vp9: Add control to vpx_temporal_svc_encoder for row-mt.
Keep it off as default for now.

Change-Id: Ia2518a8ce96c9735c3fe67215dde25a35e8620af
2017-03-08 16:03:44 -08:00
Jerome Jiang 834f26c3b9 Merge "Shift speed 2 from non-large VP9 tests to large ones." 2017-03-08 23:14:27 +00:00
Johann Koenig 42a1b310e1 Merge "Add support for POWER8/VSX" 2017-03-08 22:38:21 +00:00
Yunqing Wang 6a86492adf Merge "Make the partition search early termination feature to be frame size dependent" 2017-03-08 22:31:30 +00:00
Yunqing Wang 099e9bf1ff Make the partition search early termination feature to be frame size dependent
The 2 thresholds(i.e. partition_search_breakout_dist_thr and
partition_search_breakout_rate_thr) are used as the partition search
early termination speed feature. This refactoring patch made this
feature to be frame size dependent consistently throughout the code.

Change-Id: Idaa0bd8400badaa0f8e2091e3f41ed2544e71be9
2017-03-08 12:56:41 -08:00
Linfeng Zhang 77311e0dff Update vpx_idct32x32_1024_add_neon()
Most are cosmetic changes.
Speed has no change with clang 3.8, and about 5% faster with gcc 4.8.4

Tried the strategy used in 8x8 and 16x16 (whose order of operations is
similar to the C code); though speed gets better with gcc, it's worse
with clang.

Tried to remove store_in_output(), but speed gets worse.

Change-Id: I93c8d284e90836f98962bb23d63a454cd40f776e
2017-03-08 12:39:04 -08:00
Rafael de Lucena Valle 51289302ab Add support for POWER8/VSX
Add ppc, ppc64 and ppc64le on all_platforms and ARCH_LIST

Add VSX flags and check for -mvsx

Define empty setup_rtcd_internal

Add Altivec detection based on:
http://freevec.org/function/altivec_runtime_detection_linux

Detect VSX at runtime when enabled

Change-Id: I304f4d8c5fee0ff19b6483cd2e9cc50d6ddec472
Signed-off-by: Rafael de Lucena Valle <rafaeldelucena@gmail.com>
2017-03-08 20:28:08 +00:00
Linfeng Zhang 48f5886605 Add vpx_highbd_idct32x32_135_add_c()
When eob is less than or equal to 135 for high-bitdepth 32x32 idct,
call this function.

BUG=webm:1301

Change-Id: I8a5864f5c076e449c984e602946547a7b09c9fe6
2017-03-08 10:46:33 -08:00
Marco Paniconi 2fa710aa6d Merge "vp9: Fix for denoising with SVC." 2017-03-08 18:26:12 +00:00
Marco 45de35fc58 vp9: Fix for denoising with SVC.
Fix the condition for getting last_source when denoising is on.
This avoids unneeded scaling in the case of SVC.

No change in quality.

Change-Id: I32c1c2c9085104da51af8535716bcc4d55fb0f42
2017-03-08 09:45:58 -08:00
Linfeng Zhang c4e5c54d69 cosmetics,dsp/arm/: vpx_idct32x32_{34,135}_add_neon()
No speed changes and disassembly is almost identical.

Change-Id: Id07996237d2607ca6004da5906b7d288b8307e1f
2017-03-08 08:58:32 -08:00
Linfeng Zhang 3cf5c213f1 cosmetics,dsp/arm/: rename a variable
Rename cospi_6_26_14_18N to cospi_6_26N_14_18N for consistency.

Change-Id: I00498b43bb612b368219a489b3adaa41729bf31a
2017-03-08 08:55:41 -08:00
Jerome Jiang c4c0331f65 Shift speed 2 from non-large VP9 tests to large ones.
This may fix the time out failure of valgrind tests in nightly
since more coverages were added on row-mt.

Change-Id: Id9414e66d1a266602c7495243d9f5cb69e17ccdc
2017-03-07 13:58:11 -08:00
James Bankoski 88a888f022 Merge "tiny_ssim.c : adds y4m support to tiny_ssim." 2017-03-07 18:49:14 +00:00
Jim Bankoski 393d9d0195 tiny_ssim.c : adds y4m support to tiny_ssim.
Change-Id: I7a13b7e3a1e11ddbe4be3009edf03528e1bc7647
2017-03-07 08:37:00 -08:00
James Zern 47cf7c25a2 Merge "vp8_create_decoder_instances: correct pbi[] memset" 2017-03-04 00:47:18 +00:00
Alex Converse 15dac923b9 Merge "Narrow cat6_high_cost tables to uint16_t" 2017-03-03 23:45:39 +00:00
James Zern 9c3c1f3725 vp8_create_decoder_instances: correct pbi[] memset
clear the entire array on error. the size used previously was equal to
the number of elements.

BUG=webm:1364

Change-Id: I2f2e16ed6e867f41d4774a5a8ac9cedaee11ce46
2017-03-03 15:23:32 -08:00
Alex Converse bcd12de6c3 Narrow cat6_high_cost tables to uint16_t
Saves 2688 bytes of rodata.

Change-Id: I46633b6e50c2845181c70fff6273a8e58fdd1e56
2017-03-03 23:09:12 +00:00
Vignesh Venkatasubramanian 9e7140b451 Merge "vp9,realtime: Enable row multithreading for non-rd" 2017-03-03 19:05:52 +00:00
Marco Paniconi 1bb63bf669 Merge "vp9: Speed 8: reduce the adaptive_rd_thresh level." 2017-03-02 22:25:03 +00:00
Marco b60617f5ff vp9: Speed 8: reduce the adaptive_rd_thresh level.
Reduce the level from 4 to 2.
This gives ~1-2% quality gain on RTC set, with small decrease in speed (~1-2% on mac).

Change-Id: I7d959731badcee3d45b2f4a08efe378765016a13
2017-03-02 13:34:10 -08:00
Vignesh Venkatasubramanian 453f18040f vp9,realtime: Enable row multithreading for non-rd
Enable row level multithreading for realtime encodes where non-rd
path is used (speed >= 5).

Change-Id: I5439cb49a02171166d8e1de06c7d5e6f8e819a41
2017-03-02 11:03:56 -08:00
Yi Luo 07c48ccfe0 Improve idct32x32_34_add SSSE3 intrinsics performance
- Split the transform into first half and second half.
- Reschedule the instructions to avoid stack spillover.
- Function level speed improves ~16%.

Change-Id: I166889840d23aa8a273eca00f6fbdae8b4566f35
2017-03-01 11:14:48 -08:00
Chrome Cunningham b71245683b Merge "VPX_CODEC_CAP_HIGHBITDEPTH for decoder interface" 2017-03-01 18:01:14 +00:00
Chris Cunningham bcd0c49af3 VPX_CODEC_CAP_HIGHBITDEPTH for decoder interface
Moves the def from vpx_encoder.h -> vpx_codec.h. The defined value
is changed as part of this move.

Adds the value to decoder capabilities when CONFIG_VP9_HIGHBITDEPTH.

Change-Id: I7d61fc821cda29f1e32bb9b2b9ffd3d83966e419
2017-02-28 17:10:34 -08:00
James Zern 8697d14ec8 Revert "Fix for max qindex calculation of a gf interval"
This reverts commit d3db846cc5.

This change causes a large drop in psnr (4-5db) on low framerate
difficult content (tested at 360/480p)

BUG=b/35804225

Change-Id: I8e90012d3b9c8a0cddb062ba93b01b36c0e0c0a0
2017-02-28 16:26:13 -08:00
James Zern 66919e370b vp9_ethread_test,cosmetics: s/new-mt/row-mt/
Change-Id: I8c145337adf49d30b88a17ff31501b8751ed1fa0
2017-02-28 15:13:11 -08:00
James Zern 3ab8a05b37 stress.sh: add vp9_stress_test_row_mt
vp9_stress_test now forces --row-mt=0 to cover both versions

Change-Id: I8d134879435bf1d8e76ab3fd89e698efba0e86b2
2017-02-28 15:09:30 -08:00
James Zern b58a8ccb02 stress.sh: parameterize thread count
Change-Id: Iae45266cea86585f0935af4012335198cf93719f
2017-02-28 15:09:30 -08:00
James Zern 4684d286de stress.sh: add one pass encodes
Change-Id: I38e6c988f17c56fbfacd95378b27ef8d77c75f90
2017-02-28 15:09:30 -08:00
Yunqing Wang 3833905ff2 Add a comment in encoder thread test
Added a comment.

Change-Id: I82f71c72598ad6f1eaa0b57b0b8ec56ab9658e81
2017-02-28 11:13:09 -08:00
Yunqing Wang 3fa7e5c62c Set row_mt to 0 by default
Set row_mt to 0 for now.

Change-Id: I922536a6d71a765e435daeaf4d932ef14363d19a
2017-02-28 11:00:56 -08:00
Marco defe094e9e vp9: Fix an issue with setting variance thresholds.
From commit:
https://chromium-review.googlesource.com/c/441393/

On non-segment the set_vbp_thresholds() should be called
again to adjust thresholds based on content_state of superblock.
This was the intended behavior from 441393.

Small change in RTC metrics and speed.

Change-Id: I45e5fbdc4af74db76b3cb4f13074fcae0eb2219e
2017-02-27 12:09:51 -08:00
Vignesh Venkatasubramanian ddfe906be2 vp9_ethread_test: Rename new_mt to row_mt
Rename leftover occurrences of new_mt.

Change-Id: Ib884e84c801fcd366ca4b57ec912ac5972023375
2017-02-27 10:50:02 -08:00
Vignesh Venkatasubramanian 5881601488 vp9: Rename new_mt to row_mt
new_mt is a very generic name that will get obsolete soon enough.
Since this is exposed as a codec control, renaming it to row_mt to
signify row level parallelism. Also renaming the ETHREAD_BIT_MATCH
codec control to ROW_MT_BIT_EXACT.

Change-Id: Ic7872d78bb3b12fb4cf92ba028ec8e08eb3a9558
2017-02-27 09:43:26 -08:00
Yunqing Wang 8121f85473 Remove an old leftover comment
Removed an old comment that wasn't true anymore.

Change-Id: I286ad8d7cb2843070a55e45a599d26bc226d6bd7
2017-02-24 18:31:21 -08:00
James Zern 47d6f16a04 get_prob(): rationalize int types
promote the unsigned int calculation to uint64_t rather than int64_t for
type consistency

Change-Id: Ic34dee1dc707d9faf6a3ae250bfe39b60bef3438
2017-02-24 15:36:52 -08:00
Yunqing Wang af9002dd16 Merge "Improve VP9 encoder threading test for better coverage" 2017-02-24 23:26:23 +00:00
Yunqing Wang cc168054a8 Improve VP9 encoder threading test for better coverage
Re-organized the encoder threading tests and grouped tests into
4 parts. Added PSNR checking test to make sure the PSNR variation
is within a small range.

BUG=webm:1376

Change-Id: I09edb990236a87a4d2b2b0e1ceaf6c6435a35eff
2017-02-24 09:48:29 -08:00
Jerome Jiang e96ab22462 Merge "Make vp9_scale_and_extend_frame_ssse3 work for hbd when bitdepth = 8." 2017-02-24 16:56:33 +00:00
Johann 904b957ae9 consolidate block_error functions
vp9_highbd_block_error_8bit_c was a very simple wrapper around
vp9_block_error_c. The SSE2 implementation was practically identical to
the non-HBD one. It was missing some minor improvements which only
went into the original version.

In quick speed tests, the AVX implementation showed minimal
improvement over SSE2 when it does not detect overflow. However, when
overflow is detected the function is run a second time. The
OperationCheck test seems to trigger this case and reverses any
speed benefits by running ~60% slower. AVX2 on the other hand is
always 30-40% faster.

Change-Id: I9fcb9afbcb560f234c7ae1b13ddb69eca3988ba1
2017-02-24 05:25:26 +00:00
Johann Koenig aa911e8b41 Merge "block error sse2: use tran_low_t" 2017-02-24 05:24:34 +00:00
Jerome Jiang 0998a146d4 Make vp9_scale_and_extend_frame_ssse3 work for hbd when bitdepth = 8.
Only works for bitdepth = 8 when compiled with high bitdepth flag.
4x speed ups for handling 1:2 down/upsampling.

Validated manually for:
1) Dynamic resize for a single layer encoding
2) SVC encoding with 3 spatial layers
Results are bitexact with the patch and the speed gain (~4x) in the
scaling was verified.

BUG=webm:1371

Change-Id: I1bdb5f4d4bd0df67763fc271b6aa355e60f34712
2017-02-23 20:40:28 -08:00
Johann 3c16bbb73b block error sse2: use tran_low_t
Change-Id: Ib04990e4a7bda9fbf501f294da2057a2b2595deb
2017-02-24 01:33:35 +00:00
Johann Koenig 57e987576f Merge "vp8_fdct4x4 test: fix segfault again" 2017-02-23 07:41:21 +00:00
Marco Paniconi 1d12a125e7 Merge "vp9: 1pass CBR: modify condition for reducing loop filter." 2017-02-23 03:24:26 +00:00
Jerome Jiang a6b6258284 Merge "vp9: Non-rd pickmode: use simple block_yrd under some conditions." 2017-02-22 23:19:29 +00:00
Marco 84f106f198 vp9: 1pass CBR: modify condition for reducing loop filter.
The reduction showed improvement on RTC when aq-mode=3 is on.
Add that (cyclic refresh enabled) to the condition.

Only affects 1 pass CBR.

Change-Id: I5d0843002d8e31d7c165098a62e7a71146b08664
2017-02-22 15:09:45 -08:00
Marco 7e7d820d5b vp9: Non-rd pickmode: use simple block_yrd under some conditions.
For speed 8 only.
3% speed up for QVGA and 6.3% for VGA on Nexus 6.
~3% avgPSNR decrease on rtc_derf and 2.9% on rtc.

Disabled for now.

Change-Id: I70133f1f6c804d663d594df437bfe7fdb0030d6a
2017-02-22 13:22:53 -08:00
Marco Paniconi 0acc270830 Merge "vp9: aq-mode=3: On key frame reset cr->reduce_refresh to 0." 2017-02-22 19:52:24 +00:00
Marco 7e79831016 vp9: aq-mode=3: On key frame reset cr->reduce_refresh to 0.
This prevents possible reduction of cyclic refresh after key frame.

Change-Id: Idd4e49b69cd95476e7eccfa31b2bd8669569e9e8
2017-02-22 10:50:08 -08:00
Johann 672100a84e vp8_fdct4x4 test: fix segfault again
The output needs to be aligned. Input is read with 'movq' not 'movdqa'
so it is not expected to be aligned.

Change-Id: Ibd48a84c1785917a6a97c3689a05322abba486b4
2017-02-22 18:29:11 +00:00
Jerome Jiang 3d1fa00fce vp9: Only compute y_sad for golden in variance partition for speed < 8.
Only affects speed 8. No obvious quality regression. Systematic speed
ups by ~1% on Nexus 6.

Change-Id: Ia904ca28ea041c3281c532911ec38fb7d7f46a17
2017-02-22 10:19:09 -08:00
Yunqing Wang 66f36f4735 Merge "Refactored the row based multi-threading code" 2017-02-22 16:55:04 +00:00
Jerome Jiang b1dcaf7f1e Merge "Fix segmentation fault caused by denoiser working with spatial SVC." 2017-02-22 04:44:55 +00:00
Marco 7f2daa74a0 vp9: Incorporate source sum_diff into non-rd partition thresholds.
Increase the variance partition thresholds for superblocks that
have low sum-diff (from source analysis prior to encoding frame).
Use it for now only for speed >= 7 or for denoising on.

Small change on metrics for rtc set: less than ~0.1 avgPSNR decrease
on RTC set, for both speed 7 and 8.

Change-Id: I38325046ebd5f371f51d6e91233d68ff73561af1
2017-02-21 17:22:11 -08:00
Yi Luo 6036a0d24f Following SSSE3 intrinsics functions also work for HBD
- vpx_idct8x8_12_add_ssse3
  vpx_idct8x8_64_add_ssse3
  vpx_idct32x32_34_add_ssse3
  vpx_idct32x32_135_add_ssse3
  vpx_idct32x32_1024_add_ssse3
- turn on unit tests.

Change-Id: I788b2b3b2074a6f3ab6a0e6f469c1327a123eff7
2017-02-21 12:37:53 -08:00
Johann Koenig 1e224dcb83 Merge "Drop zbin_ptr and quant_shift_ptr" 2017-02-21 18:16:38 +00:00
Jerome Jiang 0d1e5a21c4 Fix segmentation fault caused by denoiser working with spatial SVC.
Re-enable the affected test.
BUG=webm:1374

Change-Id: I98cd49403927123546d1d0056660b98c9cb8babb
2017-02-21 09:38:28 -08:00
Yi Luo 62a332160f Merge "Fix idct8x8 SSSE3 SingleExtremeCoeff unit tests" 2017-02-21 16:36:06 +00:00
Paul Wilkins 4d4231352c Merge "Change to prediction decay calculation." 2017-02-21 09:42:38 +00:00
Marco Paniconi f091752a9c Merge "vp9: Fix for non-rd pickmode for high-bitdepth build." 2017-02-21 05:37:23 +00:00
Marco 4e1ba35458 vp9: Fix for non-rd pickmode for high-bitdepth build.
Use the simple block_yrd under certain conditions.
The optimization code is completed but the speed is still slower
(~6% on 720p) than the low-bitdepth build.

For now, use the more complex block_yrd under certain conditions
(always use it for speed <= 5, otherwise use it on key frames and for
bsize >= 32x32).

This gives about ~2-3% gain in quality for speed 7 on RTC set
(over high bitdepth build), with about the same encoder fps as the
low bitdepth build.

Change-Id: Ibe92a1945d0bd635f880befb4c815727df62d754
2017-02-20 20:25:36 -08:00
Ranjit Kumar Tulabandu 97d6a4cbd1 Refactored the row based multi-threading code
Modified the code to facilitate bit-match tests in first pass
Added unit-tests to test the row based multi-threading behavior for bit-exactness

Change-Id: Ieaf6a8f935bb1075597e0a3b52d9989c8546d7df
2017-02-20 16:13:45 +05:30
James Zern bf6fcebfed vp8_fdct4x4_test: align input and output buffers
fixes segfault in 32-bit builds

Change-Id: I5b3cc5a335cb236a6ec4cb11fa8feb54ae0182c7
2017-02-18 13:30:28 -08:00
James Zern 52b3e1a633 datarate_test: disable OnePassCbrSvc2SpatialLayersDenoiserOn
segfaults

BUG=webm:1374

Change-Id: I3790c6cb8a539d13dee6a8225ef09b1575dea26c
2017-02-17 16:23:22 -08:00
Johann Koenig 9cb470eba7 Merge "vp8_short_fdct4x4: verify optimized functions" 2017-02-17 22:11:08 +00:00
Yi Luo 1f8e8e5bf1 Fix idct8x8 SSSE3 SingleExtremeCoeff unit tests
- In SSSE3 optimization, 16-bit addition and subtraction would
  overflow when input coefficient is 16-bit signed extreme values.
- Function-level speed becomes slower (unit ms):
  idct8x8_64: 284 -> 294
  idct8x8_12: 145 -> 158.

BUG=webm:1332

Change-Id: I1e4bf9d30a6d4112b8cac5823729565bf145e40b
2017-02-17 14:05:05 -08:00
James Zern 3e7025022e Merge "Add vpx_highbd_idct16x16_10_add_neon()" 2017-02-17 20:29:37 +00:00
paulwilkins a63adac604 Change to prediction decay calculation.
This change subtracts out low complexity intra regions that are also low
error in the inter domain, in the calculation of the frame prediction decay.
The rationale here is that low complexity regions (such as sky) do not imply
high prediction decay in the same way as high error intra or neutral blocks.

The effect of this is small in most clips but in a few clips it can be > 10%.
(E.g. In to tree)

Change-Id: If67ac23d17fca14285cad2defa464c61c9ea861c
2017-02-17 09:29:24 +00:00
Johann bf05cd3c99 vp8_short_fdct4x4: verify optimized functions
Change-Id: I7c7f5dfabde65c09f111fb0ced0e3ad231ee716e
2017-02-16 19:34:50 -08:00
Johann c7342f35c8 tiny_ssim: clean up on failure
Clears up clang static analysis warnings about memory leaks.

Change-Id: I60d4d0f3794735a8b81d9da4a30d19e7a9cba9cf
2017-02-17 03:28:34 +00:00
Yi Luo f62dcc9c33 Replace idct32x32_1024_add_ssse3 assembly with intrinsics
- Encoding/decoding test, BQTerrace_1920x1080_60.y4m, on
  i7-6700, no obvious user-level speed performance downgrade.
- Passed unit tests.

Change-Id: I20688e0dd3731021ec8fb4404734336f1a426bfc
2017-02-16 16:10:40 -08:00
James Zern b5bc9ee02d Merge "cosmetics: Fix spelling mistake in compile flag name." 2017-02-17 00:04:42 +00:00
Johann Koenig a9b81da575 Merge "block error avx2: use tran_low_t" 2017-02-16 23:51:14 +00:00
Linfeng Zhang 0620081731 Add vpx_highbd_idct16x16_10_add_neon()
BUG=webm:1301

Change-Id: If686c8144764c4162458f0bc4bb1bbf6555c48ab
2017-02-16 15:13:50 -08:00
James Zern 0f014c97e5 Merge "Fix mips vpx_post_proc_down_and_across_mb_row_msa function" 2017-02-16 23:02:10 +00:00
James Zern e9d07c0c2a Merge "disable VP9MultiThreadedFrameParallel tests" 2017-02-16 22:56:02 +00:00
paulwilkins d218b0914e cosmetics: Fix spelling mistake in compile flag name.
agressive -> aggressive

after:
ce7b38459 Aggressive VBR method.

Change-Id: Ie0f30b1bbc77ed9f32bec047b4a9b3d0cf4853f5
2017-02-16 14:51:31 -08:00
Johann Koenig 06a82af0de Merge "correct bitdepth_conversion_sse2.h header guard" 2017-02-16 21:41:28 +00:00
Johann ca4e27f5da Drop zbin_ptr and quant_shift_ptr
vp9[_highbd]_quantize_fp[_32x32] and vp9_fdct8x8_quant do not make use
of these parameters.

scan is used for C code and iscan is used for SIMD implementations.

Change-Id: I908a0ff7d3febac33da97e0596e040ec7bc18ca5
2017-02-16 13:20:32 -08:00
James Zern 6ab0870d45 disable VP9MultiThreadedFrameParallel tests
these are flaky and cause TSan warnings with clang-3.9.1

BUG=webm:1372

Change-Id: I8a7047552ba2ccd2d8c45f8795818c74562e5990
2017-02-16 12:56:04 -08:00
Johann 6c2d732bf4 correct bitdepth_conversion_sse2.h header guard
Change-Id: Ic4ffd861608e67fe59bcb3a86010ce3ef11a5519
2017-02-16 12:43:33 -08:00
Yi Luo 1cb44945fb Merge "Add idct32x32_135_add SSSE3 intrinsics" 2017-02-16 20:43:29 +00:00
Johann 2104454607 block error avx2: use tran_low_t
Change-Id: Ic5f3a1f569d6f82afeaf4fcd7235374bb460db3c
2017-02-16 12:39:02 -08:00
Johann Koenig cc43012674 Merge changes I267050a5,Iebade0ef,Id96a8df3
* changes:
  quantize_fp_32x32 highbd ssse3: enable existing function
  quantize_fp highbd ssse3: use tran_low_t for coeff
  quantize_fp highbd sse2: use tran_low_t for coeff
2017-02-16 20:34:48 +00:00
Yi Luo 72a43e2378 Add idct32x32_135_add SSSE3 intrinsics
- Replace the corresponding assembly code.
- No user level speed performance degradation.
- Unit tests passed.

Change-Id: Idd0c5a4bad4976f1617c34100cb46e75e3b961e5
2017-02-16 11:29:34 -08:00
Yunqing Wang 0bf6b51572 Merge "Structured the mode ordering code to avoid redundant memcpy" 2017-02-16 16:22:54 +00:00
Johann ff37a911ce quantize_fp_32x32 highbd ssse3: enable existing function
This was created as part of the quantize_fp_ssse3 change. Both
functions use the same source file with different macro parameters.

Change-Id: I267050a559426a85955d215aa0aaca270439c5ab
2017-02-16 07:40:56 -08:00
Johann 4682130b60 quantize_fp highbd ssse3: use tran_low_t for coeff
Change-Id: Iebade0efc0efbb0a80a0f3adbef4962e3a2f25e8
2017-02-16 07:40:56 -08:00
Johann ac3996a6d1 quantize_fp highbd sse2: use tran_low_t for coeff
Change-Id: Id96a8df33354a7987ce890a3d6798c7375ffa4aa
2017-02-16 07:40:55 -08:00
Johann 44600442dc bitdepth conversion: really use num elements
The previous implementation confused bit/bytes/elements. It was using
'32' as the multiplier but that was mistakenly adopted because a 32x32
transform embedded the stride.

Change-Id: Ieeb867a332416b9a40580b5e7c9b20088e9e691a
2017-02-16 15:02:48 +00:00
Ranjit Kumar Tulabandu 5127e58dab Structured the mode ordering code to avoid redundant memcpy
Change-Id: I4f5d6b54018bd1928cd9e5e42619e6f55b334803
2017-02-16 14:12:33 +00:00
Paul Wilkins 60a10116d1 Merge "Disconnect ARF breakout from frame boost." 2017-02-16 10:02:09 +00:00
Paul Wilkins 543ebc900f Merge "Remove unnecessary factor." 2017-02-16 10:01:58 +00:00
Paul Wilkins 9216ba58d8 Merge "Bug in scale_sse_threshold()" 2017-02-16 10:01:46 +00:00
Paul Wilkins e6c1993f1b Merge "Additional first pass stats." 2017-02-16 09:39:29 +00:00
Kaustubh Raste fddf66b741 Fix mips vpx_post_proc_down_and_across_mb_row_msa function
Added fix to handle non-multiple of 16 cols case for size 16

Change-Id: If3a6d772d112077c5e0a9be9e612e1148f04338c
2017-02-16 13:17:00 +05:30
Johann Koenig b63e88e506 Merge "Use 'packssdw' for loading tran_low_t values" 2017-02-16 02:41:00 +00:00
Johann Koenig 61d05c1e67 Merge "vp8_dx_iface: remove unused 'else' condition" 2017-02-16 01:00:45 +00:00
James Zern cc04ae1565 Merge "vpx_temporal_svc_encoder.sh: remove FUNCNAME bashism" 2017-02-16 00:21:19 +00:00
Marco Paniconi e6cf741ae6 Merge "vp9: Some code cleanup for aq-mode = 3." 2017-02-15 23:03:27 +00:00
Marco 158b300952 vp9: Some code cleanup for aq-mode = 3.
The weight segment needs to only be computed once per frame,
so remove it from the function vp9_cyclic_refresh_rc_bits_per_mb(),
which is called within a loop inside vp9_rc_regulate_q.

Change-Id: Ia0e18b89abb97e42c466d4dbc47700d7f76555db
2017-02-15 14:07:04 -08:00
Jerome Jiang 2865de86ec vpx_temporal_svc_encoder: Expose error resilient control to cmd line.
Change-Id: Ic74a8690b136ffbc370080f70b2d5a6b1572bf63
2017-02-15 21:45:52 +00:00
Linfeng Zhang d12f25f216 Merge "cosmetics,dsp/inv_txfm.c: reorder functions" 2017-02-15 20:18:23 +00:00
Marco Paniconi 725606a678 Merge "vp9. Use same source_sad threshold for all speeds." 2017-02-15 20:07:19 +00:00
Linfeng Zhang 106c342659 cosmetics,dsp/inv_txfm.c: reorder functions
Change-Id: Ie0f7689ebe230c68eadb22a32b14838c1a7543a6
2017-02-15 11:40:35 -08:00
Linfeng Zhang d5edf56bb5 Merge "Add vpx_highbd_idct16x16_38_add_neon()" 2017-02-15 19:34:18 +00:00
Marco f82280820a vp9. Use same source_sad threshold for all speeds.
Only affects real-time mode.

Change-Id: Iba836f110c4da936f5173cc0f54424d5b6121bff
2017-02-15 11:28:26 -08:00
Marco 716c1d5ff5 Vp9: Speed 8 aq-mode=3: Reduce computation in estimating bits per mb.
vp9_compute_qdelta_by_rate has almost 2% overhead in profiling on Nexus 6.
Reduce the calling of that function in speed 8 by estimating the delta-q.
Both rtc and rtc_derf show little/no change in avg psnr/ssim.
Encoding speed is 2~3% faster on Nexus 6.

Change-Id: If25933715783f31104a18a5092ea347b1221b5f5
2017-02-15 09:28:16 -08:00
Linfeng Zhang 81914ce68a Add vpx_highbd_idct16x16_38_add_neon()
BUG=webm:1301

Change-Id: Ic6cd8c1e63e1b7a997cbed221e20fff4c599e0fe
2017-02-15 09:12:02 -08:00
Linfeng Zhang ccada0636b Merge "Add vpx_highbd_idct16x16_38_add_c()" 2017-02-15 17:06:17 +00:00
paulwilkins cfc79a357a Disconnect ARF breakout from frame boost.
This small change replaces the frame boost check in the arf group
length break out clause with a test against a prediction decay value.

The boost value is in fact partly dependent on the decay value but
this change means that the per frame boost calculation can be adjusted
without influencing the group length calculation.

The value chosen gives a close match on all the test sets with the previous
code (on average) but it was noted that a lower threshold was slightly better
for 1080P and up and a slightly higher value for small image sizes.

Change-Id: I4d5b9f67d5b17b0d99ea3f796d3d6202fd61ee0c
2017-02-15 10:46:14 +00:00
paulwilkins b89ba05ab4 Remove unnecessary factor.
Removed unnecessary scaling factor to simplify.

Change-Id: I3fc9c5975a2597e72f1324e09dd586dea1facfa7
2017-02-15 10:45:43 +00:00
paulwilkins 76550dfdc0 Bug in scale_sse_threshold()
The function scale_sse_threshold() returns a threshold scaled
if necessary for use with 10 and 12 bit from an 8 bit baseline.

SSE error values would be expected to rise for the 10 and 12
bit cases where there are more bits of precision.

Hence the threshold used for the test should also be scaled up.

Change-Id: I4009c98b6eecd1bf64c3c38aaa56598e0136b03d
2017-02-15 10:45:03 +00:00
paulwilkins 945ccfee59 Additional first pass stats.
Added counts that split the intra coded blocks into low and high variance.

Change-Id: Ic540144b34d5141659081bb22f7ee16fd6861f14
2017-02-15 10:44:37 +00:00
Paul Wilkins 7635ee0f37 Merge "Aggressive VBR method." 2017-02-15 10:37:02 +00:00
James Zern 1cd926d665 vpx_temporal_svc_encoder.sh: remove FUNCNAME bashism
replace with an explicit output file prefix that matches the function
name

Change-Id: I7f6a4105adb34327b1099a5fbf132aa8d1ad5b90
2017-02-14 23:44:00 -08:00
Johann Koenig 61927ba4ac Merge "vp9 fdct higbd neon: connect existing highbd calls" 2017-02-15 01:33:00 +00:00
Linfeng Zhang e07e74fb0f Add vpx_highbd_idct16x16_38_add_c()
When eob is less than or equal to 38 for high-bitdepth 16x16 idct,
call this function.

BUG=webm:1301

Change-Id: I09167f89d29c401f9c36710b0fd2d02644052060
2017-02-14 17:25:52 -08:00
Yunqing Wang f2c1aea118 Merge "Row based multi-threading of encoding stage" 2017-02-15 00:54:10 +00:00
Ranjit Kumar Tulabandu 71061e9332 Row based multi-threading of encoding stage
(Yunqing Wang)
This patch implements the row-based multi-threading within tiles in
the encoding pass, and substantially speeds up the multi-threaded
encoder in VP9.

Speed tests at speed 1 on the STDHD set (using 4 tiles) show that the
average speedup of the encoding pass (the second pass in 2-pass
encoding) is 7% with 2 threads, 16% with 4 threads,
85% with 8 threads, and 116% with 16 threads.

Change-Id: I12e41dbc171951958af9e6d098efd6e2c82827de
2017-02-15 00:49:34 +00:00
Linfeng Zhang 615566aa81 Merge "Replace 14 with DCT_CONST_BITS in idct NEON functions' shifts" 2017-02-15 00:46:29 +00:00
Johann 86fed469ec vp8_dx_iface: remove unused 'else' condition
Clears up static clang analysis warning regarding a dead store.

Change-Id: If4fe7a9a7f94c6e2001d46136944f90712e543b4
2017-02-15 00:05:41 +00:00
Johann 327a02d77e Use 'packssdw' for loading tran_low_t values
This matches bitdepth_conversion_sse2.asm and produces substantially
better assembly. The old way had lots of 'movzwl' and 'shl' and storing
back to memory before loading into an xmm register.

Change-Id: Ib33e35354dfd691a4f8b1e39f4dbcbb14cd5302b
2017-02-14 22:39:49 +00:00
Johann 3e7aa8fda9 vp9 fdct higbd neon: connect existing highbd calls
Change-Id: Ia8f822bd6e70b3911bc433a5a750bfb6f9a3a75c
2017-02-14 22:11:49 +00:00
Johann Koenig 9c2bb7f342 Merge "quantize_fp highbd neon: use tran_low_t for coeff" 2017-02-14 21:28:23 +00:00
Linfeng Zhang 429e652809 Replace 14 with DCT_CONST_BITS in idct NEON functions' shifts
Change-Id: I2a39a3bb87516b04d273bc1c0f4a634e3fb6f0f6
2017-02-14 13:08:41 -08:00
clang-format 4b402746ca apply clang-format
Change-Id: I75e4a9e0b37bd4586f26c8d6c1fa27f3f6ff1bce
2017-02-14 12:45:52 -08:00
James Zern f670628ca5 .clang-format: update to 3.9.1
Change-Id: Ia51f2201df897651067d09122075953382b59139
2017-02-14 12:39:54 -08:00
Yi Luo c1a90dc160 Merge "Replace idct32x32_34_add_ssse3 assembly with intrinsics" 2017-02-14 20:13:27 +00:00
Yi Luo bd86de1ac8 Replace idct32x32_34_add_ssse3 assembly with intrinsics
- No user-level speed performance change.
- Pass unit tests.

Change-Id: Idfc598e00f354265e41f6b3219f4734216c115c6
2017-02-14 10:38:36 -08:00
Johann 2b24aa87d9 quantize_fp highbd neon: use tran_low_t for coeff
Change-Id: I90fd815f15884490ad138f35df575a00d31e8c95
2017-02-14 10:26:10 -08:00
Johann 25301a84a8 vp8 onyx_if: assert divide by zero
Clears up static clang analysis warning regarding divide by zero.

Trying to explain to the compiler how it's impossible to avoid
incrementing num_blocks at least once is difficult.

Change-Id: Ibaae43be572e5cd7a689b440dcd341c17d33443b
2017-02-14 04:27:31 +00:00
Johann Koenig eeb288d568 Merge "Remove UNINITIALIZED_IS_SAFE" 2017-02-14 03:02:51 +00:00
Linfeng Zhang de9ae32b93 Merge "Add vpx_highbd_idct16x16_256_add_neon()" 2017-02-14 01:15:34 +00:00
Johann 8a1fb40273 Remove UNINITIALIZED_IS_SAFE
Where clang static analysis or gcc -Wmaybe-uninitialized warns of
uninitialized values, assign 0 to ints, MB_MODE_COUNT to
MB_PREDICTION_MODE, and B_MODE_COUNT to B_PREDICTION_MODE.

Assert that the modes have been changed from the invalid value by
the end of the function.

Change-Id: Ib11e1ffb08f0a6fe4b6c6729dc93b83b1c4b6350
2017-02-14 00:56:08 +00:00
Linfeng Zhang 5ad4159ebb Add vpx_highbd_idct16x16_256_add_neon()
BUG=webm:1301

Change-Id: I6bb755552a39bdd26eef3f449601f6a9766c65ec
2017-02-13 15:50:33 -08:00
Johann Koenig 4526ec7907 Merge "fdct8x8 highbd neon: use tran_low_t for output" 2017-02-13 23:11:30 +00:00
Johann 5ecde212a8 fdct8x8 highbd neon: use tran_low_t for output
Change-Id: I100c4a1955d80bec4d28e82796b3e7f57e84d0ba
2017-02-13 22:16:14 +00:00
Yunqing Wang 318ca07657 The bitstream bit match test in multi-threaded encoder
When the new-mt mode is enabled (namely, allowing row-based
multi-threading in the encoder), several speed features that adaptively
adjust encoding parameters during encoding would cause a mismatch
between the single-thread encoded bitstream and the multi-thread encoded
bitstream. This patch provides a set_control API to disable these
features, so that a bit-exact bitstream is obtained in the unit
test.

Change-Id: Ie9868bafdfe196296d1dd29e0dca517f6a9a4d60
2017-02-13 13:02:26 -08:00
Yunqing Wang e7db593a46 Merge "Minor code style refactoring" 2017-02-13 21:01:41 +00:00
James Zern 45664383f1 Merge "cosmetics,vp9_ratectrl: apply clang-format" 2017-02-13 21:01:18 +00:00
James Zern 7a48bfab47 Merge "vpx_usec_timer_elapsed: use 64-bit math" 2017-02-13 21:00:33 +00:00
Yunqing Wang f024518387 Minor code style refactoring
Change-Id: I20107693d0a87e08a10520bfb573ff3dcef69fdb
2017-02-13 12:59:01 -08:00
James Zern 3c4ea94210 cosmetics,vp9_ratectrl: apply clang-format
broken since:
c3f095c8b Merge "Fix to avoid abrupt relaxation of max qindex in recode path"
5f21aba4b Fix to avoid abrupt relaxation of max qindex in recode path

the original change pre-dated the addition of .clang-format

Change-Id: If5e399d9a805bcad9147360b13b36fbc8c560a7c
2017-02-13 11:29:39 -08:00
Linfeng Zhang 016933ad48 Add vpx_highbd_idct{16x16,32x32}_1_add_neon()
and update vpx_highbd_idct8x8_1_add_neon()

BUG=webm:1301

Change-Id: I18d1a0cbe98ba822d5194c1b4e13a4c29c5c75f4
2017-02-13 10:25:22 -08:00
paulwilkins ce7b38459a Aggressive VBR method.
VBR method that allows a wider Q range for the first normal frame
in each ARF group and then centers the min - max range for the rest of
the arf group on the chosen Q value for that first frame.

This allows for quite rapid adjustment of the active Q range even if the
initial estimate is poor.

In some cases where the ARF frames themselves are tending to
undershoot but the normal frames are overshooting this can still give
net undershoot. This can be corrected by allowing a larger Q delta for
arf frames but is usually a sign that the allocation to the arfs was too
high.

Change-Id: Icec87758925d8f7aeb2dca29aac0ff9496237469
2017-02-13 15:42:11 +00:00
James Zern 91f87e7513 Merge "Add vpx_idct16x16_38_add_neon()" 2017-02-11 03:42:36 +00:00
Marco 22dcfa80aa vp9: Non-rd mode: use simple block_yrd for 8 bit high bitdepth builds
Temporary fix until optimization work for block_yrd is completed.
This essentially reverts back to the state before the change:
https://chromium-review.googlesource.com/c/433821/

Compression loss is about 5-6% on RTC set.
Speed-up (from using this simple/model-based block_yrd) over the low
bitdepth builds (which uses more complex block_yrd) is ~5% on 720p.

Change-Id: Ie0af9eb0d111e5595f587870c44f08317403b8d8
2017-02-10 10:15:35 -08:00
James Zern 943f9c0356 vpx_usec_timer_elapsed: use 64-bit math
this prevents a rollover when tv_sec is a long:
signed integer overflow: 2776 * 1000000 cannot be represented in type
'long'
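
A minimal sketch of the idea, assuming a timer that stores seconds and
microseconds as long (the function name elapsed_usec and its parameters are
hypothetical, not the libvpx API): widening to 64 bits before the multiply
keeps 2776 * 1000000 in range even where long is 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: elapsed microseconds between two (sec, usec) stamps.
 * The cast to int64_t happens before the multiplication, so the product is
 * computed in 64-bit arithmetic regardless of the width of long. */
static int64_t elapsed_usec(long begin_sec, long begin_usec, long end_sec,
                            long end_usec) {
  assert(end_sec >= begin_sec);
  return ((int64_t)end_sec - begin_sec) * 1000000 +
         ((int64_t)end_usec - begin_usec);
}
```

With 2776 elapsed seconds the result is 2776000000, which exceeds the
largest 32-bit signed value (2147483647) and so needs the wider type.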

Change-Id: I03dc4476ee122b02e2856dad28358a20cf16a9f8
2017-02-09 19:28:59 -08:00
Paul Wilkins c3f095c8b3 Merge "Fix to avoid abrupt relaxation of max qindex in recode path" 2017-02-09 17:17:55 +00:00
Paul Wilkins 82b88a7fd0 Merge "Fix for max qindex calculation of a gf interval" 2017-02-09 17:17:44 +00:00
Linfeng Zhang bc1c18e18c Add vpx_idct16x16_38_add_neon()
The RunQuantCheck() test on it exposes 16-bit overflow in stage 7 of
pass 2. Change to use saturating add/sub for both
vpx_idct16x16_38_add_neon() and vpx_idct16x16_256_add_neon() for high
bitdepth.
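
For illustration, scalar saturating add/sub on 16-bit values can be sketched
as below. The helper names are hypothetical; the NEON code would presumably
use saturating intrinsics such as vqaddq_s16/vqsubq_s16 on whole vectors,
which is an assumption about this change rather than something stated here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a saturating 16-bit add: compute in 32 bits,
 * then clamp to the int16_t range instead of wrapping around. */
static int16_t sat_add_s16(int16_t a, int16_t b) {
  const int32_t sum = (int32_t)a + (int32_t)b;
  if (sum > INT16_MAX) return INT16_MAX;
  if (sum < INT16_MIN) return INT16_MIN;
  return (int16_t)sum;
}

/* Same idea for subtraction. */
static int16_t sat_sub_s16(int16_t a, int16_t b) {
  const int32_t diff = (int32_t)a - (int32_t)b;
  if (diff > INT16_MAX) return INT16_MAX;
  if (diff < INT16_MIN) return INT16_MIN;
  return (int16_t)diff;
}
```

A wrapping add of 30000 + 10000 would come out negative in 16 bits; the
saturating version clamps to 32767.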

Change-Id: Ibf4c107a887553a52852cc582e28d38a5a5a2712
2017-02-08 12:15:22 -08:00
Yi Luo ac04d11abc Replace idct8x8_12_add_ssse3 assembly code with intrinsics
- Performance matches the assembly.
- Unit tests pass.

Change-Id: I6eacfbbd826b3946c724d78fbef7948af6406ccd
2017-02-08 10:07:45 -08:00
Linfeng Zhang 0fefc6873a Merge "Add vpx_idct16x16_38_add_c()" 2017-02-08 17:20:19 +00:00
Johann Koenig b73f99745b Merge "block_error_fp highbd sse2: use tran_low_t for coeff" 2017-02-07 23:26:10 +00:00
Marco Paniconi 71f5314993 Merge "vp9: Denoiser speed-up: increase partition and ac skip thresholds." 2017-02-07 22:25:00 +00:00
Yunqing Wang b106abe570 Merge "Row based multi-threading of ARNR filtering stage" 2017-02-07 19:55:41 +00:00
Marco Paniconi 259e835b1b Merge "vp9: Adjust rate_err threshold for setting active_worst factor." 2017-02-07 19:25:47 +00:00
Marco 1a5482d4d8 vp9: Denoiser speed-up: increase partition and ac skip thresholds.
Add factor to increase variance partition and ac skip thresholds,
under certain conditions (noise level and sum_diff), to increase
denoiser speed.

Change-Id: I7671140ef3598bf5f114a72623d68792bcd7b77b
2017-02-07 10:33:13 -08:00
Linfeng Zhang cf76ee2cb7 Add vpx_idct16x16_38_add_c()
When eob is less than or equal to 38 for 16x16 idct, call this function.
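
A rough sketch of eob-based selection, for illustration only. The enum and
function names are hypothetical; the 38 cutoff comes from this message, while
the 1 and 10 cutoffs are assumptions inferred from the idct16x16_1 and
idct16x16_10 function names.

```c
#include <assert.h>

/* Hypothetical labels for the 16x16 inverse DCT variants. */
typedef enum {
  IDCT16X16_1,   /* DC only */
  IDCT16X16_10,  /* assumed: eob <= 10 */
  IDCT16X16_38,  /* eob <= 38, the case added here */
  IDCT16X16_256  /* full transform */
} Idct16Variant;

/* Pick the cheapest variant that still covers all coefficients up to the
 * end-of-block position. */
static Idct16Variant pick_idct16x16_variant(int eob) {
  assert(eob >= 1 && eob <= 256);
  if (eob == 1) return IDCT16X16_1;
  if (eob <= 10) return IDCT16X16_10;
  if (eob <= 38) return IDCT16X16_38;
  return IDCT16X16_256;
}
```

The point of the extra tier is that blocks with 11 to 38 coefficients no
longer fall through to the full 256-coefficient transform.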

Change-Id: Ief6f3fb16a49ace3c92cebf4e220bf5bf52a6087
2017-02-07 09:40:51 -08:00
Marco 3c2f076ad0 vp9: Adjust rate_err threshold for setting active_worst factor.
Only affects 1 pass vbr.
Small improvement on ytlive set.

Change-Id: I09a7456fe658fbea82ece1035cf683bd8bd8bd14
2017-02-07 09:38:16 -08:00
Linfeng Zhang 66695533a8 Merge "Update 16x16 8-bit idct NEON intrinsics" 2017-02-07 16:52:40 +00:00
Johann 537949a9df block_error_fp highbd sse2: use tran_low_t for coeff
BUG=webm:1365

Change-Id: Id2ed3ebaaaa6a4b68628c23e08b64ea5f1341761
2017-02-07 15:03:28 +00:00
Ranjit Kumar Tulabandu 91f01a2060 Row based multi-threading of ARNR filtering stage
Change-Id: Ic238d32c7e10b730342224ab56712a89a6026a8f
2017-02-07 14:03:19 +05:30
Johann Koenig 85f3a82355 Merge "highbd x86: consolidate tran_low_t conversions" 2017-02-07 02:49:58 +00:00
Jerome Jiang aa327a1ed4 vp9: speed 8: Tune threshold of ac skip and partitioning.
Threshold for partitioning only affects VGA and lower res.
0.07% quality regression is observed in borg tests on rtc_derf
and 0.2% regression on rtc.
5.6% speed up for low res and 6.8% for VGA on Nexus 6.

Change-Id: If85a2919b48c991de66059c90f32ed06980452be
2017-02-06 16:27:53 -08:00
Johann 641fda79bb highbd x86: consolidate tran_low_t conversions
Create new helper files specifically for converting tran_low_t types.

Change-Id: I7c4c458ef910f3b3d10a3cfbf9df4de7682fd905
2017-02-06 10:43:26 -08:00
Yunqing Wang dbc5090b5e Merge "Changes to facilitate multi-threading of encoding stage" 2017-02-04 01:02:29 +00:00
Yunqing Wang 2a21b45fdc Fix visual studio build failure
Fixed the following issue.
..\test\vp9_ethread_test.cc(69): warning C4805: '|=' : unsafe mix of type 'bool' and type 'int' in operation [C:\src\buildbot\test-libvpx\tests\dveCPjwhBE\.build-x86_64-win64-vs10\test_libvpx.vcxproj]
..\test\vp9_ethread_test.cc(69): warning C4800: 'int' : forcing value to bool 'true' or 'false' (performance warning) [C:\src\buildbot\test-libvpx\tests\dveCPjwhBE\.build-x86_64-win64-vs10\test_libvpx.vcxproj]

Change-Id: I37f897cf12a0b7500d2fcbac9e4615f08a83fdb4
2017-02-03 08:36:55 -08:00
Jerome Jiang a16ca80b09 Merge "Add unit tests for vp9_block_error_fp." 2017-02-02 22:20:42 +00:00
Jingning Han bb40844e32 Merge "Add SSSE3 intrinsic 8x8 inverse 2D-DCT" 2017-02-02 22:18:32 +00:00
Jerome Jiang 0b60d3ffa5 Add unit tests for vp9_block_error_fp.
BUG=webm:1365

Change-Id: I004e5cd7ca331d14b31b7fc3edeee45fce064026
2017-02-02 12:41:51 -08:00
Johann Koenig 8d5d21aaec Merge "Update third_party/googletest to 1.8.0" 2017-02-02 20:15:46 +00:00
Johann d89b4f5ece Update third_party/googletest to 1.8.0
Change-Id: If61137e28291f2a0911e9260eb58f234e0d8594c
2017-02-02 07:27:11 -08:00
Ranjit Kumar Tulabandu 12ec948490 Changes to facilitate multi-threading of encoding stage
Modified the encoding stage to have row level entry points with relevant
initializations and to access the token information at row level

Change-Id: Ife10e55a7c1a420ee906d711caf75002688d9e39
2017-02-02 14:47:13 +05:30
Kaustubh Raste 5b10674b5c Merge "Add mips msa sum_squares_2d_i16 function" 2017-02-02 08:09:21 +00:00
Johann Koenig 726556dde9 Merge "Remove neon assembly for idct 16x16 and 8x8" 2017-02-02 03:25:31 +00:00
Johann Koenig ce6318f254 Merge changes I43521ad3,I013659f6
* changes:
  satd highbd neon: use tran_low_t for coeff
  satd highbd sse2: use tran_low_t for coeff
2017-02-02 03:03:58 +00:00
Linfeng Zhang e4985cf619 Update 16x16 8-bit idct NEON intrinsics
Remove redundant memory accesses.

Change-Id: I8049074bdba5f49eab7e735b2b377423a69cd4c8
2017-02-01 17:04:33 -08:00
Jingning Han 8f95389742 Add SSSE3 intrinsic 8x8 inverse 2D-DCT
The intrinsic version reduces the average cycles from 183 to 175.

Change-Id: I7c1bcdb0a830266e93d8347aed38120fb3be0e03
2017-02-01 14:47:53 -08:00
Yunqing Wang 770c6663d6 Merge "Changes to facilitate row based multi-threading of ARNR filtering" 2017-02-01 22:04:15 +00:00
Johann Koenig dc90501ba3 Merge changes I374dfc08,I7e15192e,Ica414007
* changes:
  hadamard highbd ssse3: use tran_low_t for coeff
  hadamard highbd neon: use tran_low_t for coeff
  hadamard highbd sse2: use tran_low_t for coeff
2017-02-01 21:56:36 +00:00
Ranjit Kumar Tulabandu 359a6796da Changes to facilitate row based multi-threading of ARNR filtering
Change-Id: I2fd72af00afbbeb903e4fe364611abcc148f2fbb
2017-02-01 13:03:52 -08:00
Johann Koenig 5cc0a364ae Merge "vp9_rdopt: declare 'c' closer to use" 2017-02-01 20:55:12 +00:00
Johann bfd62cdaff vp9_rdopt: declare 'c' closer to use
Clears up static clang analysis warning regarding a dead store. Only
declare 'c' when it will be used.

Change-Id: I1ac0fc7f94bc44da63938c63cd1efcd6b95e0eb3
2017-02-01 19:58:24 +00:00
Johann Koenig f60171bb4f Merge "deblock: annotate postproc parameters" 2017-02-01 19:57:29 +00:00
Johann f8d744d91a satd highbd neon: use tran_low_t for coeff
BUG=webm:1365

Change-Id: I43521ad32b6c96737a8ef2b8c327f901fd7eaf84
2017-02-01 11:55:47 -08:00
Johann 2ba383474d satd highbd sse2: use tran_low_t for coeff
BUG=webm:1365

Change-Id: I013659f6b9fbf9cc52ab840eae520fe0b5f883fb
2017-02-01 11:55:16 -08:00
Johann 0f751ecee3 hadamard highbd ssse3: use tran_low_t for coeff
BUG=webm:1365

Change-Id: I374dfc08732932382043905f128e928b08cb4f57
2017-02-01 11:51:15 -08:00
Johann 1eb8a718bf hadamard highbd neon: use tran_low_t for coeff
BUG=webm:1365

Change-Id: I7e15192ead3a3631755b386f102c979f06e26279
2017-02-01 11:50:46 -08:00
Johann 2dac808dd1 hadamard highbd sse2: use tran_low_t for coeff
BUG=webm:1365

Change-Id: Ica414007d8412ceebfffa9e58e8416226a3fe934
2017-02-01 11:46:57 -08:00
Johann Koenig 3bda634576 Merge "quantize ssse3: remove unused pxor" 2017-02-01 19:41:41 +00:00
Jingning Han a7949f2dd2 Make satd unit test support all bit-depth settings
Turn on satd unit test for c function in both regular and high
bit-depth settings.

Change-Id: I4b0c56addfb84964ede0da3ab760fe0ee640cfd0
2017-01-31 23:21:32 -08:00
Jingning Han 59917dd18e Unify the hadamard transform unit test for bit-depth settings
Unify the 8x8 and 16x16 Hadamard unit test system for both 8-bit
and high bit-depth settings.

Change-Id: I53373c1d43f3ced514ad1e53e03f0fb9b25d9ead
2017-01-31 23:21:32 -08:00
Jingning Han 969957f9f2 Fix real-time compression regression in hbd mode
This commit resolves the compression performance regression in
real-time encoding setting when high bit-depth mode is enabled.

The current solution temporarily disables the SIMD implementations
of vpx_satd, hadamard8x8, and hadamard16x16 in high bit-depth mode.

The commit makes the coding results bit-wise identical between
regular coding pipeline and high bit-depth at profile 0.

BUG=webm:1365

Change-Id: Icfb900821733749685370460a1a5a7e07f76f4bf
2017-01-31 23:17:09 -08:00
Johann 32f68cc58c deblock: annotate postproc parameters
Clears a clang static analyzer warning where 'cols' is assumed to be
less than 0, preventing the for loop from executing.

The assembly already requires that the size be 8 or 16 (U/V or Y plane)
and cols is a multiple of 8.

Change-Id: Ica4612690ead1638c94cfe56b306e87f8ce644f9
2017-01-31 15:58:57 -08:00
Johann Koenig 9efc42f4f8 Merge "Use Buffer class for post proc tests" 2017-01-31 15:28:28 +00:00
Kaustubh Raste 750e753134 Add mips msa sum_squares_2d_i16 function
average improvement ~4x-5x

Change-Id: I8d91b71d0677009be52b412e4f52b40b98573a53
2017-01-31 12:22:43 +00:00
Kaustubh Raste df7e1fecc1 Add mips msa vpx_minmax_8x8 function
average improvement ~4x-5x

Change-Id: I83aee9977534fddb8a9b80d31af646c0b6b1a8c3
2017-01-31 10:00:43 +05:30
Kaustubh Raste 280ad35553 Merge "Add mips msa vpx_vector_var function" 2017-01-31 02:34:51 +00:00
Johann dcfff3ccc8 quantize ssse3: remove unused pxor
Change-Id: Ifa22d77fd530827de0b32ae71810dc2213ab2937
2017-01-30 17:02:57 -08:00
Marco d47f257484 vp9: Modify bsize condition for using model_rd_large for speed 7.
In non-rd pickmode: Allow speed 7 to also use larger block size in
model_rd. Small change in behavior for speed 7.

Change-Id: I8c5523e424308e8f0bc71b3f6324dec42a464cc8
2017-01-30 11:16:51 -08:00
Yunqing Wang 106c620a23 Merge "Disable multi-threading in first pass for SVC encoding" 2017-01-28 19:29:01 +00:00
Kaustubh Raste 4ce20fb3f4 Add mips msa vpx_vector_var function
average improvement ~4x-5x

Change-Id: I2f63ef83d816052ca8dc42421e7e9d42f7a7af6b
2017-01-28 08:53:20 +00:00
Marco Paniconi 164db8278f Merge "vp9: Fix to pick_filter_level for highbitdepth build." 2017-01-27 22:47:45 +00:00
Jerome Jiang 9b1a2aa82b Merge "Add macOS Sierra support in configure" 2017-01-27 21:15:52 +00:00
Marco d94d0ed12f vp9: Fix to pick_filter_level for highbitdepth build.
Change-Id: I53b3fa8bfc0a0717eb1b730c29f2b70060b1b1b7
2017-01-27 10:44:07 -08:00
Jerome Jiang 6d38ad4146 Add macOS Sierra support in configure
BUG=webm:1367

Change-Id: I3000b6d9f93ec49ca86d08151348d33d86bf0034
2017-01-27 10:41:38 -08:00
Ranjit Kumar Tulabandu 6985a0f516 Disable multi-threading in first pass for SVC encoding
BUG=webm:1366

Change-Id: I204ef8496884ba7c4debe64f23f50d298b4090c3
2017-01-27 09:41:53 -08:00
Marco Paniconi ad1aad69fb Merge "vp9: Modify bsize condition for using model_rd_large." 2017-01-27 15:15:36 +00:00
Marco Paniconi d0495132aa Merge "vp9: Fixes for usage of skin_map for high bit depth." 2017-01-27 15:15:15 +00:00
Marco b16c77cdc4 vp9: Modify bsize condition for using model_rd_large.
In non-rd pickmode: small change in behavior for speed 6 and 7.
Remove condition on HIGHBITDEPTH flag.

Change-Id: I360a13fcc313d72612fe9b918162ef4bb278cdea
2017-01-26 22:45:27 -08:00
Kaustubh Raste 407fad2356 Add mips msa vpx Integer projection row/col functions
average improvement ~4x-5x

Change-Id: I17c41383250282b39f5ecae0197ef1df7de20801
2017-01-27 11:11:42 +05:30
Kaustubh Raste c1553f859f Merge "Add mips msa vpx satd function" 2017-01-27 04:08:51 +00:00
Marco db99840bf6 vp9: Fixes for usage of skin_map for high bit depth.
Also avoid noise_estimation and source_sad if use_highbitdepth is set.

Change-Id: I5fea396b8f8380ea377045d99ba22a52b92daa46
2017-01-26 19:57:59 -08:00
Johann f380a1658d Use Buffer class for post proc tests
Add Buffer features for:
Setting the buffer to the output of an ACMRandom function.
Copying a buffer.
Comparing two buffers.
Printing two buffers.

Change-Id: Ib53fb602451a3abdcee279ea2b65b51fbc02d3df
2017-01-26 09:50:49 -08:00
Jerome Jiang eacc3b4ccf Merge "vp9: Refactor copy partitioning to reduce duplication." 2017-01-26 17:46:11 +00:00
Jerome Jiang fe4791b0d5 vp9: Refactor copy partitioning to reduce duplication.
Change-Id: Ia1b3c118adec5eccbd2900c8e4b9ea6b1e3e9b7c
2017-01-25 17:33:04 -08:00
Yunqing Wang 4d50dc5ab5 Merge "Remove marco MVC in mcomp.c" 2017-01-26 00:32:55 +00:00
Hui Su 37cd112b0f Merge "Fix an overflow warning in optimize_b()" 2017-01-25 22:49:30 +00:00
Marco 3b2d08a93b vp9-denoiser: Modify skip denoising condition for small blocks.
Skip denoising for blocks < 16x16, and for block = 16x16
skip denoising for low noise levels and width > 480 for now.
Allow for some speed-up in denoiser.

Change-Id: Ib46cefe4741962d145fa08775defea3a9c928567
2017-01-25 11:48:09 -08:00
hui su 519b2e48a8 Fix an overflow warning in optimize_b()
BUG=webm:1361

Change-Id: Ib840bf3b39f7b3c8c017d3488a83434e9a0f45f5
2017-01-25 10:54:39 -08:00
Jerome Jiang 70a3652693 Merge "vp9: Adjust threshold for y sad used in copying partition." 2017-01-25 17:54:15 +00:00
Yunqing Wang a762cef917 Merge "Initialize errorperbit and sabperbit in ARNR filtering" 2017-01-25 16:43:02 +00:00
Yunqing Wang 633dbcb458 Merge "Multi-threading of first pass stats collection" 2017-01-25 16:40:32 +00:00
Jerome Jiang 3a7ad43fb8 vp9: Adjust threshold for y sad used in copying partition.
Visual quality improvement is observed for noisy clips. Little effect
on speed tests on Nexus 6.

Change-Id: Ib38e04002220708c34102de7b5c36e9940775d89
2017-01-24 17:20:05 -08:00
Ranjit Kumar Tulabandu 8b0c11c358 Multi-threading of first pass stats collection
(yunqingwang)
1. Rebased the patch. Incorporated recent first pass changes.
2. Turned on the first pass unit test.

Change-Id: Ia2f7ba8152d0b6dd6bf8efb9dfaf505ba7d8edee
2017-01-24 15:48:02 -08:00
Marco 8d0c8c5e6b vp9: Adjust some parameters in aq-mode=3 mode.
Increase the qp-delta, mainly for low resolutions,
excluding case of very low bitrates.

avgPSNR/SSIM gain of ~3-5% on rtc_derf set.
Small change on rtc set.

Change-Id: Ice03d04bd0340404d1957666ef154fd64fed0606
2017-01-24 14:18:02 -08:00
Jerome Jiang 8e8e2d11bf Merge "vp9: Copy partition using avg_source_sad." 2017-01-24 20:58:09 +00:00
Jerome Jiang ac1358cd56 vp9: Copy partition using avg_source_sad.
Affecting only speed 8.
Speed tests on Nexus 6 show 4% faster for QVGA and 2.4% faster for VGA.
Little/negligible quality regression observed on both rtc and rtc_derf sets.

Change-Id: I337f301a2db49a568d18ba7623160f7678399ae1
2017-01-24 10:31:22 -08:00
Yunqing Wang 91aa1fae2a Merge "Add the multi-threaded first pass encoder unit test" 2017-01-24 17:14:07 +00:00
Ranjit Kumar Tulabandu 75d2443bf0 Initialize errorperbit and sabperbit in ARNR filtering
(Yunqing)
This patch added the missing initialization in temporal filter.
Borg test BDRate results:
PSNR: -0.019%(lowres); -0.013%(hdres);
SSIM: -0.001%(lowres); -0.010%(hdres).
Other q values gave comparable but no better results.

Change-Id: I7ad0c18b39e6f558342688e2fe1e12fdb133ce9b
2017-01-24 08:58:17 -08:00
Kaustubh Raste 182ea677a0 Add mips msa vpx satd function
average improvement ~4x-5x

Change-Id: If8683d636fe2606d4ca1038e28185bca53bbe244
2017-01-24 10:44:22 +05:30
Jerome Jiang d82b9f62a9 Merge "vp9: Adjust the threshold to set avg_source_sad_sb flag." 2017-01-24 03:43:12 +00:00
Yunqing Wang b987bc36af Remove marco MVC in mcomp.c
Removed MVC so that mv_err_cost() is always called while calculating
the mv cost.

Change-Id: I28123e05fbfc2352128e266c985d2ab093940071
2017-01-23 17:03:12 -08:00
Jerome Jiang 40ffa2839f vp9: Adjust the threshold to set avg_source_sad_sb flag.
Affect only speed 8. Small/Negligible regression on rtc set.

Change-Id: I67a6b6b4008a22ed798bd980336d95bb799f64b4
2017-01-23 16:11:28 -08:00
Johann 270fadc135 PartialIDctTest: reduce number of RunQuantCheck iterations
This currently runs 1000 * 1000 = one *million* times which is quite
unnecessary. It's one of the slowest items in Jenkins and takes over an
hour for each of the larger transforms.

Change-Id: I01653b5e610683e1a2d778ec60cf5065562ab8db
2017-01-23 13:32:09 -08:00
Marco f38ed0c560 vp9: Non-rd pickmode: fix to add ARF mode entries to THR_MODES.
BUG=webm:1359

Change-Id: Ie0c66efa2e19d1ec9c744d14e3fa8f1e6214cdd6
2017-01-23 10:56:29 -08:00
Marco b71ff28a1a vp9: Small threshold adjustment to unittest BasicRateTargeting444
Due to recent change to speed >=7 from commit:219cdab.

Change-Id: I366e7750ec91119881050ff6c05849504c7959e8
2017-01-21 18:19:45 -08:00
Kaustubh Raste 881bef00c7 Merge "Add mips msa vpx hadamard functions" 2017-01-21 03:16:39 +00:00
Jerome Jiang f4169936ee Merge "vp9: Add feature to use block source_sad for realtime mode." 2017-01-20 20:35:07 +00:00
Marco 219cdab676 vp9: Add feature to use block source_sad for realtime mode.
Only for speed >= 7, and affects skipping of intra modes.
Threshold is set low for now, needs to be tuned.
Small/no difference in metrics on rtc clips.

Change-Id: If9bdbd43f08d1f80407cdd2e9e5e96780dcd2424
2017-01-20 11:57:02 -08:00
Yunqing Wang b0d8a75e48 Add the multi-threaded first pass encoder unit test
Added the multi-threaded first pass encoder unit test in VP9. The test is
to check if the new multi-threaded first pass encoder(namely, new-mt = 1)
still generates matching stats. In the unit test, the new-mt mode will be
turned on once the multi-threaded first pass implementation is checked in.

Change-Id: Ic21bb1a55c454f024cfd2b397a4c148cfe638218
2017-01-20 10:06:24 -08:00
James Zern b608c09781 tools_common.h: add missing ';' in generic branch
missed in:
380a26112 Fix compile warnings for target=armv7-android-gcc

Change-Id: I2820fff00858a19f7dcf6e0fff189d455b7d640f
2017-01-19 15:09:59 -08:00
Johann 13234d3c43 Remove neon assembly for idct 16x16 and 8x8
Tested using test/partial_idct_test.cc:DISABLED_Speed

Both gcc 4.9 and clang 3.8 from the r13 Android NDK offer improvements
using the intrinsics:
<function>    <clang asm> <gcc asm> <clang intrin> <gcc intrin>
idct16x16_256  1720ms      1703ms    1546ms         1554ms
idct16x16_10   1320ms      1247ms     518ms          488ms
idct16x16_1     107ms       108ms      64ms           68ms
idct8x8_64      924ms       931ms     866ms          989ms
idct8x8_12      826ms       824ms     519ms          514ms
idct8x8_1       172ms       166ms     110ms          125ms

idct8x8_64 isn't quite perfect (slight regression with gcc intrinsics)
but as a counter example idct16x16_10 goes from ~1300ms to ~500ms

On a sample clip, clang improved from 48.5 to 49fps and gcc stayed roughly
stable.

BUG=webm:1303

Change-Id: I9d4fd2b41b46ea6174a887b40a82c8e6e4769ed4
2017-01-19 12:27:31 -08:00
Marco 0f9760ab6f vp9: Modify usage of force_skip under low temporal variance in non-rd pickmode.
For short_circuit set to level 1, skip newmv for 64x64 blocks if the
low temporal variance flag is set. Also modify threshold for 64x64 split
in variance partitioning.

Overall speed-up on noisy clips of 2-4%.
Only affect speed >= 7.

Change-Id: I384b3772007e84de6f8707e480d2ddf1fe1f907d
2017-01-19 11:21:15 -08:00
Kaustubh Raste e0c0e65378 Add mips msa vpx hadamard functions
average improvement ~4x-5x

Change-Id: I167132d894c04fa85dda8dde7906ff9c61b3a65d
2017-01-19 14:44:03 +05:30
Jerome Jiang ee5b29ae30 vp9: Stop copying partition every a fixed number of frames.
Avoid quality loss when copying partition of superblock with large motions.
Maximum consecutively copied frames can be set (currently 5).

Change-Id: I11c30575514f02194c0f001444cf4021609e5049
2017-01-18 11:23:59 -08:00
Peter Boström e758f9d457 Merge "Add CSV per-frame stats to vpxdec." 2017-01-18 16:32:34 +00:00
James Zern 70c9b3c668 Merge "vp9_cx_iface,encoder_encode: check validate_img return" 2017-01-18 07:36:53 +00:00
Jerome Jiang 9152d434dc vp9: Disable partition copy when resizing is enabled.
Change-Id: I4fa3262e0f1c4018604c954b020ec5d1e3d1465c
2017-01-17 18:21:31 -08:00
Jerome Jiang 255866419d Merge "vp9: Set low variance flag when partition is copied." 2017-01-17 21:02:52 +00:00
Jerome Jiang 0c65aed099 vp9: Set low variance flag when partition is copied.
Also set the flag to 1 when exiting early by choosing a 64x64 block,
so that skipping new mv for golden works in these scenarios.

Change the size of prev_segment_id to the number of superblocks
to save memory.

Borg test shows quality regression of 0.012% on average PSNR
and 0.035% on SSIM.

Change-Id: I5014224c8617d439d35c66ece3fed9ae30b31d23
2017-01-17 11:14:50 -08:00
Johann Koenig add0587fae Merge "Cygwin x86_64 support." 2017-01-17 17:45:55 +00:00
Moriyoshi Koizumi 34be6057da Cygwin x86_64 support.
This should have been taken into account at 64347a10

Change-Id: Ie8e3ad7cbaab3e5799e04bd50f2639390b0a2428
2017-01-17 09:04:37 -08:00
Peter Boström a9ae351667 Add per-frame SSIM/PSNR stats to tools/tiny_ssim.
Adds an optional output framestats.csv file that prints comparisons
per-frame instead of averaged over the entire clip. It prints
per-channel and combined metrics for SSIM and PSNR.

Change-Id: Id28dfade27bc5775b59a9d83cfe8b37d1d52b686
2017-01-17 10:47:50 -05:00
Ranjit Kumar Tulabandu 5f21aba4b0 Fix to avoid abrupt relaxation of max qindex in recode path
The fix relaxes the max qindex based on the data from previous loop of
coding if output frame size is greater than maximum frame size allowed

Change-Id: Iac1f63ec67559d68766e090a7cbb80b812b2560f
2017-01-16 18:03:27 +05:30
James Zern c42a281439 vp9_cx_iface,encoder_encode: check validate_img return
before calling vp9_apply_encoding_flags() which may crash if the
resolution was invalid. this is the same change as:
c0523090b vp8e_encode: check validate_config return

BUG=https://bugzilla.mozilla.org/show_bug.cgi?id=1315288

Change-Id: Icd2aab322422e83d3a778fca6d7789e5000239d7
2017-01-13 16:53:03 -08:00
Marco 159cc3b33c vp9: Add speed feature flag for computing average source sad.
If enabled will compute source_sad for every superblock on every frame,
prior to encoding. Off by default, only on for speed=8 when
copy_partition is set.

Change-Id: Iab7903180a23dad369135e8234b7f896f20e1231
2017-01-13 11:52:12 -08:00
Marco Paniconi f217049dbe Merge "vp9: Adjust threshold for copy partiton, for speed=8." 2017-01-13 19:07:57 +00:00
Marco 47270b6858 vp9: Adjust threshold for copy partiton, for speed=8.
Change-Id: I4799cb2b67d911ee385e6d6992c61633ca77e69d
2017-01-13 10:29:31 -08:00
Jingning Han b6fe63a505 Merge "Rework 8x8 transpose SSSE3 for avg computation" 2017-01-13 18:25:17 +00:00
Jingning Han 553e9e291f Merge "Rework 8x8 transpose SSSE3 for inverse 2D-DCT" 2017-01-13 18:25:09 +00:00
Peter Boström a981cb2809 Add CSV per-frame stats to vpxdec.
Used with --framestats=file.csv. Currently prints raw codec QP (not
internal 0-63 range) and bytes per frame.

Change-Id: Ifbb90129c218dda869eaf5b810bad12a32ebd82d
2017-01-13 08:14:49 -05:00
Marco Paniconi 888bb6c133 Merge "vp9: Update threshold for partition copy." 2017-01-13 06:22:53 +00:00
Jerome Jiang 2ff2376fbc vp9: Update threshold for partition copy.
Avoid many visual artifacts. Compression quality is improved by more
than 1%. Encoding is about 4% faster for QVGA and 6% faster for VGA on
Android.

Change-Id: I4dd0a81429ddf7efdef1e80a191da5fb8de8e8af
2017-01-12 18:48:38 -08:00
Johann d630cda597 Merge remote-tracking branch 'origin/longtailedduck' 2017-01-12 15:40:14 -08:00
Jingning Han 39fff1bea0 Rework 8x8 transpose SSSE3 for avg computation
Use same transpose process as inv_txfm_sse2 does.

Change-Id: I2db05f0b254628a11f621c4c09abb89501ba6d3c
2017-01-12 15:16:07 -08:00
Jingning Han f65170ea84 Rework 8x8 transpose SSSE3 for inverse 2D-DCT
Use same transpose process as inv_txfm_sse2 does.

Change-Id: Ic4827825bd174cba57a0a80e19bf458a648e7d94
2017-01-12 15:13:18 -08:00
Peter Boström 297dfd8696 Add decoder getters for the last quantizer.
To be used for frame stats output of vpxdec.

Change-Id: I0739a01bd3635c4b3fedd58f3e27363ce8fb1b1e
2017-01-12 16:16:22 -05:00
Marco Paniconi baa4a290eb Merge "vp9: Make the denoiser work with spatial SVC." 2017-01-12 17:54:41 +00:00
Johann Koenig 3628975a15 Merge "Create a class for buffers used in tests" 2017-01-12 01:02:58 +00:00
Peter Boström 70ead526f1 Merge "Add Y,U,V channel metrics and unweighted metrics." 2017-01-11 21:05:47 +00:00
Jerome Jiang 6e07bf3634 Merge "vp9: Turn on the partition copy for speed 8. Tune threshold." 2017-01-11 20:50:43 +00:00
Johann Koenig 9f27d1f843 Merge "arm idct16x16: remove extra config guards" 2017-01-11 20:22:27 +00:00
Peter Boström 605ca82e7f Add Y,U,V channel metrics and unweighted metrics.
Renames SSIM to VpxSSIM as an upscaled weighted SSIM metric, then prints
Y, U and V channels unweighted as well as a weighted but not scaled SSIM
score that's 8/1/1 parts Y/U/V (same as VpxSSIM).

Change-Id: Iff800cc8f145314eeb1a9b4af1e11a25bec095ca
2017-01-11 14:51:18 -05:00
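The 8/1/1 weighting described in the commit above can be illustrated with a small sketch. The function name `weighted_ssim_811` is hypothetical and is not taken from libvpx. The sketch only assumes that the three per-plane SSIM scores are already available, and it applies no extra scaling to the result.

```c
/* Hypothetical helper, not a libvpx function: combine per-plane SSIM
 * scores using 8 parts Y, 1 part U and 1 part V. The result is a plain
 * weighted average with no additional scaling applied. */
double weighted_ssim_811(double ssim_y, double ssim_u, double ssim_v) {
  return (8.0 * ssim_y + ssim_u + ssim_v) / 10.0;
}
```

With identical scores on all three planes the weighted score equals that common value, and a change in the Y plane moves the result eight times as much as the same change in U or V.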
Jingning Han d98bd7cc5f Merge "Rework forward 8x8 2D-DCT ssse3 implementation" 2017-01-11 19:28:39 +00:00
Jerome Jiang f129e09529 vp9: Turn on the partition copy for speed 8. Tune threshold.
For speed 8, it speeds up the encoding on android by 6% for QVGA and
7.4% for VGA with the new threshold. Overall PSNR is improved by 0.667
for rtc.

Change-Id: I4a644560b32c0b5b4e9f49ffb953d000413a3732
2017-01-11 10:48:16 -08:00
Johann 68d0f46ec0 arm idct16x16: remove extra config guards
This file is guarded by HAVE_NEON_ASM in the .mk file now.

Change-Id: I513a621c234aa90ad52e426c8ed494d8a7d4b74a
2017-01-11 10:17:14 -08:00
Johann 6886da7547 Create a class for buffers used in tests
Demonstrate its use with the IDCT test.

Change-Id: Idf87fe048847c180f13818fd4df916ba4500134b
2017-01-11 08:28:39 -08:00
hui su 7a0bfa6ec6 Add "Large" label to VP9 target level tests
Also reduce the number of test frames.

Change-Id: Iea6fa93ca6b924535aef7bf8b388db4d0ec84c08
2017-01-10 17:29:43 -08:00
Marco 7e3a82c384 vp9: Make the denoiser work with spatial SVC.
If enabled, the denoiser will only denoise the top spatial layer for now.

Added unittest for SVC with denoising.

Change-Id: Ifa373771c4ecfa208615eb163cc38f1c22c6664b
2017-01-10 17:23:58 -08:00
Jingning Han 9a780fa7db Rework forward 8x8 2D-DCT ssse3 implementation
This commit reworks the SSSE3 implementation of the forward 8x8
2D-DCT. It uses a cyclic rotation approach to the temporary xmm
registers. It reduces the average cycles from 158 to 154. The SSE2
version uses 169 cycles.

Change-Id: I1b79b9642aae0ed3fb3cefb5b70246e6de5d5caa
2017-01-10 12:50:55 -08:00
Marco 91fc730d83 vp9: 1 pass cbr: Adjustments to usage of gf_cbr_boost and aq=3 mode.
When aq=3 mode is on and the gf_cbr_boost is set: make sure golden frame
is always refreshed, and don't incorporate segment cost in qp setting
on the boosted golden frame.

Better performance on RTC set with gf_cbr_boost on,
for example with gf_cbr_boost=50, gains from ~0.5-3%.

Change-Id: Ie811f5e4d444ff3320bd6e2c1745b2c4c09a8460
2017-01-10 09:42:06 -08:00
Jerome Jiang 299ef2f8eb Merge "vp9: Set less aggresive short_circuit_low_temp_var for HD at speed 8." 2017-01-10 00:51:09 +00:00
Jerome Jiang 198b834c97 vp9: Set less aggresive short_circuit_low_temp_var for HD at speed 8.
Quality improved by 1.866 and 0.386 for two noisy clips (dark720p and
marcooffice720p), respectively.

Change-Id: Ib33a7672ae9ca53da156208f7cd13f07b5543e44
2017-01-09 16:44:07 -08:00
Jerome Jiang 5b1a8ca5e8 Merge "Fix compile warnings for target=armv7-android-gcc" 2017-01-09 23:53:41 +00:00
James Zern 9480da21e8 Merge "Refine 8-bit 16x16 idct NEON intrinsics" 2017-01-09 23:52:29 +00:00
Marco Paniconi 62cce50d55 Merge "vp9: 1 pass cbr: Fix to qp clamping when gf_cbr_boost_pct is used." 2017-01-09 23:30:32 +00:00
Marco 35c4a13eb7 vp9: Fix comment in speed features.
Change-Id: I65d79c06b152922d725bf559adaa508f91cd5766
2017-01-09 13:05:31 -08:00
Marco bea22782e9 vp9: 1 pass cbr: Fix to qp clamping when gf_cbr_boost_pct is used.
Avoid the qp-clamping on gf/alt frame if gf_cbr_boost_pct is set.

This change only affects CBR mode when gf_cbr_boost_pct is set.

Change-Id: I0655ed4f2b047c8ed1ed33a070c17960ad776704
2017-01-09 12:52:50 -08:00
Johann Koenig 371a64bfe7 Merge "postproc: vpx_mbpost_proc_down_neon" 2017-01-09 19:53:15 +00:00
Johann c23970ec25 postproc: vpx_mbpost_proc_down_neon
This was much more amenable to optimization than the across filter.
Speedup of almost 2.5x

BUG=webm:1320

Change-Id: I49acc0f9cb2e7642303df90132cbc938acade4c4
2017-01-09 10:21:56 -08:00
Linfeng Zhang 6abdd31555 Refine 8-bit 16x16 idct NEON intrinsics
Speed test shows 25% gain on vpx_idct16x16_256_add_neon(),
and the speed of vpx_idct16x16_10_add_neon() tripled.

Change-Id: If8518d9b6a3efab74031297b8d40cd83c4a49541
2017-01-06 17:52:07 -08:00
Ranjit Kumar Tulabandu d3db846cc5 Fix for max qindex calculation of a gf interval
Calculation of active_worst_quality of a gf interval is modified
for coherency

BUG=webm:1355

Change-Id: I84cc2b47a8713f102a69419fb33ab020cffa3e71
2017-01-03 10:24:02 -08:00
Jerome Jiang 380a26112c Fix compile warnings for target=armv7-android-gcc
Fix compile warnings about implicit type conversion for
target=armv7-android-gcc in vpxenc.c.

BUG=webm:1348

Change-Id: I9fbabd843512f2a1a09f4bb934cd091e834eed9c
2016-12-22 14:56:20 -08:00
1214 changed files with 246526 additions and 117218 deletions
+1 -83
@@ -1,91 +1,9 @@
---
Language: Cpp
# BasedOnStyle: Google
# Generated with clang-format 3.8.1
AccessModifierOffset: -1
AlignAfterOpenBracket: Align
AlignConsecutiveAssignments: false
AlignConsecutiveDeclarations: false
AlignEscapedNewlinesLeft: true
AlignOperands: true
AlignTrailingComments: true
AllowAllParametersOfDeclarationOnNextLine: true
AllowShortBlocksOnASingleLine: false
BasedOnStyle: Google
AllowShortCaseLabelsOnASingleLine: true
AllowShortFunctionsOnASingleLine: All
AllowShortIfStatementsOnASingleLine: true
AllowShortLoopsOnASingleLine: true
AlwaysBreakAfterDefinitionReturnType: None
AlwaysBreakAfterReturnType: None
AlwaysBreakBeforeMultilineStrings: true
AlwaysBreakTemplateDeclarations: true
BinPackArguments: true
BinPackParameters: true
BraceWrapping:
AfterClass: false
AfterControlStatement: false
AfterEnum: false
AfterFunction: false
AfterNamespace: false
AfterObjCDeclaration: false
AfterStruct: false
AfterUnion: false
BeforeCatch: false
BeforeElse: false
IndentBraces: false
BreakBeforeBinaryOperators: None
BreakBeforeBraces: Attach
BreakBeforeTernaryOperators: true
BreakConstructorInitializersBeforeComma: false
ColumnLimit: 80
CommentPragmas: '^ IWYU pragma:'
ConstructorInitializerAllOnOneLineOrOnePerLine: false
ConstructorInitializerIndentWidth: 4
ContinuationIndentWidth: 4
Cpp11BracedListStyle: false
DerivePointerAlignment: false
DisableFormat: false
ExperimentalAutoDetectBinPacking: false
ForEachMacros: [ foreach, Q_FOREACH, BOOST_FOREACH ]
IncludeCategories:
- Regex: '^<.*\.h>'
Priority: 1
- Regex: '^<.*'
Priority: 2
- Regex: '.*'
Priority: 3
IndentCaseLabels: true
IndentWidth: 2
IndentWrappedFunctionNames: false
KeepEmptyLinesAtTheStartOfBlocks: false
MacroBlockBegin: ''
MacroBlockEnd: ''
MaxEmptyLinesToKeep: 1
NamespaceIndentation: None
ObjCBlockIndentWidth: 2
ObjCSpaceAfterProperty: false
ObjCSpaceBeforeProtocolList: false
PenaltyBreakBeforeFirstCallParameter: 1
PenaltyBreakComment: 300
PenaltyBreakFirstLessLess: 120
PenaltyBreakString: 1000
PenaltyExcessCharacter: 1000000
PenaltyReturnTypeOnItsOwnLine: 200
PointerAlignment: Right
ReflowComments: true
SortIncludes: false
SpaceAfterCStyleCast: false
SpaceBeforeAssignmentOperators: true
SpaceBeforeParens: ControlStatements
SpaceInEmptyParentheses: false
SpacesBeforeTrailingComments: 2
SpacesInAngles: false
SpacesInContainerLiterals: true
SpacesInCStyleCastParentheses: false
SpacesInParentheses: false
SpacesInSquareBrackets: false
Standard: Auto
TabWidth: 8
UseTab: Never
...
+2 -18
@@ -1,18 +1,2 @@
*.[chs] filter=fixtabswsp
*.[ch]pp filter=fixtabswsp
*.[ch]xx filter=fixtabswsp
*.asm filter=fixtabswsp
*.php filter=fixtabswsp
*.pl filter=fixtabswsp
*.sh filter=fixtabswsp
*.txt filter=fixwsp
[Mm]akefile filter=fixwsp
*.mk filter=fixwsp
*.rc -crlf
*.ds[pw] -crlf
*.bat -crlf
*.mmp -crlf
*.dpj -crlf
*.pjt -crlf
*.vcp -crlf
*.inf -crlf
configure eol=lf
*.sh eol=lf
+3
@@ -7,8 +7,10 @@
*.o
*~
.cproject
.idea
.project
.settings
.vscode
/*-*.mk
/*.asm
/*.doxy
@@ -20,6 +22,7 @@
/.install-*
/.libs
/Makefile
/arm_neon.h
/config.log
/config.mk
/docs/
+24 -3
@@ -1,37 +1,58 @@
Adrian Grange <agrange@google.com>
Aex Converse <aconverse@google.com>
Aex Converse <aconverse@google.com> <alex.converse@gmail.com>
Aex Converse <alexconv@twitch.tv>
Aex Converse <alexconv@twitch.tv> <aconverse@google.com>
Aex Converse <alexconv@twitch.tv> <alex.converse@gmail.com>
Alexis Ballier <aballier@gentoo.org> <alexis.ballier@gmail.com>
Alpha Lam <hclam@google.com> <hclam@chromium.org>
Angie Chiang <angiebird@google.com>
Bohan Li <bohanli@google.com>
Chris Cunningham <chcunningham@chromium.org>
Chi Yo Tsai <chiyotsai@google.com>
Daniele Castagna <dcastagna@chromium.org> <dcastagna@google.com>
Deb Mukherjee <debargha@google.com>
Elliott Karpilovsky <elliottk@google.com>
Erik Niemeyer <erik.a.niemeyer@intel.com> <erik.a.niemeyer@gmail.com>
Fyodor Kyslov <kyslov@google.com>
Gregor Jasny <gjasny@gmail.com>
Gregor Jasny <gjasny@gmail.com> <gjasny@googlemail.com>
Guillaume Martres <gmartres@google.com> <smarter3@gmail.com>
Hangyu Kuang <hkuang@google.com>
Hui Su <huisu@google.com>
Jacky Chen <jackychen@google.com>
Jim Bankoski <jimbankoski@google.com>
Johann Koenig <johannkoenig@google.com>
Johann Koenig <johannkoenig@google.com> <johannkoenig@dhcp-172-19-7-52.mtv.corp.google.com>
Johann Koenig <johannkoenig@google.com> <johann.koenig@duck.com>
Johann Koenig <johannkoenig@google.com> <johann.koenig@gmail.com>
Johann Koenig <johannkoenig@google.com> <johannkoenig@chromium.org>
Johann <johann@duck.com> <johann.koenig@gmail.com>
John Koleszar <jkoleszar@google.com>
Joshua Litt <joshualitt@google.com> <joshualitt@chromium.org>
Konstantinos Margaritis <konma@vectorcamp.gr> <konstantinos@vectorcamp.gr>
Marco Paniconi <marpan@google.com>
Marco Paniconi <marpan@google.com> <marpan@chromium.org>
Martin Storsjö <martin@martin.st>
Michael Horowitz <mhoro@webrtc.org> <mhoro@google.com>
Pascal Massimino <pascal.massimino@gmail.com>
Paul Wilkins <paulwilkins@google.com>
Peter Boström <pbos@chromium.org> <pbos@google.com>
Peter de Rivaz <peter.derivaz@gmail.com>
Peter de Rivaz <peter.derivaz@gmail.com> <peter.derivaz@argondesign.com>
Ralph Giles <giles@xiph.org> <giles@entropywave.com>
Ralph Giles <giles@xiph.org> <giles@mozilla.com>
Ronald S. Bultje <rsbultje@gmail.com> <rbultje@google.com>
Sai Deng <sdeng@google.com>
Sami Pietilä <samipietila@google.com>
Shiyou Yin <yinshiyou-hf@loongson.cn>
Tamar Levy <tamar.levy@intel.com>
Tamar Levy <tamar.levy@intel.com> <levytamar82@gmail.com>
Tero Rintaluoma <teror@google.com> <tero.rintaluoma@on2.com>
Timothy B. Terriberry <tterribe@xiph.org> <tterriberry@mozilla.com>
Tom Finegan <tomfinegan@google.com>
Tom Finegan <tomfinegan@google.com> <tomfinegan@chromium.org>
Urvang Joshi <urvang@google.com> <urvang@chromium.org>
Yaowu Xu <yaowu@google.com> <adam@xuyaowu.com>
Yaowu Xu <yaowu@google.com> <yaowu@xuyaowu.com>
Yaowu Xu <yaowu@google.com> <Yaowu Xu>
Venkatarama NG. Avadhani <venkatarama.avadhani@ittiam.com>
Vitaly Buka <vitalybuka@chromium.org> <vitlaybuka@chromium.org>
Xiwei Gu <guxiwei-hf@loongson.cn>
+89 -4
@@ -3,13 +3,15 @@
Aaron Watry <awatry@gmail.com>
Abo Talib Mahfoodh <ab.mahfoodh@gmail.com>
Adam Xu <adam@xuyaowu.com>
Adam B. Goode <adam.mckee84@gmail.com>
Adrian Grange <agrange@google.com>
Aex Converse <aconverse@google.com>
Ahmad Sharif <asharif@google.com>
Aidan Welch <aidansw@yahoo.com>
Aleksey Vasenev <margtu-fivt@ya.ru>
Alexander Potapenko <glider@google.com>
Alexander Voronov <avoronov@graphics.cs.msu.ru>
Alexandra Hájková <alexandra.khirnova@gmail.com>
Aex Converse <alexconv@twitch.tv>
Alexis Ballier <aballier@gentoo.org>
Alok Ahuja <waveletcoeff@gmail.com>
Alpha Lam <hclam@google.com>
@@ -17,17 +19,37 @@ A.Mahfoodh <ab.mahfoodh@gmail.com>
Ami Fischman <fischman@chromium.org>
Andoni Morales Alastruey <ylatuya@gmail.com>
Andres Mejia <mcitadel@gmail.com>
Andrew Lewis <andrewlewis@google.com>
Andrew Russell <anrussell@google.com>
Andrew Salkeld <andrew.salkeld@arm.com>
Angie Chen <yunqi@google.com>
Angie Chiang <angiebird@google.com>
Anton Venema <anton.venema@liveswitch.com>
Anupam Pandey <anupam.pandey@ittiam.com>
Aron Rosenberg <arosenberg@logitech.com>
Attila Nagy <attilanagy@google.com>
Birk Magnussen <birk.magnussen@googlemail.com>
Bohan Li <bohanli@google.com>
Brian Foley <bpfoley@google.com>
Brion Vibber <bvibber@wikimedia.org>
Casey Smalley <casey.smalley@arm.com>
changjun.yang <changjun.yang@intel.com>
Charles 'Buck' Krasic <ckrasic@google.com>
Cheng Chen <chengchen@google.com>
Chen Wang <wangchen20@iscas.ac.cn>
Cherma Rajan A <cherma.rajan@ittiam.com>
Chi Yo Tsai <chiyotsai@google.com>
chm <chm@rock-chips.com>
Chris Cunningham <chcunningham@chromium.org>
Christian Duvivier <cduvivier@google.com>
Chunbo Hua <chunbo.hua@intel.com>
Chun-Min Chang <chun.m.chang@gmail.com>
Clement Courbet <courbet@google.com>
Daniel Cheng <dcheng@chromium.org>
Daniele Castagna <dcastagna@chromium.org>
Daniel Kang <ddkang@google.com>
Daniel Sommermann <dcsommer@gmail.com>
Dan Zhu <zxdan@google.com>
Deb Mukherjee <debargha@google.com>
Deepa K G <deepa.kg@ittiam.com>
Dim Temp <dimtemp0@gmail.com>
@@ -35,28 +57,41 @@ Dmitry Kovalev <dkovalev@google.com>
Dragan Mrdjan <dmrdjan@mips.com>
Ed Baker <edward.baker@intel.com>
Ehsan Akhgari <ehsan.akhgari@gmail.com>
Elliott Karpilovsky <elliottk@google.com>
Erik Niemeyer <erik.a.niemeyer@intel.com>
Fabio Pedretti <fabio.ped@libero.it>
Frank Galligan <fgalligan@google.com>
Fredrik Söderquist <fs@opera.com>
Fritz Koenig <frkoenig@google.com>
Fyodor Kyslov <kyslov@google.com>
Gabriel Marin <gmx@chromium.org>
Gaute Strokkenes <gaute.strokkenes@broadcom.com>
George Steed <george.steed@arm.com>
Gerda Zsejke More <gerdazsejke.more@arm.com>
Geza Lore <gezalore@gmail.com>
Ghislain MARY <ghislainmary2@gmail.com>
Giuseppe Scrivano <gscrivano@gnu.org>
Gordana Cmiljanovic <gordana.cmiljanovic@imgtec.com>
Gregor Jasny <gjasny@gmail.com>
Guillaume Martres <gmartres@google.com>
Guillermo Ballester Valor <gbvalor@gmail.com>
Hangyu Kuang <hkuang@google.com>
Hanno Böck <hanno@hboeck.de>
Han Shen <shenhan@google.com>
Hao Chen <chenhao@loongson.cn>
Hari Limaye <hari.limaye@arm.com>
Harish Mahendrakar <harish.mahendrakar@ittiam.com>
Henrik Lundin <hlundin@google.com>
Hien Ho <hienho@google.com>
Hirokazu Honda <hiroh@chromium.org>
Hui Su <huisu@google.com>
Ilya Kurdyukov <jpegqs@gmail.com>
Ivan Krasin <krasin@chromium.org>
Ivan Maltz <ivanmaltz@google.com>
Jacek Caban <cjacek@gmail.com>
Jacky Chen <jackychen@google.com>
James Berry <jamesberry@google.com>
James Touton <bekenn@gmail.com>
James Yu <james.yu@linaro.org>
James Zern <jzern@google.com>
Jan Gerber <j@mailb.org>
@@ -66,16 +101,25 @@ Jean-Yves Avenard <jyavenard@mozilla.com>
Jeff Faust <jfaust@google.com>
Jeff Muizelaar <jmuizelaar@mozilla.com>
Jeff Petkau <jpet@chromium.org>
Jeremy Leconte <jleconte@google.com>
Jerome Jiang <jianj@google.com>
Jia Jia <jia.jia@linaro.org>
Jianhui Dai <jianhui.j.dai@intel.com>
Jian Zhou <zhoujian@google.com>
Jim Bankoski <jimbankoski@google.com>
jinbo <jinbo-hf@loongson.cn>
Jin Bo <jinbo@loongson.cn>
Jingning Han <jingning@google.com>
Joel Fernandes <joelaf@google.com>
Joey Parrish <joeyparrish@google.com>
Johann <johann@duck.com>
Johann Koenig <johannkoenig@google.com>
John Koleszar <jkoleszar@google.com>
Johnny Klonaris <google@jawknee.com>
John Stark <jhnstrk@gmail.com>
Jonathan Wright <jonathan.wright@arm.com>
Jon Kunkee <jkunkee@microsoft.com>
Jorge E. Moreira <jemoreira@google.com>
Joshua Bleecher Snyder <josh@treelinelabs.com>
Joshua Litt <joshualitt@google.com>
Julia Robson <juliamrobson@gmail.com>
@@ -83,27 +127,41 @@ Justin Clift <justin@salasaga.org>
Justin Lebar <justin.lebar@gmail.com>
Kaustubh Raste <kaustubh.raste@imgtec.com>
KO Myung-Hun <komh@chollian.net>
Konstantinos Margaritis <konma@vectorcamp.gr>
Kyle Siefring <kylesiefring@gmail.com>
Lawrence Velázquez <larryv@macports.org>
L. E. Segovia <amy@amyspark.me>
Linfeng Zhang <linfengz@google.com>
Liu Peng <pengliu.mail@gmail.com>
Lou Quillio <louquillio@google.com>
Luca Barbato <lu_zero@gentoo.org>
Luc Trudeau <luc@trud.ca>
Lu Wang <wanglu@loongson.cn>
Makoto Kato <makoto.kt@gmail.com>
Mans Rullgard <mans@mansr.com>
Marco Paniconi <marpan@google.com>
Mark Mentovai <mark@chromium.org>
Martin Ettl <ettl.martin78@googlemail.com>
Martin Storsjo <martin@martin.st>
Martin Storsjö <martin@martin.st>
Matthew Heaney <matthewjheaney@chromium.org>
Matthias Räncker <theonetruecamper@gmx.de>
Michael Horowitz <mhoro@webrtc.org>
Michael Kohler <michaelkohler@live.com>
Mike Frysinger <vapier@chromium.org>
Mike Hommey <mhommey@mozilla.com>
Mikhal Shemer <mikhal@google.com>
Mikko Koivisto <mikko.koivisto@unikie.com>
Min Chen <chenm003@gmail.com>
Minghai Shang <minghai@google.com>
Min Ye <yeemmi@google.com>
Mirko Bonadei <mbonadei@google.com>
Moriyoshi Koizumi <mozo@mozo.jp>
Morton Jonuschat <yabawock@gmail.com>
Nathan E. Egge <negge@mozilla.com>
Neeraj Gadgil <neeraj.gadgil@ittiam.com>
Neil Birkbeck <neil.birkbeck@gmail.com>
Nico Weber <thakis@chromium.org>
Niveditha Rau <niveditha.rau@gmail.com>
Parag Salasakar <img.mips1@gmail.com>
Pascal Massimino <pascal.massimino@gmail.com>
Patrik Westin <patrik.westin@gmail.com>
@@ -111,29 +169,45 @@ Paul Wilkins <paulwilkins@google.com>
Pavol Rusnak <stick@gk2.sk>
Paweł Hajdan <phajdan@google.com>
Pengchong Jin <pengchong@google.com>
Peter Boström <pbos@google.com>
Peter Boström <pbos@chromium.org>
Peter Collingbourne <pcc@chromium.org>
Peter de Rivaz <peter.derivaz@gmail.com>
Peter Kasting <pkasting@chromium.org>
Philip Jägenstedt <philipj@opera.com>
Priit Laes <plaes@plaes.org>
Rafael Ávila de Espíndola <rafael.espindola@gmail.com>
Rafaël Carré <funman@videolan.org>
Rafael de Lucena Valle <rafaeldelucena@gmail.com>
Rahul Chaudhry <rahulchaudhry@google.com>
Ralph Giles <giles@xiph.org>
Ranjit Kumar Tulabandu <ranjit.tulabandu@ittiam.com>
Raphael Kubo da Costa <raphael.kubo.da.costa@intel.com>
Ravi Chaudhary <ravi.chaudhary@ittiam.com>
Ritu Baldwa <ritu.baldwa@ittiam.com>
Rob Bradford <rob@linux.intel.com>
Ronald S. Bultje <rsbultje@gmail.com>
Rui Ueyama <ruiu@google.com>
Sai Deng <sdeng@google.com>
Salome Thirot <salome.thirot@arm.com>
Sami Pietilä <samipietila@google.com>
Sam James <sam@gentoo.org>
Sarah Parker <sarahparker@google.com>
Sasi Inguva <isasi@google.com>
Scott Graham <scottmg@chromium.org>
Scott LaVarnway <slavarnway@google.com>
Sean McGovern <gseanmcg@gmail.com>
Sergey Kolomenkin <kolomenkin@gmail.com>
Sergey Silkin <ssilkin@google.com>
Sergey Ulanov <sergeyu@chromium.org>
Shimon Doodkin <helpmepro1@gmail.com>
Shiyou Yin <yinshiyou-hf@loongson.cn>
Shubham Tandle <shubham.tandle@ittiam.com>
Shunyao Li <shunyaoli@google.com>
Sreerenj Balachandran <bsreerenj@gmail.com>
Stefan Holmer <holmer@google.com>
Suman Sunkara <sunkaras@google.com>
Supradeep T R <supradeep.tr@ittiam.com>
Sylvestre Ledru <sylvestre@mozilla.com>
Taekhyun Kim <takim@nvidia.com>
Takanori MATSUURA <t.matsuu@gmail.com>
Tamar Levy <tamar.levy@intel.com>
@@ -145,13 +219,24 @@ Timothy B. Terriberry <tterribe@xiph.org>
Tom Finegan <tomfinegan@google.com>
Tristan Matthews <le.businessman@gmail.com>
Urvang Joshi <urvang@google.com>
Venkatarama NG. Avadhani <venkatarama.avadhani@ittiam.com>
Vignesh Venkatasubramanian <vigneshv@google.com>
Vitaly Buka <vitalybuka@chromium.org>
Vlad Tsyrklevich <vtsyrklevich@chromium.org>
Wan-Teh Chang <wtc@google.com>
Wonkap Jang <wonkap@google.com>
Xiahong Bao <xiahong.bao@nxp.com>
Xiwei Gu <guxiwei-hf@loongson.cn>
Yaowu Xu <yaowu@google.com>
Yi Luo <luoyi@google.com>
Yongzhe Wang <yongzhe@google.com>
yuanhecai <yuanhecai@loongson.cn>
Yue Chen <yuec@google.com>
Yun Liu <yliuyliu@google.com>
Yunqing Wang <yunqingwang@google.com>
Yury Gitman <yuryg@google.com>
Zoe Liu <zoeliu@google.com>
Zoltan Kuscsik <zoltan@s57.io>
Google Inc.
The Mozilla Foundation
The Xiph.Org Foundation
+444 -1
@@ -1,3 +1,446 @@
2025-01-09 v1.15.1 "Wigeon Duck"
This release bumps up the SO major version and fixes the language about ABI
compatibility in the previous release changelog.
2024-10-22 v1.15.0 "Wigeon Duck"
This release includes new codec control for key frame filtering, more Neon
optimizations, improvements to RTC encoding and bug fixes.
- Upgrading:
This release is ABI incompatible with the previous release.
It is strongly recommended to skip this release and upgrade to v1.15.1 since
the shared object was versioned incorrectly, as shown in
https://issues.webmproject.org/issues/384672478.
Temporal filtering improvement that can be turned on with the new codec
control VP9E_SET_KEY_FRAME_FILTERING, which gives 1+% BD-rate saving with
minimal encoder time increase.
libwebm is upgraded to libwebm-1.0.0.31-10-g3b63004
- Enhancement:
Neon optimization speed up
1-3% speed up across speed 5 to 10 for RTC
3% speed up for speed 0 and 1 for VoD in standard bitdepth
3% and 7% speed up for speed 0 and 1 respectively for VoD in high bitdepth
Scene detection is allowed for all RTC speeds (>=5)
Support profile guided optimizations
Delta quantization parameters for UV channels for vp8 is supported in RTC
rate control library
Rate control parameters are reset and maximum QP is enforced on scene
changes in SVC when there is no inter-layer prediction
- Bug fixes:
Fix to Uninitialized scalar variable in `vp9_rd_pick_inter_mode_sb()`
Fix to Integer-overflow in `resize_multistep`
Fix to Heap-buffer-overflow in `vpx_sad64x64_avx2`
Fix to Crash in `vpx_sad8x8_sse2`
Fix to Assertion in `write_modes`
Support profile guided optimizations
Fix to Integer-overflow in `encode_frame_to_data_rate`
Fix to Integer-overflow in `vp9_svc_check_reset_layer_rc_flag`
Fix to core dump error from /usr/bin/tools/tiny_ssim --help
Fix to use-of-uninitialized-value in `vp9_setup_tpl_stats`
Fix to Undefined-shift in `vp9_cyclic_refresh_setup`
Fix to redundant `&& __GNUC__` preproc check
Fix to valgrind warning in EncodeAPI.OssFuzz69906
Fix to Index-out-of-bounds in `vp8_rd_pick_inter_mode`
Fix to Integer-overflow in `vp8_pick_frame_size`
Fix to Use-of-uninitialized-value in `vpx_codec_peek_stream_info`
Fix to log clutters with the message "Warning: Desired height too large"
Fix to Integer-overflow in `vp9_svc_adjust_avg_frame_qindex`
Fix to integer overflows caused by huge target bitrate, frame rate, or
g_timebase numerator or denominator
Fix to missing license headers
Fix to build failure for Android Armv7
Fix to integer overflows in image helpers
Fix to Integer-overflow in `vp9_calc_iframe_target_size_one_pass_cbr`
Fix to Heap-buffer-overflow in `vp9_pick_inter_mode`
Fix to Segv in `vp9_multi_thread_tile_init`
Fix to Use-of-uninitialized-value in `vp9_row_mt_sync_mem_dealloc`
Fix to Crash in `mbloop_filter_vertical_edge_c`
Fix to Check failed in CheckUnwind
Fix to Heap-buffer-overflow in `write_modes_b` and `vpx_write`
Fix to Possible signed integer overflow found in `vpx_codec_encode`
Fix to build conflicts between Abseil and libaom/libvpx in Win ARM64 builds
Fix to build failures on aarch64
Fix to Data race in libvpx ARM NEON
Fix to Heap-buffer-overflow in `scale_plane_1_to_2_phase_0`
Fix to integer overflow in `encode_mb_row`
Fix to Floating-point-exception in `vp8_pick_frame_size`
Fix to Heap-buffer-overflow in `vp9_enc_setup_mi`
Fix to build failure with --target=arm64-win64-vs17
Fix to heap-buffer-overflow write in `vpx_img_read()`
Fix to C vs armv8-linux-gcc encode mismatches for `y4m_360p_10bit_input`
Fix to Null-dereference READ in `ml_predict_var_rd_partitioning`
Fix to Heap-buffer-overflow in `vpx_scaled_2d_ssse3`
Fix to Crash in `convolve_horiz`
Fix to Ill in `vpx_scaled_2d_ssse3`
Fix to Global-buffer-overflow in `cost_coeffs`
2024-05-21 v1.14.1 "Venetian Duck"
This release includes enhancements and bug fixes.
- Upgrading:
This release is ABI compatible with the previous release.
- Enhancement:
Improved the detection of compiler support for AArch64 extensions,
particularly SVE.
Added vpx_codec_get_global_headers() support for VP9.
- Bug fixes:
Added buffer bounds checks to vpx_writer and vpx_write_bit_buffer.
Fix to GetSegmentationData() crash in aq_mode=0 for RTC rate control.
Fix to alloc for row_base_thresh_freq_fac.
Free row mt memory before freeing cpi->tile_data.
Fix to buffer alloc for vp9_bitstream_worker_data.
Fix to VP8 race issue for multi-thread with pnsr_calc.
Fix to uv width/height in vp9_scale_and_extend_frame_ssse3.
Fix to integer division by zero and overflow in calc_pframe_target_size().
Fix to integer overflow in vpx_img_alloc() & vpx_img_wrap()(CVE-2024-5197).
Fix to UBSan error in vp9_rc_update_framerate().
Fix to UBSan errors in vp8_new_framerate().
Fix to integer overflow in vp8 encodeframe.c.
Handle EINTR from sem_wait().
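The last item in the v1.14.1 list, handling EINTR from sem_wait(), refers to a common POSIX idiom. The sketch below shows that general idiom and is not the actual libvpx change. The wrapper name `sem_wait_retry` is hypothetical.

```c
#include <errno.h>
#include <semaphore.h>

/* Hypothetical wrapper, not libvpx code: wait on a POSIX semaphore and
 * retry when the call is interrupted by a signal handler (EINTR).
 * Returns 0 on success, or -1 with errno set for any other failure. */
int sem_wait_retry(sem_t *sem) {
  int ret;
  do {
    ret = sem_wait(sem);
  } while (ret == -1 && errno == EINTR);
  return ret;
}
```

A caller would use `sem_wait_retry(&sem)` wherever it previously called `sem_wait(&sem)` directly, so that a signal arriving in a worker thread does not get mistaken for a successful wait.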
2024-01-02 v1.14.0 "Venetian Duck"
This release drops support for old C compilers, such as Visual Studio 2012
and older, that disallow mixing variable declarations and statements (a C99
feature). It adds support for run-time CPU feature detection for Arm
platforms, as well as support for darwin23 (macOS 14).
- Upgrading:
This release is ABI incompatible with the previous release.
Various new features for rate control library for real-time: SVC parallel
encoding, loopfilter level, support for frame dropping, and screen content.
New callback function send_tpl_gop_stats for vp9 external rate control
library, which can be used to transmit TPL stats for a group of pictures. A
public header vpx_tpl.h is added for the definition of TPL stats used in
this callback.
libwebm is upgraded to libwebm-1.0.0.29-9-g1930e3c.
- Enhancement:
Improvements on Neon optimizations: VoD: 12-35% speed up for bitdepth 8,
68%-151% speed up for high bitdepth.
Improvements on AVX2 and SSE optimizations.
Improvements on LSX optimizations for LoongArch.
42-49% speedup on speed 0 VoD encoding.
Android API level predicates.
- Bug fixes:
Fix to missing prototypes from the rtcd header.
Fix to segfault when total size is enlarged but width is smaller.
Fix to the build for arm64ec using MSVC.
Fix to copy BLOCK_8X8's mi to PICK_MODE_CONTEXT::mic.
Fix to -Wshadow warnings.
Fix to heap overflow in vpx_get4x4sse_cs_neon.
Fix to buffer overrun in highbd Neon subpel variance filters.
Added bitexact encode test script.
Fix to -Wl,-z,defs with Clang's sanitizers.
Fix to decoder stability after error & continued decoding.
Fix to mismatch of VP9 encode with NEON intrinsics with C only version.
Fix to Arm64 MSVC compile vpx_highbd_fdct4x4_neon.
Fix to fragments count before use.
Fix to a case where target bandwidth is 0 for SVC.
Fix mask in vp9_quantize_avx2,highbd_get_max_lane_eob.
Fix to int overflow in vp9_calc_pframe_target_size_one_pass_cbr.
Fix to integer overflow in vp8,ratectrl.c.
Fix to integer overflow in vp9 svc.
Fix to avg_frame_bandwidth overflow.
Fix to per frame qp for temporal layers.
Fix to unsigned integer overflow in sse computation.
Fix to uninitialized mesh feature for BEST mode.
Fix to overflow in highbd temporal_filter.
Fix to unaligned loads w/w==4 in vpx_convolve_copy_neon.
Skip arm64_neon.h workaround w/VS >= 2019.
Fix to c vs avx mismatch of diamond_search_sad().
Fix to c vs intrinsic mismatch of vpx_hadamard_32x32() function.
Fix to a bug in vpx_hadamard_32x32_neon().
Fix to Clang -Wunreachable-code-aggressive warnings.
Fix to a bug in vpx_highbd_hadamard_32x32_neon().
Fix to -Wunreachable-code in mfqe_partition.
Force mode search on 64x64 if no mode is selected.
Fix to ubsan failure caused by left shift of negative.
Fix to integer overflow in calc_pframe_target_size.
Fix to float-cast-overflow in vp8_change_config().
Fix to a null ptr before use.
Conditionally skip using inter frames in speed features.
Remove invalid reference frames.
Disable intra mode search speed features conditionally.
Set nonrd keyframe under dynamic change of deadline for rtc.
Fix to scaled reference offsets.
Set skip_recode=0 in nonrd_pick_sb_modes.
Fix to an edge case when downsizing to one.
Fix to a bug in frame scaling.
Fix to pred buffer stride.
Fix to a bug in simple motion search.
Update frame size in actual encoding.
2023-09-29 v1.13.1 "Ugly Duckling"
This release contains two security related fixes. One each for VP8 and VP9.
- Upgrading:
This release is ABI compatible with the previous release.
- Bug fixes:
https://crbug.com/1486441 (CVE-2023-5217)
Fix to a crash related to VP9 encoding (#1642, CVE-2023-6349)
2023-01-31 v1.13.0 "Ugly Duckling"
This release includes more Neon and AVX2 optimizations, adds a new codec
control to set per frame QP, upgrades GoogleTest to v1.12.1, and includes
numerous bug fixes.
- Upgrading:
This release is ABI incompatible with the previous release.
New codec control VP9E_SET_QUANTIZER_ONE_PASS to set per frame QP.
GoogleTest is upgraded to v1.12.1.
.clang-format is upgraded to clang-format-11.
VPX_EXT_RATECTRL_ABI_VERSION was bumped due to incompatible changes to the
feature of using external rate control models for vp9.
- Enhancement:
Numerous improvements on Neon optimizations.
Numerous improvements on AVX2 optimizations.
Additional ARM targets added for Visual Studio.
- Bug fixes:
Fix to calculating internal stats when frame dropped.
Fix to segfault for external resize test in vp9.
Fix to build system with replacing egrep with grep -E.
Fix to a few bugs with external RTC rate control library.
Fix to make SVC work with VBR.
Fix to key frame setting in VP9 external RC.
Fix to -Wimplicit-int (Clang 16).
Fix to VP8 external RC for buffer levels.
Fix to VP8 external RC for dynamic update of layers.
Fix to VP9 auto level.
Fix to off-by-one error of max w/h in validate_config.
Fix to make SVC work for Profile 1.
2022-06-17 v1.12.0 "Torrent Duck"
This release adds optimizations for Loongarch, adds support for vp8 in the
real-time rate control library, upgrades GoogleTest to v1.11.0, updates
libwebm to libwebm-1.0.0.28-20-g206d268, and includes numerous bug fixes.
- Upgrading:
This release is ABI compatible with the previous release.
vp8 support in the real-time rate control library.
New codec control VP8E_SET_RTC_EXTERNAL_RATECTRL is added.
Configure support for darwin21 is added.
GoogleTest is upgraded to v1.11.0.
libwebm is updated to libwebm-1.0.0.28-20-g206d268.
Allow SimpleEncode environment to take target level as input to match
the level conformance in vp9.
- Enhancement:
Numerous improvements on checking memory allocations.
Optimizations for Loongarch.
Code clean-up.
- Bug fixes:
Fix to a crash related to {vp8/vp9}_set_roi_map.
Fix to compiling failure with -Wformat-nonliteral.
Fix to integer overflow with vp9 with high resolution content.
Fix to AddNoiseTest failure with ARMv7.
Fix to libvpx Null-dereference READ in vp8.
2021-09-27 v1.11.0 "Smew Duck"
This maintenance release adds support for VBR mode in VP9 rate control
interface, new codec controls to get quantization parameters and loop filter
levels, and includes several improvements to NEON and numerous bug fixes.
- Upgrading:
This release is ABI incompatible with the previous release.
New codec control is added to get quantization parameters and loop filter
levels.
VBR mode is supported in VP9 rate control library.
- Enhancement:
Numerous improvements for Neon optimizations.
Code clean-up and refactoring.
Calculation of rd multiplier is changed with BDRATE gains.
- Bug fixes:
Fix to overflow on duration.
Fix to several instances of -Wunused-but-set-variable.
Fix to avoid chroma resampling for 420mpeg2 input.
Fix to overflow in calc_iframe_target_size.
Fix to disallow skipping transform and quantization.
Fix some -Wsign-compare warnings in simple_encode.
Fix input file path in simple_encode_test.
Fix valid range for under/over_shoot pct.
2021-03-09 v1.10.0 "Ruddy Duck"
This maintenance release adds support for darwin20 and new codec controls, as
well as numerous bug fixes.
- Upgrading:
This release is ABI incompatible with the previous release.
New codec control is added to disable loopfilter for VP9.
New encoder control is added to disable feature to increase Q on overshoot
detection for CBR.
Configure support for darwin20 is added.
New codec control is added for VP9 rate control. The control ID of this
interface is VP9E_SET_EXTERNAL_RATE_CONTROL. To make VP9 use a customized
external rate control model, users will have to implement each callback
function in vpx_rc_funcs_t and register them using libvpx API
vpx_codec_control_() with the control ID.
- Enhancement:
Use -std=gnu++11 instead of -std=c++11 for c++ files.
- Bug fixes:
Override assembler with --as option of configure for MSVS.
Fix several compilation issues with gcc 4.8.5.
Fix to resetting rate control for temporal layers.
Fix to the rate control stats of SVC example encoder when number of spatial
layers is 1.
Fix to reusing motion vectors from the base spatial layer in SVC.
2 pass related flags removed from SVC example encoder.
2020-07-29 v1.9.0 "Quacking Duck"
This release adds support for NV12, a separate library for rate control, as
well as incremental improvements.
- Upgrading:
This release is ABI compatible with the previous release.
NV12 support is added to this release.
A new interface is added for VP9 rate control. The new library libvp9rc.a
must be linked by applications.
Googletest is updated to v1.10.0.
simple_encode.cc is compiled into a new library libsimple_encode.a with
CONFIG_RATE_CTRL.
- Enhancement:
Various changes to improve VP9 SVC, rate control, quality and speed to real
time encoding.
- Bug fixes:
Fix key frame update refresh simulcast flexible svc.
Fix to disable_16x16part speed feature for real time encoding.
Fix some signed integer overflows for VP9 rate control.
Fix initialization of delta_q_uv.
Fix condition in regulate_q for cyclic refresh.
Various fixes to dynamic resizing for VP9 SVC.
2019-12-09 v1.8.2 "Pekin Duck"
This release collects incremental improvements to many aspects of the library.
- Upgrading:
This release is ABI compatible with the previous release.
ARCH_* defines have been removed in favor of VPX_ARCH_*.
2019-07-15 v1.8.1 "Orpington Duck"
This release collects incremental improvements to many aspects of the library.
- Upgrading:
This release is ABI incompatible with the previous release.
VP8E_SET_CPUUSED now accepts values up to 9 for vp9.
VPX_CTRL_VP9E_SET_MAX_INTER_BITRATE_PCT had a spelling fix (was VP8E).
The --sdk-path option has been removed. If you were using it to build for
Android please read build/make/Android.mk for alternatives.
All PPC optimizations have been disabled:
https://bugs.chromium.org/p/webm/issues/detail?id=1522.
- Enhancements:
Various changes to improve encoder rate control, quality and speed
for practically every use case.
- Bug fixes:
vp9-rtc: Fix color artifacts for speed >= 8.
2019-01-31 v1.8.0 "Northern Shoveler Duck"
This release focused on encoding performance for realtime and VOD use cases.
- Upgrading:
This release is ABI incompatible with the previous release. This adds and
improves several vp9 controls. Most are related to SVC:
VP9E_SET_SVC_FRAME_DROP_LAYER:
- Frame dropping in SVC.
VP9E_SET_SVC_INTER_LAYER_PRED:
- Inter-layer prediction in SVC.
VP9E_SET_SVC_GF_TEMPORAL_REF:
- Enable long term temporal reference in SVC.
VP9E_SET_SVC_REF_FRAME_CONFIG/VP9E_GET_SVC_REF_FRAME_CONFIG:
- Extend and improve this control for better flexibility in setting SVC
pattern dynamically.
VP9E_SET_POSTENCODE_DROP:
- Allow for post-encode frame dropping (applies to non-SVC too).
VP9E_SET_SVC_SPATIAL_LAYER_SYNC:
- Enable spatial layer sync frames.
VP9E_SET_SVC_LAYER_ID:
- Extend api to specify temporal id for each spatial layer.
VP9E_SET_ROI_MAP:
- Extend Region of Interest functionality to VP9.
- Enhancements:
2 pass vp9 encoding has improved substantially. When using --auto-alt-ref=6,
we see approximately 8% for VBR and 10% for CQ. When using --auto-alt-ref=1,
the gains are approximately 4% for VBR and 5% for CQ.
For real-time encoding, speed 7 has improved by ~5-10%. Encodes targeted at
screen sharing have improved when the content changes significantly (slide
sharing) or scrolls. There is a new speed 9 setting for mobile devices which
is about 10-20% faster than speed 8.
- Bug fixes:
VP9 denoiser issue.
VP9 partition issue for 1080p.
VP9 rate control improvements.
Postprocessing Multi Frame Quality Enhancement (MFQE) issue.
VP8 multithread decoder issues.
A variety of fuzzing issues.
2018-01-04 v1.7.0 "Mandarin Duck"
This release focused on high bit depth performance (10/12 bit) and vp9
encoding improvements.
- Upgrading:
This release is ABI incompatible due to new vp9 encoder features.
Frame parallel decoding for vp9 has been removed.
- Enhancements:
vp9 encoding supports additional threads with --row-mt. This can be greater
than the number of tiles.
Two new vp9 encoder options have been added:
--corpus-complexity
--tune-content=film
Additional tooling for respecting the vp9 "level" profiles has been added.
- Bug fixes:
A variety of fuzzing issues.
vp8 threading fix for ARM.
Codec control VP9_SET_SKIP_LOOP_FILTER fixed.
Reject invalid multi resolution configurations.
2017-01-09 v1.6.1 "Long Tailed Duck"
This release improves upon the VP9 encoder and speeds up the encoding and
decoding processes.
@@ -272,7 +715,7 @@
of particular interest to real time streaming applications.
Temporal scalability allows the encoder to produce a stream that can
be decimated to different frame rates, with independent rate targetting
be decimated to different frame rates, with independent rate targeting
for each substream.
Multiframe quality enhancement postprocessing can make visual quality
+29
@@ -0,0 +1,29 @@
# How to Contribute
We'd love to accept your patches and contributions to this project. There are
just a few small guidelines you need to follow.
## Contributor License Agreement
Contributions to this project must be accompanied by a Contributor License
Agreement. You (or your employer) retain the copyright to your contribution;
this simply gives us permission to use and redistribute your contributions as
part of the project. Head over to <https://cla.developers.google.com/> to see
your current agreements on file or to sign a new one.
You generally only need to submit a CLA once, so if you've already submitted one
(even if it was for a different project), you probably don't need to do it
again.
## Code reviews
All submissions, including submissions by project members, require review. We
use a [Gerrit](https://www.gerritcodereview.com) instance hosted at
https://chromium-review.googlesource.com for this purpose. See the
[WebM Project page](https://www.webmproject.org/code/contribute/submitting-patches/)
for additional details.
## Community Guidelines
This project follows
[Google's Open Source Community Guidelines](https://opensource.google.com/conduct/).
+104 -23
@@ -1,5 +1,3 @@
README - 9 January 2017
Welcome to the WebM VP8/VP9 Codec SDK!
COMPILING THE APPLICATIONS/LIBRARIES:
@@ -9,22 +7,27 @@ COMPILING THE APPLICATIONS/LIBRARIES:
1. Prerequisites
* All x86 targets require the Yasm[1] assembler be installed.
* All Windows builds require that Cygwin[2] be installed.
* Building the documentation requires Doxygen[3]. If you do not
* All x86 targets require the Yasm[1] assembler be installed[2].
* All Windows builds require that Cygwin[3] or MSYS2[4] be installed.
* Building the documentation requires Doxygen[5]. If you do not
have this package, the install-docs option will be disabled.
* Downloading the data for the unit tests requires curl[4] and sha1sum.
* Downloading the data for the unit tests requires curl[6] and sha1sum.
sha1sum is provided via the GNU coreutils, installed by default on
many *nix platforms, as well as MinGW and Cygwin. If coreutils is not
available, a compatible version of sha1sum can be built from
source[5]. These requirements are optional if not running the unit
source[7]. These requirements are optional if not running the unit
tests.
[1]: http://www.tortall.net/projects/yasm
[2]: http://www.cygwin.com
[3]: http://www.doxygen.org
[4]: http://curl.haxx.se
[5]: http://www.microbrew.org/tools/md5sha1sum/
[2]: For Visual Studio the base yasm binary (not vsyasm) should be in the
PATH for Visual Studio. For VS2017 it is sufficient to rename
yasm-<version>-<arch>.exe to yasm.exe and place it in:
Program Files (x86)/Microsoft Visual Studio/2017/<level>/Common7/Tools/
[3]: http://www.cygwin.com
[4]: http://www.msys2.org/
[5]: http://www.doxygen.org
[6]: http://curl.haxx.se
[7]: http://www.microbrew.org/tools/md5sha1sum/
2. Out-of-tree builds
Out of tree builds are a supported method of building the application. For
@@ -41,7 +44,16 @@ COMPILING THE APPLICATIONS/LIBRARIES:
used to get a list of supported options:
$ ../libvpx/configure --help
4. Cross development
4. Compiler analyzers
Compilers have added sanitizers which instrument binaries with information
about address calculation, memory usage, threading, undefined behavior, and
other common errors. To simplify building libvpx with some of these features
use tools/set_analyzer_env.sh before running configure. It will set the
compiler and necessary flags for building as well as environment variables
read by the analyzer when testing the binaries.
$ source ../libvpx/tools/set_analyzer_env.sh address
5. Cross development
For cross development, the most notable option is the --target option. The
most up-to-date list of supported targets can be found at the bottom of the
--help output of the configure script. As of this writing, the list of
@@ -49,19 +61,35 @@ COMPILING THE APPLICATIONS/LIBRARIES:
arm64-android-gcc
arm64-darwin-gcc
arm64-darwin20-gcc
arm64-darwin21-gcc
arm64-darwin22-gcc
arm64-darwin23-gcc
arm64-darwin24-gcc
arm64-linux-gcc
arm64-win64-gcc
arm64-win64-vs15
arm64-win64-vs16
arm64-win64-vs16-clangcl
arm64-win64-vs17
arm64-win64-vs17-clangcl
armv7-android-gcc
armv7-darwin-gcc
armv7-linux-rvct
armv7-linux-gcc
armv7-none-rvct
armv7-win32-vs11
armv7-win32-vs12
armv7-win32-gcc
armv7-win32-vs14
armv7-win32-vs15
armv7-win32-vs16
armv7-win32-vs17
armv7s-darwin-gcc
armv8-linux-gcc
loongarch32-linux-gcc
loongarch64-linux-gcc
mips32-linux-gcc
mips64-linux-gcc
ppc64le-linux-gcc
sparc-solaris-gcc
x86-android-gcc
x86-darwin8-gcc
@@ -74,16 +102,18 @@ COMPILING THE APPLICATIONS/LIBRARIES:
x86-darwin13-gcc
x86-darwin14-gcc
x86-darwin15-gcc
x86-darwin16-gcc
x86-darwin17-gcc
x86-iphonesimulator-gcc
x86-linux-gcc
x86-linux-icc
x86-os2-gcc
x86-solaris-gcc
x86-win32-gcc
x86-win32-vs10
x86-win32-vs11
x86-win32-vs12
x86-win32-vs14
x86-win32-vs15
x86-win32-vs16
x86-win32-vs17
x86_64-android-gcc
x86_64-darwin9-gcc
x86_64-darwin10-gcc
@@ -92,15 +122,24 @@ COMPILING THE APPLICATIONS/LIBRARIES:
x86_64-darwin13-gcc
x86_64-darwin14-gcc
x86_64-darwin15-gcc
x86_64-darwin16-gcc
x86_64-darwin17-gcc
x86_64-darwin18-gcc
x86_64-darwin19-gcc
x86_64-darwin20-gcc
x86_64-darwin21-gcc
x86_64-darwin22-gcc
x86_64-darwin23-gcc
x86_64-darwin24-gcc
x86_64-iphonesimulator-gcc
x86_64-linux-gcc
x86_64-linux-icc
x86_64-solaris-gcc
x86_64-win64-gcc
x86_64-win64-vs10
x86_64-win64-vs11
x86_64-win64-vs12
x86_64-win64-vs14
x86_64-win64-vs15
x86_64-win64-vs16
x86_64-win64-vs17
generic-gnu
The generic-gnu target, in conjunction with the CROSS environment variable,
@@ -113,10 +152,10 @@ COMPILING THE APPLICATIONS/LIBRARIES:
$ CROSS=mipsel-linux-uclibc- ../libvpx/configure
In addition, the executables to be invoked can be overridden by specifying the
environment variables: CC, AR, LD, AS, STRIP, NM. Additional flags can be
passed to these executables with CFLAGS, LDFLAGS, and ASFLAGS.
environment variables: AR, AS, CC, CXX, LD, STRIP. Additional flags can be
passed to these executables with ASFLAGS, CFLAGS, CXXFLAGS, and LDFLAGS.
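As a rough sketch of such an override, the commands below select clang for a single configure run. The compiler names, the -O2 flag values, and the generic-gnu target are illustrative assumptions rather than recommended settings; the check on the configure path only keeps the snippet harmless when it is run outside a build directory.

```shell
# Illustrative values only: clang/clang++ and -O2 are assumptions.
export CC=clang CXX=clang++
export CFLAGS="-O2" CXXFLAGS="-O2"

# Run configure from an out-of-tree build directory next to the source tree.
if [ -x ../libvpx/configure ]; then
  ../libvpx/configure --target=generic-gnu
else
  echo "../libvpx/configure not found; run this from a build directory"
fi
```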
5. Configuration errors
6. Configuration errors
If the configuration step fails, the first step is to look in the error log.
This defaults to config.log. This should give a good indication of what went
wrong. If not, contact us for support.
@@ -144,7 +183,49 @@ CODE STYLE:
See also: http://clang.llvm.org/docs/ClangFormat.html
PROFILE GUIDED OPTIMIZATION (PGO)
Profile Guided Optimization can be enabled for Clang builds using the
commands:
$ export CC=clang
$ export CXX=clang++
$ ../libvpx/configure --enable-profile
$ make
Generate one or multiple PGO profile files by running vpxdec or vpxenc. For
example:
$ ./vpxdec ../vpx/out_ful/vp90-2-sintel_1280x546_tile_1x4_1257kbps.webm \
-o - > /dev/null
To convert and merge the raw profile files, use the llvm-profdata tool:
$ llvm-profdata merge -o perf.profdata default_8382761441159425451_0.profraw
Then, rebuild the project with the new profile file:
$ make clean
$ ../libvpx/configure --use-profile=perf.profdata
$ make
Note: Always use the llvm-profdata from the toolchain that is used for
compiling the PGO-enabled binary.
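When several vpxdec or vpxenc runs have each written a raw profile, llvm-profdata merge accepts all of them in one call. A minimal sketch, assuming the raw profiles follow the default_*.profraw naming shown above and sit in the current directory (the glob and the merged variable are illustrative and not part of libvpx):

```shell
# Collect every raw profile in the current directory (assumed naming).
set -- default_*.profraw

# Merge only if at least one profile exists and llvm-profdata is on the PATH.
if [ -e "$1" ] && command -v llvm-profdata >/dev/null 2>&1; then
  llvm-profdata merge -o perf.profdata "$@"
  merged=yes
else
  echo "no raw profiles or llvm-profdata not found; nothing merged"
  merged=no
fi
```

The resulting perf.profdata is then passed to --use-profile exactly as in the single-file case.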
To observe the improvements from a PGO-enabled build, enable and compare the
list of failed optimizations by using the -Rpass-missed compiler flag. For
example, to list the failed loop vectorizations:
$ ../libvpx/configure --use-profile=perf.profdata \
--extra-cflags=-Rpass-missed=loop-vectorize
For guidance on utilizing PGO files to identify potential optimization
opportunities, see: tools/README.pgo.md
SUPPORT
This library is an open source project supported by its community. Please
email webm-discuss@webmproject.org for help.
BUG REPORTS
Bug reports can be filed in the libvpx issue tracker:
https://issues.webmproject.org/.
For security reports, select 'Security report' from the Template dropdown.
+39
@@ -0,0 +1,39 @@
ShiftMediaProject libvpx
=============
[![Build status](https://ci.appveyor.com/api/projects/status/2gemwiy0qp5lf3sk?svg=true)](https://ci.appveyor.com/project/Sibras/libvpx)
[![Github All Releases](https://img.shields.io/github/downloads/ShiftMediaProject/libvpx/total.svg)](https://github.com/ShiftMediaProject/libvpx/releases)
[![GitHub release](https://img.shields.io/github/release/ShiftMediaProject/libvpx.svg)](https://github.com/ShiftMediaProject/libvpx/releases/latest)
[![GitHub issues](https://img.shields.io/github/issues/ShiftMediaProject/libvpx.svg)](https://github.com/ShiftMediaProject/libvpx/issues)
[![license](https://img.shields.io/github/license/ShiftMediaProject/libvpx.svg)](https://github.com/ShiftMediaProject/libvpx)
[![donate](https://img.shields.io/badge/donate-link-brightgreen.svg)](https://shiftmediaproject.github.io/8-donate/)
## ShiftMediaProject
Shift Media Project aims to provide native Windows development libraries for libvpx and associated dependencies to support simpler creation and debugging of rich media content directly within Visual Studio. [https://shiftmediaproject.github.io/](https://shiftmediaproject.github.io/)
## libvpx
VP8/VP9 Codec SDK. [https://www.webmproject.org/code/](https://www.webmproject.org/code/)
## Downloads
Development libraries are available from the [releases](https://github.com/ShiftMediaProject/libvpx/releases) page. These libraries are available for each supported Visual Studio version with a different download for each version. Each download contains both static and dynamic libraries to choose from in both 32bit and 64bit versions.
## Code
This repository contains code from the corresponding upstream project with additional modifications to allow it to be compiled with Visual Studio. New custom Visual Studio projects are provided within the 'SMP' sub-directory. Refer to the 'readme' contained within the 'SMP' directory for further details.
## Issues
Any issues related to the ShiftMediaProject specific changes should be sent to the [issues](https://github.com/ShiftMediaProject/libvpx/issues) page for the repository. Any issues related to the upstream project should be sent upstream directly (see the issues information of the upstream repository for more details).
## License
ShiftMediaProject original code is released under [LGPLv2.1](https://www.gnu.org/licenses/lgpl-2.1.html). All code from the upstream repository remains under its original license (see the license information of the upstream repository for more details).
## Copyright
As this repository includes code from upstream project(s) it includes many copyright owners. ShiftMediaProject makes NO claim of copyright on any upstream code. However, all original ShiftMediaProject authored code is copyright ShiftMediaProject. For a complete copyright list please check out the source code to examine license headers. Unless expressly stated otherwise all code submitted to the ShiftMediaProject project (in any form) is licensed under [LGPLv2.1](https://www.gnu.org/licenses/lgpl-2.1.html) and copyright is donated to ShiftMediaProject. If you submit code that is not your own work it is your responsibility to place a header stating the copyright.
## Contributing
Patches related to the ShiftMediaProject specific changes should be sent as pull requests to the main repository. Any changes related to the upstream project should be sent upstream directly (see the contributing information of the upstream repository for more details).
+4 -3
@@ -1,3 +1,4 @@
*.sln text=auto
*.vcxproj text=auto
*.vcxproj.filters text=auto
*.sln text eol=crlf
*.vcxproj text eol=crlf
*.vcxproj.filters text eol=crlf
*.bat text eol=crlf
+181
@@ -0,0 +1,181 @@
diff --git a/vp8/common/x86/idctllm_mmx.asm b/vp8/common/x86/idctllm_mmx.asm
index 6cea86fe0..6f06c7afa 100644
--- a/vp8/common/x86/idctllm_mmx.asm
+++ b/vp8/common/x86/idctllm_mmx.asm
@@ -11,6 +11,7 @@
%include "vpx_ports/x86_abi_support.asm"
+section .text
; /****************************************************************************
; * Notes:
; *
diff --git a/vp8/common/x86/idctllm_sse2.asm b/vp8/common/x86/idctllm_sse2.asm
index bb79d2da3..410f3112e 100644
--- a/vp8/common/x86/idctllm_sse2.asm
+++ b/vp8/common/x86/idctllm_sse2.asm
@@ -11,6 +11,7 @@
%include "vpx_ports/x86_abi_support.asm"
+section .text
;void vp8_idct_dequant_0_2x_sse2
; (
; short *qcoeff - 0
diff --git a/vp8/common/x86/loopfilter_block_sse2_x86_64.asm b/vp8/common/x86/loopfilter_block_sse2_x86_64.asm
index 8d12f5385..b8e51a07f 100644
--- a/vp8/common/x86/loopfilter_block_sse2_x86_64.asm
+++ b/vp8/common/x86/loopfilter_block_sse2_x86_64.asm
@@ -11,6 +11,7 @@
%include "vpx_ports/x86_abi_support.asm"
+section .text
%macro LF_ABS 2
; %1 value not preserved
; %2 value preserved
diff --git a/vp8/common/x86/loopfilter_sse2.asm b/vp8/common/x86/loopfilter_sse2.asm
index ce5c31313..defb5b7f1 100644
--- a/vp8/common/x86/loopfilter_sse2.asm
+++ b/vp8/common/x86/loopfilter_sse2.asm
@@ -10,6 +10,8 @@
%include "vpx_ports/x86_abi_support.asm"
+
+section .text
%define _t0 0
%define _t1 _t0 + 16
%define _p3 _t1 + 16
diff --git a/vp8/common/x86/subpixel_sse2.asm b/vp8/common/x86/subpixel_sse2.asm
index 94e14aed6..89acbbecd 100644
--- a/vp8/common/x86/subpixel_sse2.asm
+++ b/vp8/common/x86/subpixel_sse2.asm
@@ -11,6 +11,7 @@
%include "vpx_ports/x86_abi_support.asm"
+section .text
%define BLOCK_HEIGHT_WIDTH 4
%define VP8_FILTER_WEIGHT 128
%define VP8_FILTER_SHIFT 7
diff --git a/vp8/common/x86/subpixel_ssse3.asm b/vp8/common/x86/subpixel_ssse3.asm
index 17247227d..a46b510f6 100644
--- a/vp8/common/x86/subpixel_ssse3.asm
+++ b/vp8/common/x86/subpixel_ssse3.asm
@@ -11,6 +11,7 @@
%include "vpx_ports/x86_abi_support.asm"
+section .text
%define BLOCK_HEIGHT_WIDTH 4
%define VP8_FILTER_WEIGHT 128
%define VP8_FILTER_SHIFT 7
diff --git a/vp8/encoder/x86/copy_sse3.asm b/vp8/encoder/x86/copy_sse3.asm
index c40b2d8bf..34c927b25 100644
--- a/vp8/encoder/x86/copy_sse3.asm
+++ b/vp8/encoder/x86/copy_sse3.asm
@@ -10,6 +10,7 @@
%include "vpx_ports/x86_abi_support.asm"
+section .text
%macro STACK_FRAME_CREATE_X3 0
%if ABI_IS_32BIT
%define src_ptr rsi
diff --git a/vp8/encoder/x86/dct_sse2.asm b/vp8/encoder/x86/dct_sse2.asm
index 3c28cb902..c50db001c 100644
--- a/vp8/encoder/x86/dct_sse2.asm
+++ b/vp8/encoder/x86/dct_sse2.asm
@@ -11,6 +11,7 @@
%include "vpx_ports/x86_abi_support.asm"
+section .text
%macro STACK_FRAME_CREATE 0
%if ABI_IS_32BIT
%define input rsi
diff --git a/vpx_dsp/x86/deblock_sse2.asm b/vpx_dsp/x86/deblock_sse2.asm
index b3af677d2..7b65bb4a7 100644
--- a/vpx_dsp/x86/deblock_sse2.asm
+++ b/vpx_dsp/x86/deblock_sse2.asm
@@ -11,6 +11,7 @@
%include "vpx_ports/x86_abi_support.asm"
+section .text
;macro in deblock functions
%macro FIRST_2_ROWS 0
movdqa xmm4, xmm0
diff --git a/vpx_dsp/x86/ssim_opt_x86_64.asm b/vpx_dsp/x86/ssim_opt_x86_64.asm
index 1ad3b88c8..d019e549d 100644
--- a/vpx_dsp/x86/ssim_opt_x86_64.asm
+++ b/vpx_dsp/x86/ssim_opt_x86_64.asm
@@ -10,6 +10,7 @@
%include "vpx_ports/x86_abi_support.asm"
+section .text
; tabulate_ssim - sums sum_s,sum_r,sum_sq_s,sum_sq_r, sum_sxr
%macro TABULATE_SSIM 0
paddusw xmm15, xmm3 ; sum_s
diff --git a/vpx_dsp/x86/vpx_high_subpixel_8t_sse2.asm b/vpx_dsp/x86/vpx_high_subpixel_8t_sse2.asm
index fc301fb39..318cb2682 100644
--- a/vpx_dsp/x86/vpx_high_subpixel_8t_sse2.asm
+++ b/vpx_dsp/x86/vpx_high_subpixel_8t_sse2.asm
@@ -11,6 +11,7 @@
%include "vpx_ports/x86_abi_support.asm"
+section .text
;Note: tap3 and tap4 have to be applied and added after other taps to avoid
;overflow.
diff --git a/vpx_dsp/x86/vpx_high_subpixel_bilinear_sse2.asm b/vpx_dsp/x86/vpx_high_subpixel_bilinear_sse2.asm
index bd51c75bc..e5f94fc76 100644
--- a/vpx_dsp/x86/vpx_high_subpixel_bilinear_sse2.asm
+++ b/vpx_dsp/x86/vpx_high_subpixel_bilinear_sse2.asm
@@ -10,6 +10,7 @@
%include "vpx_ports/x86_abi_support.asm"
+section .text
%macro HIGH_GET_PARAM_4 0
mov rdx, arg(5) ;filter ptr
mov rsi, arg(0) ;src_ptr
diff --git a/vpx_dsp/x86/vpx_subpixel_8t_sse2.asm b/vpx_dsp/x86/vpx_subpixel_8t_sse2.asm
index c8455e13a..ba052c85f 100644
--- a/vpx_dsp/x86/vpx_subpixel_8t_sse2.asm
+++ b/vpx_dsp/x86/vpx_subpixel_8t_sse2.asm
@@ -11,6 +11,7 @@
%include "vpx_ports/x86_abi_support.asm"
+section .text
;Note: tap3 and tap4 have to be applied and added after other taps to avoid
;overflow.
diff --git a/vpx_dsp/x86/vpx_subpixel_bilinear_sse2.asm b/vpx_dsp/x86/vpx_subpixel_bilinear_sse2.asm
index 65790b1c2..aa3f094bc 100644
--- a/vpx_dsp/x86/vpx_subpixel_bilinear_sse2.asm
+++ b/vpx_dsp/x86/vpx_subpixel_bilinear_sse2.asm
@@ -10,6 +10,7 @@
%include "vpx_ports/x86_abi_support.asm"
+section .text
%macro GET_PARAM_4 0
mov rdx, arg(5) ;filter ptr
mov rsi, arg(0) ;src_ptr
diff --git a/vpx_dsp/x86/vpx_subpixel_bilinear_ssse3.asm b/vpx_dsp/x86/vpx_subpixel_bilinear_ssse3.asm
index 32e3cd3d9..a370eeeae 100644
--- a/vpx_dsp/x86/vpx_subpixel_bilinear_ssse3.asm
+++ b/vpx_dsp/x86/vpx_subpixel_bilinear_ssse3.asm
@@ -10,6 +10,7 @@
%include "vpx_ports/x86_abi_support.asm"
+section .text
%macro GET_PARAM_4 0
mov rdx, arg(5) ;filter ptr
mov rsi, arg(0) ;src_ptr
+49 -29
@@ -18,12 +18,46 @@ environment:
APPVEYOR_BUILD_WORKER_IMAGE: Visual Studio 2015
- MSVC_VER: 15
APPVEYOR_BUILD_WORKER_IMAGE: Visual Studio 2017
- MSVC_VER: 16
APPVEYOR_BUILD_WORKER_IMAGE: Visual Studio 2019
- MSVC_VER: 17
APPVEYOR_BUILD_WORKER_IMAGE: Visual Studio 2022
install:
# Install GitLink
- cmd: choco install gitlink
- cmd: nuget install gitlink -Version 2.4.0
- cmd: for /f "tokens=*" %%f in ('dir /s /b gitlink.exe') do copy /b %%f .\
before_build:
# Backup platform so it is not affected by vcvars
- cmd: SET PLATFORMBACK=%PLATFORM%
# Setup msvc environment for required compiler version (specified by MSVC_VER)
- ps: >-
if ($env:MSVC_VER -eq 17) {
$env:VCVARS="C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvarsall.bat"
} elseif ($env:MSVC_VER -eq 16) {
$env:VCVARS="C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvarsall.bat"
} elseif ($env:MSVC_VER -eq 15) {
$env:VCVARS="C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvarsall.bat"
} else {
$env:VSCOMNTOOLS=(Get-Content ("env:VS" + "$env:MSVC_VER" + "0COMNTOOLS"))
$env:VCVARS="%VSCOMNTOOLS%\..\..\VC\vcvarsall.bat"
}
- cmd: call "%VCVARS%" amd64
# Detect latest available windows sdk version
- ps: >-
if ($env:MSVC_VER -eq 12) {
$env:WindowsSDKVersion=8.1
} else {
$env:WindowsSDKVersion=$env:WindowsSDKVersion.TrimEnd('\')
}
# Reset platform
- cmd: SET PLATFORM=%PLATFORMBACK%
# Create build project to compile all configurations and platforms at once
- ps: >-
$script = @'
@@ -41,18 +75,18 @@ before_build:
</PropertyGroup>
<ItemGroup>
<ProjectToBuild Include="SMP/APPVEYOR_PROJECT_NAME.sln">
<Properties>Configuration=%(ConfigurationList.Identity);Platform=$(CurrentPlatform);OutDir=$(MSBuildThisFileDirectory)build_out\</Properties>
<Properties>Configuration=%(ConfigurationList.Identity);Platform=$(CurrentPlatform);OutDir=$(MSBuildThisFileDirectory)build_out\;WindowsTargetPlatformVersion=SDK_VER</Properties>
</ProjectToBuild>
</ItemGroup>
</Target>
<Target Name="Build" DependsOnTargets="List">
<MSBuild Projects="@(ProjectToBuild)" BuildInParallel="true" />
<MSBuild Projects="@(ProjectToBuild)" BuildInParallel="false" />
</Target>
<Target Name="GitLink" DependsOnTargets="Build" Outputs="%(PlatformList.Identity)">
<PropertyGroup>
<CurrentPlatform>%(PlatformList.Identity)</CurrentPlatform>
</PropertyGroup>
<Exec Command="GitLink . -f SMP/APPVEYOR_PROJECT_NAME.sln -c %(ConfigurationList.Identity) -p $(CurrentPlatform) -d $(MSBuildThisFileDirectory)build_out\lib\$(CurrentPlatform) -u https://github.com/APPVEYOR_REPO_NAME.git -s APPVEYOR_REPO_COMMIT"/>
<Exec Command="GitLink . -f SMP/APPVEYOR_PROJECT_NAME.sln -c %(ConfigurationList.Identity) -p $(CurrentPlatform) -d $(MSBuildThisFileDirectory)build_out\lib\$(CurrentPlatform) -u https://github.com/APPVEYOR_REPO_NAME.git -s APPVEYOR_REPO_COMMIT -errorsaswarnings"/>
</Target>
</Project>
@@ -67,43 +101,29 @@ before_build:
$script = $script -replace "APPVEYOR_MSVC_VER", "$env:MSVC_VER"
$script = $script -replace "SDK_VER", "$env:WindowsSDKVersion"
$script | Out-File build.vcxproj
# Backup platform so it is not affected by vcvars
- cmd: SET PLATFORMBACK=%PLATFORM%
# Setup msvc environment for required compiler version (specified by MSVC_VER)
- ps: >-
if ($env:MSVC_VER -eq 15) {
$env:VCVARS="C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvarsall.bat"
} else {
$env:VSCOMNTOOLS=(Get-Content ("env:VS" + "$env:MSVC_VER" + "0COMNTOOLS"))
$env:VCVARS="%VSCOMNTOOLS%\..\..\VC\vcvarsall.bat"
}
- cmd: call "%VCVARS%" amd64
# Reset platform
- cmd: SET PLATFORM=%PLATFORMBACK%
# Set Targets path so that gitlink works correctly
- ps: $env:MSBUILDDIR=((Get-Command msbuild.exe).Path | Split-Path -parent)
- ps: >-
if ($env:MSVC_VER -eq 15) {
if ($env:MSVC_VER -eq 17) {
$env:VCTargetsPath="$env:MSBUILDDIR\..\..\..\Microsoft\VC\v170\"
} elseif ($env:MSVC_VER -eq 16) {
$env:VCTargetsPath="$env:MSBUILDDIR\..\..\Microsoft\VC\v160\"
} elseif ($env:MSVC_VER -eq 15) {
$env:VCTargetsPath="$env:MSBUILDDIR\..\..\..\Common7\IDE\VC\VCTargets"
} else {
$env:VCTargetsPath="$env:MSBUILDDIR\..\..\..\Microsoft.Cpp\v4.0\V${env:MSVC_VER}0"
}
# Download and install yasm integration
- ps: (New-Object Net.WebClient).DownloadFile('http://www.tortall.net/projects/yasm/releases/vsyasm-1.3.0-win64.zip', "$pwd\yasm.zip")
- ps: (New-Object Net.WebClient).DownloadFile('https://github.com/ShiftMediaProject/VSYASM/releases/download/0.7/VSYASM.zip', "$pwd\yasm.zip")
- ps: Add-Type -A 'System.IO.Compression.FileSystem'; [IO.Compression.ZipFile]::ExtractToDirectory("$pwd\yasm.zip", "$pwd\TempYASMUnpack")
- ps: New-Item -ItemType Directory -Force -Path "$env:VCINSTALLDIR\bin\"
- ps: Move-Item -Force "TempYASMUnpack\*.exe" "$env:VCINSTALLDIR\bin\"
- ps: (Get-Content "$pwd\TempYASMUnpack\vsyasm.props") -replace '\$\(Platform\)', 'win$(PlatformArchitecture)' | Set-Content "$pwd\TempYASMUnpack\vsyasm.props"
- ps: Copy-Item -Force "TempYASMUnpack\*.*" "$env:VCTargetsPath\BuildCustomizations"
- cmd: call ".\TempYASMUnpack\install_script.bat"
# Additional yasm location in order to fix gitlink error
- ps: if ($env:MSVC_VER -ne 15) { Copy-Item -Force "TempYASMUnpack\*.*" "$env:VCTargetsPath\..\BuildCustomizations" }
- ps: if ($env:MSVC_VER -lt 15) { Copy-Item -Force "TempYASMUnpack\*.*" "$env:VCTargetsPath\..\BuildCustomizations" }
build:
project: build.vcxproj
@@ -120,6 +140,6 @@ deploy:
tag: $(APPVEYOR_REPO_TAG_NAME)
description: Pre-built static and shared libraries in 32b and 64b for $(APPVEYOR_PROJECT_NAME) $(APPVEYOR_REPO_TAG_NAME)
auth_token:
secure: aiTcAD/YitqgwuiBdC3ImXiUlHfIIDD7ayjCs3Y3aAO5vEm1gA7flCZpUZ60a5am
secure: c9Sads7Y16h7FP+LrR3IjVygYAgh8GByE8TtazxDg7jpPVxc+XDV81z7MoUc2Ada
artifact: $(APPVEYOR_PROJECT_NAME)_$(APPVEYOR_REPO_TAG_NAME)_msvc$(MSVC_VER)
force_update: true
+31
@@ -0,0 +1,31 @@
/** DCE definitions
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "vpx_config.h"
#include <stddef.h>
#include <stdint.h>
#if !HAVE_AVX512
void vpx_sad64x64x4d_avx512(const uint8_t *src, int src_stride,
const uint8_t *const ref[4], int ref_stride,
uint32_t res[4]) {}
#endif
+46 -7
@@ -1,21 +1,29 @@
Microsoft Visual Studio Solution File, Format Version 12.00
VisualStudioVersion = 12.0.30501.0
MinimumVisualStudioVersion = 10.0.40219.1
MinimumVisualStudioVersion = 12.0.30501.0
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "libvpx", "libvpx.vcxproj", "{8293418A-603A-4119-B7B4-1E6204606BA9}"
EndProject
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "libvpx_winrt", "libvpx_winrt.vcxproj", "{A293418A-603A-4119-B7B4-1E6204606BA9}"
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
Debug|x64 = Debug|x64
Debug|x86 = Debug|x86
DebugDLL|x64 = DebugDLL|x64
DebugDLL|x86 = DebugDLL|x86
DebugDLLWinRT|x64 = DebugDLLWinRT|x64
DebugDLLWinRT|x86 = DebugDLLWinRT|x86
DebugWinRT|x64 = DebugWinRT|x64
DebugWinRT|x86 = DebugWinRT|x86
Release|x64 = Release|x64
Release|x86 = Release|x86
ReleaseDLL|x64 = ReleaseDLL|x64
ReleaseDLL|x86 = ReleaseDLL|x86
ReleaseLTO|x64 = ReleaseLTO|x64
ReleaseLTO|x86 = ReleaseLTO|x86
ReleaseDLLWinRT|x64 = ReleaseDLLWinRT|x64
ReleaseDLLWinRT|x86 = ReleaseDLLWinRT|x86
ReleaseWinRT|x64 = ReleaseWinRT|x64
ReleaseWinRT|x86 = ReleaseWinRT|x86
EndGlobalSection
GlobalSection(ProjectConfigurationPlatforms) = postSolution
{8293418A-603A-4119-B7B4-1E6204606BA9}.Debug|x64.ActiveCfg = Debug|x64
@@ -26,6 +34,10 @@ Global
{8293418A-603A-4119-B7B4-1E6204606BA9}.DebugDLL|x64.Build.0 = DebugDLL|x64
{8293418A-603A-4119-B7B4-1E6204606BA9}.DebugDLL|x86.ActiveCfg = DebugDLL|Win32
{8293418A-603A-4119-B7B4-1E6204606BA9}.DebugDLL|x86.Build.0 = DebugDLL|Win32
{8293418A-603A-4119-B7B4-1E6204606BA9}.DebugDLLWinRT|x64.ActiveCfg = DebugDLL|x64
{8293418A-603A-4119-B7B4-1E6204606BA9}.DebugDLLWinRT|x86.ActiveCfg = DebugDLL|Win32
{8293418A-603A-4119-B7B4-1E6204606BA9}.DebugWinRT|x64.ActiveCfg = Debug|x64
{8293418A-603A-4119-B7B4-1E6204606BA9}.DebugWinRT|x86.ActiveCfg = Debug|Win32
{8293418A-603A-4119-B7B4-1E6204606BA9}.Release|x64.ActiveCfg = Release|x64
{8293418A-603A-4119-B7B4-1E6204606BA9}.Release|x64.Build.0 = Release|x64
{8293418A-603A-4119-B7B4-1E6204606BA9}.Release|x86.ActiveCfg = Release|Win32
@@ -34,12 +46,39 @@ Global
{8293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseDLL|x64.Build.0 = ReleaseDLL|x64
{8293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseDLL|x86.ActiveCfg = ReleaseDLL|Win32
{8293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseDLL|x86.Build.0 = ReleaseDLL|Win32
{8293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseLTO|x64.ActiveCfg = ReleaseLTO|x64
{8293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseLTO|x64.Build.0 = ReleaseLTO|x64
{8293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseLTO|x86.ActiveCfg = ReleaseLTO|Win32
{8293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseLTO|x86.Build.0 = ReleaseLTO|Win32
{8293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseDLLWinRT|x64.ActiveCfg = ReleaseDLL|x64
{8293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseDLLWinRT|x86.ActiveCfg = ReleaseDLL|Win32
{8293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseWinRT|x64.ActiveCfg = Release|x64
{8293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseWinRT|x86.ActiveCfg = Release|Win32
{A293418A-603A-4119-B7B4-1E6204606BA9}.Debug|x64.ActiveCfg = DebugWinRT|x64
{A293418A-603A-4119-B7B4-1E6204606BA9}.Debug|x86.ActiveCfg = DebugWinRT|Win32
{A293418A-603A-4119-B7B4-1E6204606BA9}.DebugDLL|x64.ActiveCfg = DebugDLLWinRT|x64
{A293418A-603A-4119-B7B4-1E6204606BA9}.DebugDLL|x86.ActiveCfg = DebugDLLWinRT|Win32
{A293418A-603A-4119-B7B4-1E6204606BA9}.DebugDLLWinRT|x64.ActiveCfg = DebugDLLWinRT|x64
{A293418A-603A-4119-B7B4-1E6204606BA9}.DebugDLLWinRT|x64.Build.0 = DebugDLLWinRT|x64
{A293418A-603A-4119-B7B4-1E6204606BA9}.DebugDLLWinRT|x86.ActiveCfg = DebugDLLWinRT|Win32
{A293418A-603A-4119-B7B4-1E6204606BA9}.DebugDLLWinRT|x86.Build.0 = DebugDLLWinRT|Win32
{A293418A-603A-4119-B7B4-1E6204606BA9}.DebugWinRT|x64.ActiveCfg = DebugWinRT|x64
{A293418A-603A-4119-B7B4-1E6204606BA9}.DebugWinRT|x64.Build.0 = DebugWinRT|x64
{A293418A-603A-4119-B7B4-1E6204606BA9}.DebugWinRT|x86.ActiveCfg = DebugWinRT|Win32
{A293418A-603A-4119-B7B4-1E6204606BA9}.DebugWinRT|x86.Build.0 = DebugWinRT|Win32
{A293418A-603A-4119-B7B4-1E6204606BA9}.Release|x64.ActiveCfg = ReleaseWinRT|x64
{A293418A-603A-4119-B7B4-1E6204606BA9}.Release|x86.ActiveCfg = ReleaseWinRT|Win32
{A293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseDLL|x64.ActiveCfg = ReleaseDLLWinRT|x64
{A293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseDLL|x86.ActiveCfg = ReleaseDLLWinRT|Win32
{A293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseDLLWinRT|x64.ActiveCfg = ReleaseDLLWinRT|x64
{A293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseDLLWinRT|x64.Build.0 = ReleaseDLLWinRT|x64
{A293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseDLLWinRT|x86.ActiveCfg = ReleaseDLLWinRT|Win32
{A293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseDLLWinRT|x86.Build.0 = ReleaseDLLWinRT|Win32
{A293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseWinRT|x64.ActiveCfg = ReleaseWinRT|x64
{A293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseWinRT|x64.Build.0 = ReleaseWinRT|x64
{A293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseWinRT|x86.ActiveCfg = ReleaseWinRT|Win32
{A293418A-603A-4119-B7B4-1E6204606BA9}.ReleaseWinRT|x86.Build.0 = ReleaseWinRT|Win32
EndGlobalSection
GlobalSection(SolutionProperties) = preSolution
HideSolutionNode = FALSE
EndGlobalSection
GlobalSection(ExtensibilityGlobals) = postSolution
SolutionGuid = {88D532DE-4550-4C4D-BC24-1C29FC508170}
EndGlobalSection
EndGlobal
+79 -1077
File diff suppressed because it is too large.
+304 -109
@@ -31,9 +31,6 @@
<Filter Include="Header Files\libvpx\vp8\common">
<UniqueIdentifier>{34b86bcc-a49b-432d-a981-211912f5bf60}</UniqueIdentifier>
</Filter>
<Filter Include="Header Files\libvpx\vp8\common\x86">
<UniqueIdentifier>{5f9e2c09-e625-4f7c-9249-34fe181866f5}</UniqueIdentifier>
</Filter>
<Filter Include="Header Files\libvpx\vp8\encoder">
<UniqueIdentifier>{6d9fae0a-60a1-4c20-a459-8846d037a758}</UniqueIdentifier>
</Filter>
@@ -55,12 +52,6 @@
<Filter Include="Source Files\libvpx">
<UniqueIdentifier>{57d748f2-bcaf-4a8d-ad3e-391cc4b67276}</UniqueIdentifier>
</Filter>
<Filter Include="Source Files\libvpx\third_party">
<UniqueIdentifier>{56c36112-b467-4193-8d02-e872d8dc331b}</UniqueIdentifier>
</Filter>
<Filter Include="Source Files\libvpx\third_party\x86inc">
<UniqueIdentifier>{e7e44365-cfc9-462a-afdb-05436b19c6c4}</UniqueIdentifier>
</Filter>
<Filter Include="Source Files\libvpx\vp9">
<UniqueIdentifier>{1bb9d49d-039e-466a-8538-552ae707fb28}</UniqueIdentifier>
</Filter>
@@ -139,14 +130,14 @@
<Filter Include="Header Files\libvpx\vpx_util">
<UniqueIdentifier>{0f1488e1-b863-436b-a38e-233c1a2e23d6}</UniqueIdentifier>
</Filter>
<Filter Include="Header Files\libvpx\vpx\internal">
<UniqueIdentifier>{b0c7e733-ec6f-46fd-a3f9-aea2e4952d9f}</UniqueIdentifier>
</Filter>
</ItemGroup>
<ItemGroup>
<ClInclude Include="vpx_config.h">
<Filter>Header Files</Filter>
</ClInclude>
<ClInclude Include="..\vpx\internal\vpx_codec_internal.h">
<Filter>Header Files\libvpx\vpx</Filter>
</ClInclude>
<ClInclude Include="..\vpx_mem\vpx_mem.h">
<Filter>Header Files\libvpx\vpx_mem</Filter>
</ClInclude>
@@ -159,9 +150,6 @@
<ClInclude Include="..\vpx_scale\yv12config.h">
<Filter>Header Files\libvpx\vpx_scale</Filter>
</ClInclude>
<ClInclude Include="..\vpx_ports\config.h">
<Filter>Header Files\libvpx\vpx_ports</Filter>
</ClInclude>
<ClInclude Include="..\vpx_ports\emmintrin_compat.h">
<Filter>Header Files\libvpx\vpx_ports</Filter>
</ClInclude>
@@ -237,9 +225,6 @@
<ClInclude Include="..\vp8\common\treecoder.h">
<Filter>Header Files\libvpx\vp8\common</Filter>
</ClInclude>
<ClInclude Include="..\vp8\common\x86\filter_x86.h">
<Filter>Header Files\libvpx\vp8\common\x86</Filter>
</ClInclude>
<ClInclude Include="..\vp8\encoder\bitstream.h">
<Filter>Header Files\libvpx\vp8\encoder</Filter>
</ClInclude>
@@ -495,9 +480,6 @@
<ClInclude Include="..\vp9\decoder\vp9_decoder.h">
<Filter>Header Files\libvpx\vp9\decoder</Filter>
</ClInclude>
<ClInclude Include="..\vp9\decoder\vp9_dthread.h">
<Filter>Header Files\libvpx\vp9\decoder</Filter>
</ClInclude>
<ClInclude Include="..\vp9\encoder\vp9_aq_variance.h">
<Filter>Header Files\libvpx\vp9\encoder</Filter>
</ClInclude>
@@ -645,9 +627,132 @@
<ClInclude Include="..\vpx_dsp\postproc.h">
<Filter>Header Files\libvpx\vpx_dsp</Filter>
</ClInclude>
<ClInclude Include="..\vpx_dsp\x86\fdct.h">
<ClInclude Include="..\vp8\encoder\ethreading.h">
<Filter>Header Files\libvpx\vp8\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vp8\encoder\picklpf.h">
<Filter>Header Files\libvpx\vp8\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vp8\common\vp8_skin_detection.h">
<Filter>Header Files\libvpx\vp8\common</Filter>
</ClInclude>
<ClInclude Include="..\vp8\encoder\temporal_filter.h">
<Filter>Header Files\libvpx\vp8\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vp9\encoder\vp9_job_queue.h">
<Filter>Header Files\libvpx\vp9\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vp9\encoder\vp9_multi_thread.h">
<Filter>Header Files\libvpx\vp9\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vpx_dsp\x86\bitdepth_conversion_avx2.h">
<Filter>Header Files\libvpx\vpx_dsp\x86</Filter>
</ClInclude>
<ClInclude Include="..\vpx_dsp\x86\bitdepth_conversion_sse2.h">
<Filter>Header Files\libvpx\vpx_dsp\x86</Filter>
</ClInclude>
<ClInclude Include="..\vpx_dsp\x86\convolve_avx2.h">
<Filter>Header Files\libvpx\vpx_dsp\x86</Filter>
</ClInclude>
<ClInclude Include="..\vpx_dsp\x86\convolve_ssse3.h">
<Filter>Header Files\libvpx\vpx_dsp\x86</Filter>
</ClInclude>
<ClInclude Include="..\vpx_dsp\x86\inv_txfm_ssse3.h">
<Filter>Header Files\libvpx\vpx_dsp\x86</Filter>
</ClInclude>
<ClInclude Include="..\vpx_dsp\x86\highbd_inv_txfm_sse2.h">
<Filter>Header Files\libvpx\vpx_dsp\x86</Filter>
</ClInclude>
<ClInclude Include="..\vpx_dsp\x86\highbd_inv_txfm_sse4.h">
<Filter>Header Files\libvpx\vpx_dsp\x86</Filter>
</ClInclude>
<ClInclude Include="..\vpx_dsp\x86\mem_sse2.h">
<Filter>Header Files\libvpx\vpx_dsp\x86</Filter>
</ClInclude>
<ClInclude Include="..\vpx_dsp\x86\transpose_sse2.h">
<Filter>Header Files\libvpx\vpx_dsp\x86</Filter>
</ClInclude>
<ClInclude Include="..\vpx_dsp\skin_detection.h">
<Filter>Header Files\libvpx\vpx_dsp</Filter>
</ClInclude>
<ClInclude Include="..\vpx_util\vpx_atomics.h">
<Filter>Header Files\libvpx\vpx_util</Filter>
</ClInclude>
<ClInclude Include="..\vpx_util\vpx_write_yuv_frame.h">
<Filter>Header Files\libvpx\vpx_util</Filter>
</ClInclude>
<ClInclude Include="..\vp9\encoder\vp9_blockiness.h">
<Filter>Header Files\libvpx\vp9\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vp9\encoder\vp9_partition_models.h">
<Filter>Header Files\libvpx\vp9\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vpx_dsp\x86\convolve_sse2.h">
<Filter>Header Files\libvpx\vpx_dsp\x86</Filter>
</ClInclude>
<ClInclude Include="..\vpx_dsp\x86\quantize_sse2.h">
<Filter>Header Files\libvpx\vpx_dsp\x86</Filter>
</ClInclude>
<ClInclude Include="..\vpx_dsp\x86\quantize_ssse3.h">
<Filter>Header Files\libvpx\vpx_dsp\x86</Filter>
</ClInclude>
<ClInclude Include="..\vp9\decoder\vp9_job_queue.h">
<Filter>Header Files\libvpx\vp9\decoder</Filter>
</ClInclude>
<ClInclude Include="..\vpx_util\vpx_timestamp.h">
<Filter>Header Files\libvpx\vpx_util</Filter>
</ClInclude>
<ClInclude Include="..\vpx_ports\compiler_attributes.h">
<Filter>Header Files\libvpx\vpx_ports</Filter>
</ClInclude>
<ClInclude Include="..\vpx_ports\static_assert.h">
<Filter>Header Files\libvpx\vpx_ports</Filter>
</ClInclude>
<ClInclude Include="..\vpx\internal\vpx_codec_internal.h">
<Filter>Header Files\libvpx\vpx\internal</Filter>
</ClInclude>
<ClInclude Include="..\vpx\vp8.h">
<Filter>Header Files\libvpx\vpx</Filter>
</ClInclude>
<ClInclude Include="..\vpx\vp8cx.h">
<Filter>Header Files\libvpx\vpx</Filter>
</ClInclude>
<ClInclude Include="..\vpx\vp8dx.h">
<Filter>Header Files\libvpx\vpx</Filter>
</ClInclude>
<ClInclude Include="..\vpx\vpx_codec.h">
<Filter>Header Files\libvpx\vpx</Filter>
</ClInclude>
<ClInclude Include="..\vpx\vpx_decoder.h">
<Filter>Header Files\libvpx\vpx</Filter>
</ClInclude>
<ClInclude Include="..\vpx\vpx_encoder.h">
<Filter>Header Files\libvpx\vpx</Filter>
</ClInclude>
<ClInclude Include="..\vpx\vpx_frame_buffer.h">
<Filter>Header Files\libvpx\vpx</Filter>
</ClInclude>
<ClInclude Include="..\vpx\vpx_image.h">
<Filter>Header Files\libvpx\vpx</Filter>
</ClInclude>
<ClInclude Include="..\vpx\vpx_integer.h">
<Filter>Header Files\libvpx\vpx</Filter>
</ClInclude>
<ClInclude Include="..\vpx\vpx_ext_ratectrl.h">
<Filter>Header Files\libvpx\vpx</Filter>
</ClInclude>
<ClInclude Include="..\vp9\encoder\vp9_ext_ratectrl.h">
<Filter>Header Files\libvpx\vp9\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vp9\encoder\vp9_firstpass_stats.h">
<Filter>Header Files\libvpx\vp9\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vp9\encoder\vp9_tpl_model.h">
<Filter>Header Files\libvpx\vp9\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vpx\vpx_tpl.h">
<Filter>Source Files\libvpx\vpx</Filter>
</ClInclude>
</ItemGroup>
<ItemGroup>
<ClCompile Include="..\vpx\src\vpx_encoder.c">
@@ -764,9 +869,6 @@
<ClCompile Include="..\vp8\common\x86\idct_blk_sse2.c">
<Filter>Source Files\libvpx\vp8\common\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp8\common\x86\filter_x86.c">
<Filter>Source Files\libvpx\vp8\common\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp8\common\x86\loopfilter_x86.c">
<Filter>Source Files\libvpx\vp8\common\x86</Filter>
</ClCompile>
@@ -845,9 +947,6 @@
<ClCompile Include="..\vp8\encoder\segmentation.c">
<Filter>Source Files\libvpx\vp8\encoder</Filter>
</ClCompile>
<ClCompile Include="..\vp8\encoder\x86\vp8_enc_stubs_mmx.c">
<Filter>Source Files\libvpx\vp8\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp8\encoder\x86\vp8_enc_stubs_sse2.c">
<Filter>Source Files\libvpx\vp8\encoder\x86</Filter>
</ClCompile>
@@ -1004,9 +1103,6 @@
<ClCompile Include="..\vp8\decoder\decodeframe.c">
<Filter>Source Files\libvpx\vp8\decoder</Filter>
</ClCompile>
<ClCompile Include="..\vp8\encoder\x86\quantize_ssse3.c">
<Filter>Source Files\libvpx\vp8\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp8\encoder\x86\quantize_sse4.c">
<Filter>Source Files\libvpx\vp8\encoder\x86</Filter>
</ClCompile>
@@ -1025,9 +1121,6 @@
<ClCompile Include="..\vp9\decoder\vp9_decoder.c">
<Filter>Source Files\libvpx\vp9\decoder</Filter>
</ClCompile>
<ClCompile Include="..\vp9\decoder\vp9_dthread.c">
<Filter>Source Files\libvpx\vp9\decoder</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\vp9_aq_complexity.c">
<Filter>Source Files\libvpx\vp9\encoder</Filter>
</ClCompile>
@@ -1073,21 +1166,9 @@
<ClCompile Include="..\vp9\encoder\x86\vp9_quantize_sse2.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\x86\vp9_dct_ssse3.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\x86\vp9_error_intrin_avx2.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\x86\vp9_highbd_block_error_intrin_sse2.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx\src\svc_encodeframe.c">
<Filter>Source Files\libvpx\vpx</Filter>
</ClCompile>
<ClCompile Include="..\vp8\common\copy_c.c">
<Filter>Source Files\libvpx\vp8\common</Filter>
</ClCompile>
<ClCompile Include="..\vp8\common\vp8_loopfilter.c">
<Filter>Source Files\libvpx\vp8\common</Filter>
</ClCompile>
@@ -1181,15 +1262,9 @@
<ClCompile Include="..\vpx_dsp\x86\variance_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\variance_impl_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\variance_sse2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\vpx_asm_stubs.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\vpx_subpixel_8t_intrin_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
@@ -1208,9 +1283,6 @@
<ClCompile Include="..\vp9\encoder\x86\vp9_dct_intrin_sse2.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\x86\vp9_diamond_search_sad_avx.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\add_noise.c">
<Filter>Source Files\libvpx\vpx_dsp</Filter>
</ClCompile>
@@ -1235,6 +1307,162 @@
<ClCompile Include="..\vpx_dsp\x86\sum_squares_sse2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp8\common\vp8_skin_detection.c">
<Filter>Source Files\libvpx\vp8\common</Filter>
</ClCompile>
<ClCompile Include="..\vp8\encoder\x86\vp8_quantize_ssse3.c">
<Filter>Source Files\libvpx\vp8\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\vp9_frame_scale.c">
<Filter>Source Files\libvpx\vp9\encoder</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\vp9_multi_thread.c">
<Filter>Source Files\libvpx\vp9\encoder</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\x86\vp9_error_avx2.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\x86\vp9_frame_scale_ssse3.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\highbd_intrapred_intrin_sse2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\highbd_intrapred_intrin_ssse3.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\highbd_convolve_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\inv_txfm_ssse3.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\avg_intrin_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\avg_pred_sse2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\highbd_idct4x4_add_sse2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\highbd_idct4x4_add_sse4.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\highbd_idct8x8_add_sse2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\highbd_idct8x8_add_sse4.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\highbd_idct16x16_add_sse2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\highbd_idct16x16_add_sse4.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\highbd_idct32x32_add_sse2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\highbd_idct32x32_add_sse4.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\quantize_avx.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\quantize_ssse3.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\sad4d_avx512.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\skin_detection.c">
<Filter>Source Files\libvpx\vpx_dsp</Filter>
</ClCompile>
<ClCompile Include="..\vpx_util\vpx_write_yuv_frame.c">
<Filter>Source Files\libvpx\vpx_util</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\x86\temporal_filter_sse4.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="dce_defs.c">
<Filter>Source Files</Filter>
</ClCompile>
<ClCompile Include="..\vp8\common\x86\bilinear_filter_sse2.c">
<Filter>Source Files\libvpx\vp8\common\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp9\common\x86\vp9_highbd_iht4x4_add_sse4.c">
<Filter>Source Files\libvpx\vp9\common\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp9\common\x86\vp9_highbd_iht8x8_add_sse4.c">
<Filter>Source Files\libvpx\vp9\common\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp9\common\x86\vp9_highbd_iht16x16_add_sse4.c">
<Filter>Source Files\libvpx\vp9\common\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\x86\vp9_quantize_avx2.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\post_proc_sse2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\vpx_subpixel_4t_intrin_sse2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp8\encoder\copy_c.c">
<Filter>Source Files\libvpx\vp8\encoder</Filter>
</ClCompile>
<ClCompile Include="..\vp9\decoder\vp9_job_queue.c">
<Filter>Source Files\libvpx\vp9\decoder</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\x86\highbd_temporal_filter_sse4.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp9\vp9_iface_common.c">
<Filter>Source Files\libvpx\vp9</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\vp9_ext_ratectrl.c">
<Filter>Source Files\libvpx\vp9\encoder</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\x86\vp9_quantize_ssse3.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\subtract_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\highbd_quantize_intrin_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\quantize_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\highbd_sad4d_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\highbd_sad_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\avg_pred_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\inv_txfm_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx\src\vpx_tpl.c">
<Filter>Source Files\libvpx\vpx</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\vp9_tpl_model.c">
<Filter>Source Files\libvpx\vp9\encoder</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\sse.c">
<Filter>Source Files\libvpx\vpx_dsp</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\sse_sse4.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\sse_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
</ItemGroup>
<ItemGroup>
<None Include="libvpx.def">
@@ -1242,12 +1470,6 @@
</None>
</ItemGroup>
<ItemGroup>
<YASM Include="..\vpx_ports\x86_abi_support.asm">
<Filter>Source Files\libvpx\vpx_ports</Filter>
</YASM>
<YASM Include="..\vpx_ports\emms.asm">
<Filter>Source Files\libvpx\vpx_ports</Filter>
</YASM>
<YASM Include="..\vp8\common\x86\recon_mmx.asm">
<Filter>Source Files\libvpx\vp8\common\x86</Filter>
</YASM>
@@ -1257,9 +1479,6 @@
<YASM Include="..\vp8\common\x86\subpixel_mmx.asm">
<Filter>Source Files\libvpx\vp8\common\x86</Filter>
</YASM>
<YASM Include="..\vp8\common\x86\subpixel_sse2.asm">
<Filter>Source Files\libvpx\vp8\common\x86</Filter>
</YASM>
<YASM Include="..\vp8\common\x86\subpixel_ssse3.asm">
<Filter>Source Files\libvpx\vp8\common\x86</Filter>
</YASM>
@@ -1284,48 +1503,21 @@
<YASM Include="..\vp8\encoder\x86\fwalsh_sse2.asm">
<Filter>Source Files\libvpx\vp8\encoder\x86</Filter>
</YASM>
<YASM Include="..\vp8\encoder\x86\quantize_mmx.asm">
<Filter>Source Files\libvpx\vp8\encoder\x86</Filter>
</YASM>
<YASM Include="..\vp8\encoder\x86\temporal_filter_apply_sse2.asm">
<Filter>Source Files\libvpx\vp8\encoder\x86</Filter>
</YASM>
<YASM Include="..\vp8\encoder\x86\dct_sse2.asm">
<Filter>Source Files\libvpx\vp8\encoder\x86</Filter>
</YASM>
<YASM Include="..\vp8\encoder\x86\encodeopt.asm">
<Filter>Source Files\libvpx\vp8\encoder\x86</Filter>
</YASM>
<YASM Include="..\vp9\encoder\x86\vp9_error_sse2.asm">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</YASM>
<YASM Include="..\vp9\encoder\x86\vp9_temporal_filter_apply_sse2.asm">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</YASM>
<YASM Include="..\third_party\x86inc\x86inc.asm">
<Filter>Source Files\libvpx\third_party\x86inc</Filter>
</YASM>
<YASM Include="..\vp8\common\x86\loopfilter_block_sse2_x86_64.asm">
<Filter>Source Files\libvpx\vp8\common\x86</Filter>
</YASM>
<YASM Include="..\vp9\common\x86\vp9_mfqe_sse2.asm">
<Filter>Source Files\libvpx\vp9\common\x86</Filter>
</YASM>
<YASM Include="..\vp9\encoder\x86\vp9_quantize_ssse3_x86_64.asm">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</YASM>
<YASM Include="..\vp8\common\x86\copy_sse2.asm">
<Filter>Source Files\libvpx\vp8\common\x86</Filter>
</YASM>
<YASM Include="..\vp8\common\x86\copy_sse3.asm">
<Filter>Source Files\libvpx\vp8\common\x86</Filter>
</YASM>
<YASM Include="..\vp9\encoder\x86\vp9_highbd_error_sse2.asm">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</YASM>
<YASM Include="..\vp9\encoder\x86\vp9_highbd_error_avx.asm">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</YASM>
<YASM Include="..\vpx_dsp\x86\vpx_subpixel_bilinear_sse2.asm">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</YASM>
@@ -1356,30 +1548,12 @@
<YASM Include="..\vpx_dsp\x86\intrapred_ssse3.asm">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</YASM>
<YASM Include="..\vpx_dsp\x86\inv_txfm_ssse3_x86_64.asm">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</YASM>
<YASM Include="..\vpx_dsp\x86\inv_wht_sse2.asm">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</YASM>
<YASM Include="..\vpx_dsp\x86\quantize_avx_x86_64.asm">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</YASM>
<YASM Include="..\vpx_dsp\x86\quantize_ssse3_x86_64.asm">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</YASM>
<YASM Include="..\vpx_dsp\x86\sad_sse2.asm">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</YASM>
<YASM Include="..\vpx_dsp\x86\sad_sse3.asm">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</YASM>
<YASM Include="..\vpx_dsp\x86\sad_sse4.asm">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</YASM>
<YASM Include="..\vpx_dsp\x86\sad_ssse3.asm">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</YASM>
<YASM Include="..\vpx_dsp\x86\sad4d_sse2.asm">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</YASM>
@@ -1419,5 +1593,26 @@
<YASM Include="..\vpx_dsp\x86\deblock_sse2.asm">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</YASM>
<YASM Include="..\vp8\common\x86\subpixel_sse2.asm">
<Filter>Source Files\libvpx\vp8\common\x86</Filter>
</YASM>
<YASM Include="..\vpx_dsp\x86\bitdepth_conversion_sse2.asm">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</YASM>
<YASM Include="..\vp8\encoder\x86\block_error_sse2.asm">
<Filter>Source Files\libvpx\vp8\encoder\x86</Filter>
</YASM>
<YASM Include="..\vpx_ports\emms_mmx.asm">
<Filter>Source Files\libvpx\vpx_ports</Filter>
</YASM>
<YASM Include="..\vpx_ports\float_control_word.asm">
<Filter>Source Files\libvpx\vpx_ports</Filter>
</YASM>
<YASM Include="..\vp8\encoder\x86\copy_sse2.asm">
<Filter>Source Files\libvpx\vp8\encoder\x86</Filter>
</YASM>
<YASM Include="..\vp8\encoder\x86\copy_sse3.asm">
<Filter>Source Files\libvpx\vp8\encoder\x86</Filter>
</YASM>
</ItemGroup>
</Project>
+515
@@ -0,0 +1,515 @@
<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<ItemGroup>
<ClInclude Include="..\vp8\common\alloccommon.h" />
<ClInclude Include="..\vp8\common\blockd.h" />
<ClInclude Include="..\vp8\common\entropy.h" />
<ClInclude Include="..\vp8\common\entropymode.h" />
<ClInclude Include="..\vp8\common\entropymv.h" />
<ClInclude Include="..\vp8\common\extend.h" />
<ClInclude Include="..\vp8\common\filter.h" />
<ClInclude Include="..\vp8\common\findnearmv.h" />
<ClInclude Include="..\vp8\common\header.h" />
<ClInclude Include="..\vp8\common\loopfilter.h" />
<ClInclude Include="..\vp8\common\modecont.h" />
<ClInclude Include="..\vp8\common\postproc.h" />
<ClInclude Include="..\vp8\common\quant_common.h" />
<ClInclude Include="..\vp8\common\reconinter.h" />
<ClInclude Include="..\vp8\common\reconintra.h" />
<ClInclude Include="..\vp8\common\reconintra4x4.h" />
<ClInclude Include="..\vp8\common\setupintrarecon.h" />
<ClInclude Include="..\vp8\common\swapyv12buffer.h" />
<ClInclude Include="..\vp8\common\treecoder.h" />
<ClInclude Include="..\vp8\common\vp8_skin_detection.h" />
<ClInclude Include="..\vp8\decoder\dboolhuff.h" />
<ClInclude Include="..\vp8\decoder\decodemv.h" />
<ClInclude Include="..\vp8\decoder\decoderthreading.h" />
<ClInclude Include="..\vp8\decoder\detokenize.h" />
<ClInclude Include="..\vp8\decoder\onyxd_int.h" />
<ClInclude Include="..\vp8\decoder\treereader.h" />
<ClInclude Include="..\vp8\encoder\bitstream.h" />
<ClInclude Include="..\vp8\encoder\block.h" />
<ClInclude Include="..\vp8\encoder\boolhuff.h" />
<ClInclude Include="..\vp8\encoder\dct_value_cost.h" />
<ClInclude Include="..\vp8\encoder\dct_value_tokens.h" />
<ClInclude Include="..\vp8\encoder\defaultcoefcounts.h" />
<ClInclude Include="..\vp8\encoder\denoising.h" />
<ClInclude Include="..\vp8\encoder\encodeframe.h" />
<ClInclude Include="..\vp8\encoder\encodeintra.h" />
<ClInclude Include="..\vp8\encoder\encodemb.h" />
<ClInclude Include="..\vp8\encoder\encodemv.h" />
<ClInclude Include="..\vp8\encoder\ethreading.h" />
<ClInclude Include="..\vp8\encoder\firstpass.h" />
<ClInclude Include="..\vp8\encoder\lookahead.h" />
<ClInclude Include="..\vp8\encoder\mcomp.h" />
<ClInclude Include="..\vp8\encoder\modecosts.h" />
<ClInclude Include="..\vp8\encoder\onyx_int.h" />
<ClInclude Include="..\vp8\encoder\pickinter.h" />
<ClInclude Include="..\vp8\encoder\picklpf.h" />
<ClInclude Include="..\vp8\encoder\quantize.h" />
<ClInclude Include="..\vp8\encoder\ratectrl.h" />
<ClInclude Include="..\vp8\encoder\rdopt.h" />
<ClInclude Include="..\vp8\encoder\segmentation.h" />
<ClInclude Include="..\vp8\encoder\temporal_filter.h" />
<ClInclude Include="..\vp8\encoder\tokenize.h" />
<ClInclude Include="..\vp8\encoder\treewriter.h" />
<ClInclude Include="..\vp9\common\vp9_alloccommon.h" />
<ClInclude Include="..\vp9\common\vp9_blockd.h" />
<ClInclude Include="..\vp9\common\vp9_common.h" />
<ClInclude Include="..\vp9\common\vp9_common_data.h" />
<ClInclude Include="..\vp9\common\vp9_entropy.h" />
<ClInclude Include="..\vp9\common\vp9_entropymode.h" />
<ClInclude Include="..\vp9\common\vp9_entropymv.h" />
<ClInclude Include="..\vp9\common\vp9_enums.h" />
<ClInclude Include="..\vp9\common\vp9_filter.h" />
<ClInclude Include="..\vp9\common\vp9_frame_buffers.h" />
<ClInclude Include="..\vp9\common\vp9_idct.h" />
<ClInclude Include="..\vp9\common\vp9_loopfilter.h" />
<ClInclude Include="..\vp9\common\vp9_mfqe.h" />
<ClInclude Include="..\vp9\common\vp9_mv.h" />
<ClInclude Include="..\vp9\common\vp9_mvref_common.h" />
<ClInclude Include="..\vp9\common\vp9_ppflags.h" />
<ClInclude Include="..\vp9\common\vp9_pred_common.h" />
<ClInclude Include="..\vp9\common\vp9_quant_common.h" />
<ClInclude Include="..\vp9\common\vp9_reconinter.h" />
<ClInclude Include="..\vp9\common\vp9_reconintra.h" />
<ClInclude Include="..\vp9\common\vp9_scale.h" />
<ClInclude Include="..\vp9\common\vp9_scan.h" />
<ClInclude Include="..\vp9\common\vp9_seg_common.h" />
<ClInclude Include="..\vp9\common\vp9_thread_common.h" />
<ClInclude Include="..\vp9\common\vp9_tile_common.h" />
<ClInclude Include="..\vp9\decoder\vp9_decodeframe.h" />
<ClInclude Include="..\vp9\decoder\vp9_decodemv.h" />
<ClInclude Include="..\vp9\decoder\vp9_decoder.h" />
<ClInclude Include="..\vp9\decoder\vp9_detokenize.h" />
<ClInclude Include="..\vp9\decoder\vp9_dsubexp.h" />
<ClInclude Include="..\vp9\decoder\vp9_job_queue.h" />
<ClInclude Include="..\vp9\encoder\vp9_alt_ref_aq.h" />
<ClInclude Include="..\vp9\encoder\vp9_aq_360.h" />
<ClInclude Include="..\vp9\encoder\vp9_aq_complexity.h" />
<ClInclude Include="..\vp9\encoder\vp9_aq_cyclicrefresh.h" />
<ClInclude Include="..\vp9\encoder\vp9_aq_variance.h" />
<ClInclude Include="..\vp9\encoder\vp9_bitstream.h" />
<ClInclude Include="..\vp9\encoder\vp9_block.h" />
<ClInclude Include="..\vp9\encoder\vp9_blockiness.h" />
<ClInclude Include="..\vp9\encoder\vp9_context_tree.h" />
<ClInclude Include="..\vp9\encoder\vp9_cost.h" />
<ClInclude Include="..\vp9\encoder\vp9_denoiser.h" />
<ClInclude Include="..\vp9\encoder\vp9_encodeframe.h" />
<ClInclude Include="..\vp9\encoder\vp9_encodemb.h" />
<ClInclude Include="..\vp9\encoder\vp9_encodemv.h" />
<ClInclude Include="..\vp9\encoder\vp9_encoder.h" />
<ClInclude Include="..\vp9\encoder\vp9_ethread.h" />
<ClInclude Include="..\vp9\encoder\vp9_extend.h" />
<ClInclude Include="..\vp9\encoder\vp9_ext_ratectrl.h" />
<ClInclude Include="..\vp9\encoder\vp9_firstpass.h" />
<ClInclude Include="..\vp9\encoder\vp9_firstpass_stats.h" />
<ClInclude Include="..\vp9\encoder\vp9_job_queue.h" />
<ClInclude Include="..\vp9\encoder\vp9_lookahead.h" />
<ClInclude Include="..\vp9\encoder\vp9_mbgraph.h" />
<ClInclude Include="..\vp9\encoder\vp9_mcomp.h" />
<ClInclude Include="..\vp9\encoder\vp9_multi_thread.h" />
<ClInclude Include="..\vp9\encoder\vp9_noise_estimate.h" />
<ClInclude Include="..\vp9\encoder\vp9_partition_models.h" />
<ClInclude Include="..\vp9\encoder\vp9_picklpf.h" />
<ClInclude Include="..\vp9\encoder\vp9_pickmode.h" />
<ClInclude Include="..\vp9\encoder\vp9_quantize.h" />
<ClInclude Include="..\vp9\encoder\vp9_ratectrl.h" />
<ClInclude Include="..\vp9\encoder\vp9_rd.h" />
<ClInclude Include="..\vp9\encoder\vp9_rdopt.h" />
<ClInclude Include="..\vp9\encoder\vp9_resize.h" />
<ClInclude Include="..\vp9\encoder\vp9_segmentation.h" />
<ClInclude Include="..\vp9\encoder\vp9_skin_detection.h" />
<ClInclude Include="..\vp9\encoder\vp9_speed_features.h" />
<ClInclude Include="..\vp9\encoder\vp9_subexp.h" />
<ClInclude Include="..\vp9\encoder\vp9_svc_layercontext.h" />
<ClInclude Include="..\vp9\encoder\vp9_temporal_filter.h" />
<ClInclude Include="..\vp9\encoder\vp9_tokenize.h" />
<ClInclude Include="..\vp9\encoder\vp9_tpl_model.h" />
<ClInclude Include="..\vp9\encoder\vp9_treewriter.h" />
<ClInclude Include="..\vp9\vp9_dx_iface.h" />
<ClInclude Include="..\vp9\vp9_iface_common.h" />
<ClInclude Include="..\vpx\vp8.h" />
<ClInclude Include="..\vpx\vp8cx.h" />
<ClInclude Include="..\vpx\vp8dx.h" />
<ClInclude Include="..\vpx\vpx_codec.h" />
<ClInclude Include="..\vpx\vpx_decoder.h" />
<ClInclude Include="..\vpx\vpx_encoder.h" />
<ClInclude Include="..\vpx\vpx_ext_ratectrl.h" />
<ClInclude Include="..\vpx\vpx_frame_buffer.h" />
<ClInclude Include="..\vpx\vpx_image.h" />
<ClInclude Include="..\vpx\vpx_integer.h" />
<ClInclude Include="..\vpx\vpx_tpl.h" />
<ClInclude Include="..\vpx\internal\vpx_codec_internal.h" />
<ClInclude Include="..\vpx_dsp\bitreader.h" />
<ClInclude Include="..\vpx_dsp\bitreader_buffer.h" />
<ClInclude Include="..\vpx_dsp\bitwriter.h" />
<ClInclude Include="..\vpx_dsp\bitwriter_buffer.h" />
<ClInclude Include="..\vpx_dsp\fwd_txfm.h" />
<ClInclude Include="..\vpx_dsp\inv_txfm.h" />
<ClInclude Include="..\vpx_dsp\postproc.h" />
<ClInclude Include="..\vpx_dsp\prob.h" />
<ClInclude Include="..\vpx_dsp\psnr.h" />
<ClInclude Include="..\vpx_dsp\quantize.h" />
<ClInclude Include="..\vpx_dsp\skin_detection.h" />
<ClInclude Include="..\vpx_dsp\txfm_common.h" />
<ClInclude Include="..\vpx_dsp\variance.h" />
<ClInclude Include="..\vpx_dsp\vpx_convolve.h" />
<ClInclude Include="..\vpx_dsp\vpx_dsp_common.h" />
<ClInclude Include="..\vpx_dsp\vpx_filter.h" />
<ClInclude Include="..\vpx_dsp\x86\bitdepth_conversion_avx2.h" />
<ClInclude Include="..\vpx_dsp\x86\bitdepth_conversion_sse2.h" />
<ClInclude Include="..\vpx_dsp\x86\convolve.h" />
<ClInclude Include="..\vpx_dsp\x86\convolve_avx2.h" />
<ClInclude Include="..\vpx_dsp\x86\convolve_sse2.h" />
<ClInclude Include="..\vpx_dsp\x86\convolve_ssse3.h" />
<ClInclude Include="..\vpx_dsp\x86\fwd_dct32x32_impl_avx2.h" />
<ClInclude Include="..\vpx_dsp\x86\fwd_dct32x32_impl_sse2.h" />
<ClInclude Include="..\vpx_dsp\x86\fwd_txfm_impl_sse2.h" />
<ClInclude Include="..\vpx_dsp\x86\fwd_txfm_sse2.h" />
<ClInclude Include="..\vpx_dsp\x86\highbd_inv_txfm_sse2.h" />
<ClInclude Include="..\vpx_dsp\x86\highbd_inv_txfm_sse4.h" />
<ClInclude Include="..\vpx_dsp\x86\inv_txfm_sse2.h" />
<ClInclude Include="..\vpx_dsp\x86\inv_txfm_ssse3.h" />
<ClInclude Include="..\vpx_dsp\x86\mem_sse2.h" />
<ClInclude Include="..\vpx_dsp\x86\quantize_sse2.h" />
<ClInclude Include="..\vpx_dsp\x86\quantize_ssse3.h" />
<ClInclude Include="..\vpx_dsp\x86\transpose_sse2.h" />
<ClInclude Include="..\vpx_dsp\x86\txfm_common_sse2.h" />
<ClInclude Include="..\vpx_mem\include\vpx_mem_intrnl.h" />
<ClInclude Include="..\vpx_mem\vpx_mem.h" />
<ClInclude Include="..\vpx_ports\bitops.h" />
<ClInclude Include="..\vpx_ports\compiler_attributes.h" />
<ClInclude Include="..\vpx_ports\emmintrin_compat.h" />
<ClInclude Include="..\vpx_ports\mem.h" />
<ClInclude Include="..\vpx_ports\mem_ops.h" />
<ClInclude Include="..\vpx_ports\mem_ops_aligned.h" />
<ClInclude Include="..\vpx_ports\static_assert.h" />
<ClInclude Include="..\vpx_ports\system_state.h" />
<ClInclude Include="..\vpx_ports\vpx_once.h" />
<ClInclude Include="..\vpx_ports\vpx_timer.h" />
<ClInclude Include="..\vpx_ports\x86.h" />
<ClInclude Include="..\vpx_scale\vpx_scale.h" />
<ClInclude Include="..\vpx_scale\yv12config.h" />
<ClInclude Include="..\vpx_util\endian_inl.h" />
<ClInclude Include="..\vpx_util\vpx_atomics.h" />
<ClInclude Include="..\vpx_util\vpx_thread.h" />
<ClInclude Include="..\vpx_util\vpx_timestamp.h" />
<ClInclude Include="..\vpx_util\vpx_write_yuv_frame.h" />
<ClInclude Include="vpx_config.h" />
<ClInclude Include="vpx_version.h" />
<ClInclude Include="x86\vp8_rtcd.h" />
<ClInclude Include="x86\vp9_rtcd.h" />
<ClInclude Include="x86\vpx_dsp_rtcd.h" />
<ClInclude Include="x86\vpx_scale_rtcd.h" />
<ClInclude Include="x86_64\vp8_rtcd.h" />
<ClInclude Include="x86_64\vp9_rtcd.h" />
<ClInclude Include="x86_64\vpx_dsp_rtcd.h" />
<ClInclude Include="x86_64\vpx_scale_rtcd.h" />
</ItemGroup>
<ItemGroup>
<ClCompile Include="..\vp8\common\alloccommon.c" />
<ClCompile Include="..\vp8\common\blockd.c" />
<ClCompile Include="..\vp8\common\debugmodes.c" />
<ClCompile Include="..\vp8\common\dequantize.c" />
<ClCompile Include="..\vp8\common\entropy.c" />
<ClCompile Include="..\vp8\common\entropymode.c" />
<ClCompile Include="..\vp8\common\entropymv.c" />
<ClCompile Include="..\vp8\common\extend.c" />
<ClCompile Include="..\vp8\common\filter.c" />
<ClCompile Include="..\vp8\common\findnearmv.c" />
<ClCompile Include="..\vp8\common\generic\systemdependent.c" />
<ClCompile Include="..\vp8\common\idctllm.c" />
<ClCompile Include="..\vp8\common\idct_blk.c" />
<ClCompile Include="..\vp8\common\loopfilter_filters.c" />
<ClCompile Include="..\vp8\common\mbpitch.c" />
<ClCompile Include="..\vp8\common\mfqe.c" />
<ClCompile Include="..\vp8\common\modecont.c" />
<ClCompile Include="..\vp8\common\postproc.c" />
<ClCompile Include="..\vp8\common\quant_common.c" />
<ClCompile Include="..\vp8\common\reconinter.c" />
<ClCompile Include="..\vp8\common\reconintra.c" />
<ClCompile Include="..\vp8\common\reconintra4x4.c" />
<ClCompile Include="..\vp8\common\rtcd.c" />
<ClCompile Include="..\vp8\common\setupintrarecon.c" />
<ClCompile Include="..\vp8\common\swapyv12buffer.c" />
<ClCompile Include="..\vp8\common\treecoder.c" />
<ClCompile Include="..\vp8\common\vp8_loopfilter.c" />
<ClCompile Include="..\vp8\common\vp8_skin_detection.c" />
<ClCompile Include="..\vp8\common\x86\bilinear_filter_sse2.c" />
<ClCompile Include="..\vp8\common\x86\idct_blk_mmx.c" />
<ClCompile Include="..\vp8\common\x86\idct_blk_sse2.c" />
<ClCompile Include="..\vp8\common\x86\loopfilter_x86.c" />
<ClCompile Include="..\vp8\common\x86\vp8_asm_stubs.c" />
<ClCompile Include="..\vp8\decoder\dboolhuff.c" />
<ClCompile Include="..\vp8\decoder\decodeframe.c" />
<ClCompile Include="..\vp8\decoder\decodemv.c" />
<ClCompile Include="..\vp8\decoder\detokenize.c" />
<ClCompile Include="..\vp8\decoder\onyxd_if.c" />
<ClCompile Include="..\vp8\decoder\threading.c" />
<ClCompile Include="..\vp8\encoder\bitstream.c" />
<ClCompile Include="..\vp8\encoder\boolhuff.c" />
<ClCompile Include="..\vp8\encoder\copy_c.c" />
<ClCompile Include="..\vp8\encoder\dct.c" />
<ClCompile Include="..\vp8\encoder\denoising.c" />
<ClCompile Include="..\vp8\encoder\encodeframe.c" />
<ClCompile Include="..\vp8\encoder\encodeintra.c" />
<ClCompile Include="..\vp8\encoder\encodemb.c" />
<ClCompile Include="..\vp8\encoder\encodemv.c" />
<ClCompile Include="..\vp8\encoder\ethreading.c" />
<ClCompile Include="..\vp8\encoder\firstpass.c" />
<ClCompile Include="..\vp8\encoder\lookahead.c" />
<ClCompile Include="..\vp8\encoder\mcomp.c" />
<ClCompile Include="..\vp8\encoder\modecosts.c" />
<ClCompile Include="..\vp8\encoder\onyx_if.c" />
<ClCompile Include="..\vp8\encoder\pickinter.c" />
<ClCompile Include="..\vp8\encoder\picklpf.c" />
<ClCompile Include="..\vp8\encoder\ratectrl.c" />
<ClCompile Include="..\vp8\encoder\rdopt.c" />
<ClCompile Include="..\vp8\encoder\segmentation.c" />
<ClCompile Include="..\vp8\encoder\temporal_filter.c" />
<ClCompile Include="..\vp8\encoder\tokenize.c" />
<ClCompile Include="..\vp8\encoder\treewriter.c" />
<ClCompile Include="..\vp8\encoder\vp8_quantize.c" />
<ClCompile Include="..\vp8\encoder\x86\denoising_sse2.c" />
<ClCompile Include="..\vp8\encoder\x86\quantize_sse4.c" />
<ClCompile Include="..\vp8\encoder\x86\vp8_enc_stubs_sse2.c" />
<ClCompile Include="..\vp8\encoder\x86\vp8_quantize_sse2.c" />
<ClCompile Include="..\vp8\encoder\x86\vp8_quantize_ssse3.c" />
<ClCompile Include="..\vp8\vp8_cx_iface.c" />
<ClCompile Include="..\vp8\vp8_dx_iface.c" />
<ClCompile Include="..\vp9\common\vp9_alloccommon.c" />
<ClCompile Include="..\vp9\common\vp9_blockd.c" />
<ClCompile Include="..\vp9\common\vp9_common_data.c" />
<ClCompile Include="..\vp9\common\vp9_debugmodes.c" />
<ClCompile Include="..\vp9\common\vp9_entropy.c" />
<ClCompile Include="..\vp9\common\vp9_entropymode.c" />
<ClCompile Include="..\vp9\common\vp9_entropymv.c" />
<ClCompile Include="..\vp9\common\vp9_filter.c" />
<ClCompile Include="..\vp9\common\vp9_frame_buffers.c" />
<ClCompile Include="..\vp9\common\vp9_idct.c" />
<ClCompile Include="..\vp9\common\vp9_loopfilter.c" />
<ClCompile Include="..\vp9\common\vp9_mvref_common.c" />
<ClCompile Include="..\vp9\common\vp9_pred_common.c" />
<ClCompile Include="..\vp9\common\vp9_quant_common.c" />
<ClCompile Include="..\vp9\common\vp9_reconinter.c" />
<ClCompile Include="..\vp9\common\vp9_reconintra.c" />
<ClCompile Include="..\vp9\common\vp9_rtcd.c" />
<ClCompile Include="..\vp9\common\vp9_scale.c" />
<ClCompile Include="..\vp9\common\vp9_scan.c" />
<ClCompile Include="..\vp9\common\vp9_seg_common.c" />
<ClCompile Include="..\vp9\common\vp9_thread_common.c" />
<ClCompile Include="..\vp9\common\vp9_tile_common.c" />
<ClCompile Include="..\vp9\common\x86\vp9_highbd_iht16x16_add_sse4.c" />
<ClCompile Include="..\vp9\common\x86\vp9_highbd_iht4x4_add_sse4.c" />
<ClCompile Include="..\vp9\common\x86\vp9_highbd_iht8x8_add_sse4.c" />
<ClCompile Include="..\vp9\common\x86\vp9_idct_intrin_sse2.c" />
<ClCompile Include="..\vp9\decoder\vp9_decodeframe.c" />
<ClCompile Include="..\vp9\decoder\vp9_decodemv.c" />
<ClCompile Include="..\vp9\decoder\vp9_decoder.c" />
<ClCompile Include="..\vp9\decoder\vp9_detokenize.c" />
<ClCompile Include="..\vp9\decoder\vp9_dsubexp.c" />
<ClCompile Include="..\vp9\decoder\vp9_job_queue.c" />
<ClCompile Include="..\vp9\encoder\vp9_alt_ref_aq.c" />
<ClCompile Include="..\vp9\encoder\vp9_aq_360.c" />
<ClCompile Include="..\vp9\encoder\vp9_aq_complexity.c" />
<ClCompile Include="..\vp9\encoder\vp9_aq_cyclicrefresh.c" />
<ClCompile Include="..\vp9\encoder\vp9_aq_variance.c" />
<ClCompile Include="..\vp9\encoder\vp9_bitstream.c" />
<ClCompile Include="..\vp9\encoder\vp9_blockiness.c" />
<ClCompile Include="..\vp9\encoder\vp9_context_tree.c" />
<ClCompile Include="..\vp9\encoder\vp9_cost.c" />
<ClCompile Include="..\vp9\encoder\vp9_dct.c" />
<ClCompile Include="..\vp9\encoder\vp9_encodeframe.c" />
<ClCompile Include="..\vp9\encoder\vp9_encodemb.c" />
<ClCompile Include="..\vp9\encoder\vp9_encodemv.c" />
<ClCompile Include="..\vp9\encoder\vp9_encoder.c" />
<ClCompile Include="..\vp9\encoder\vp9_ethread.c" />
<ClCompile Include="..\vp9\encoder\vp9_ext_ratectrl.c" />
<ClCompile Include="..\vp9\encoder\vp9_extend.c" />
<ClCompile Include="..\vp9\encoder\vp9_firstpass.c" />
<ClCompile Include="..\vp9\encoder\vp9_frame_scale.c" />
<ClCompile Include="..\vp9\encoder\vp9_lookahead.c" />
<ClCompile Include="..\vp9\encoder\vp9_mbgraph.c" />
<ClCompile Include="..\vp9\encoder\vp9_mcomp.c" />
<ClCompile Include="..\vp9\encoder\vp9_multi_thread.c" />
<ClCompile Include="..\vp9\encoder\vp9_noise_estimate.c" />
<ClCompile Include="..\vp9\encoder\vp9_picklpf.c" />
<ClCompile Include="..\vp9\encoder\vp9_pickmode.c" />
<ClCompile Include="..\vp9\encoder\vp9_quantize.c" />
<ClCompile Include="..\vp9\encoder\vp9_ratectrl.c" />
<ClCompile Include="..\vp9\encoder\vp9_rd.c" />
<ClCompile Include="..\vp9\encoder\vp9_rdopt.c" />
<ClCompile Include="..\vp9\encoder\vp9_resize.c" />
<ClCompile Include="..\vp9\encoder\vp9_segmentation.c" />
<ClCompile Include="..\vp9\encoder\vp9_skin_detection.c" />
<ClCompile Include="..\vp9\encoder\vp9_speed_features.c" />
<ClCompile Include="..\vp9\encoder\vp9_subexp.c" />
<ClCompile Include="..\vp9\encoder\vp9_svc_layercontext.c" />
<ClCompile Include="..\vp9\encoder\vp9_temporal_filter.c" />
<ClCompile Include="..\vp9\encoder\vp9_tokenize.c" />
<ClCompile Include="..\vp9\encoder\vp9_tpl_model.c" />
<ClCompile Include="..\vp9\encoder\vp9_treewriter.c" />
<ClCompile Include="..\vp9\encoder\x86\highbd_temporal_filter_sse4.c" />
<ClCompile Include="..\vp9\encoder\x86\temporal_filter_sse4.c" />
<ClCompile Include="..\vp9\encoder\x86\vp9_dct_intrin_sse2.c" />
<ClCompile Include="..\vp9\encoder\x86\vp9_error_avx2.c" />
<ClCompile Include="..\vp9\encoder\x86\vp9_frame_scale_ssse3.c" />
<ClCompile Include="..\vp9\encoder\x86\vp9_highbd_block_error_intrin_sse2.c" />
<ClCompile Include="..\vp9\encoder\x86\vp9_quantize_avx2.c" />
<ClCompile Include="..\vp9\encoder\x86\vp9_quantize_sse2.c" />
<ClCompile Include="..\vp9\encoder\x86\vp9_quantize_ssse3.c" />
<ClCompile Include="..\vp9\vp9_cx_iface.c" />
<ClCompile Include="..\vp9\vp9_dx_iface.c" />
<ClCompile Include="..\vp9\vp9_iface_common.c" />
<ClCompile Include="..\vpx\src\vpx_codec.c" />
<ClCompile Include="..\vpx\src\vpx_decoder.c" />
<ClCompile Include="..\vpx\src\vpx_encoder.c" />
<ClCompile Include="..\vpx\src\vpx_image.c" />
<ClCompile Include="..\vpx_dsp\add_noise.c" />
<ClCompile Include="..\vpx_dsp\avg.c" />
<ClCompile Include="..\vpx_dsp\bitreader.c" />
<ClCompile Include="..\vpx_dsp\bitreader_buffer.c" />
<ClCompile Include="..\vpx_dsp\bitwriter.c" />
<ClCompile Include="..\vpx_dsp\bitwriter_buffer.c" />
<ClCompile Include="..\vpx_dsp\deblock.c" />
<ClCompile Include="..\vpx_dsp\fwd_txfm.c" />
<ClCompile Include="..\vpx_dsp\intrapred.c" />
<ClCompile Include="..\vpx_dsp\inv_txfm.c" />
<ClCompile Include="..\vpx_dsp\loopfilter.c" />
<ClCompile Include="..\vpx_dsp\prob.c" />
<ClCompile Include="..\vpx_dsp\psnr.c" />
<ClCompile Include="..\vpx_dsp\quantize.c" />
<ClCompile Include="..\vpx_dsp\sad.c" />
<ClCompile Include="..\vpx_dsp\sse.c" />
<ClCompile Include="..\vpx_dsp\skin_detection.c" />
<ClCompile Include="..\vpx_dsp\subtract.c" />
<ClCompile Include="..\vpx_dsp\sum_squares.c" />
<ClCompile Include="..\vpx_dsp\variance.c" />
<ClCompile Include="..\vpx_dsp\vpx_convolve.c" />
<ClCompile Include="..\vpx_dsp\vpx_dsp_rtcd.c" />
<ClCompile Include="..\vpx_dsp\x86\avg_intrin_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\avg_intrin_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\avg_pred_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\avg_pred_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\fwd_txfm_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\fwd_txfm_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_convolve_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_idct16x16_add_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_idct16x16_add_sse4.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_idct32x32_add_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_idct32x32_add_sse4.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_idct4x4_add_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_idct4x4_add_sse4.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_idct8x8_add_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_idct8x8_add_sse4.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_intrapred_intrin_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_intrapred_intrin_ssse3.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_loopfilter_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_sad4d_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_sad_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_quantize_intrin_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_quantize_intrin_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_variance_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\inv_txfm_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\inv_txfm_ssse3.c" />
<ClCompile Include="..\vpx_dsp\x86\inv_txfm_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\loopfilter_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\loopfilter_sse2.c">
<ObjectFileName>$(IntDir)\vpx_%(Filename).obj</ObjectFileName>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\post_proc_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\quantize_avx.c" />
<ClCompile Include="..\vpx_dsp\x86\quantize_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\quantize_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\quantize_ssse3.c" />
<ClCompile Include="..\vpx_dsp\x86\sad4d_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\sad4d_avx512.c">
<ExcludedFromBuild Condition="'$(VisualStudioVersion)' == '14.0'">true</ExcludedFromBuild>
<ExcludedFromBuild Condition="'$(VisualStudioVersion)' == '12.0'">true</ExcludedFromBuild>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\sad_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\sse_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\sse_sse4.c" />
<ClCompile Include="..\vpx_dsp\x86\subtract_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\sum_squares_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\variance_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\variance_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\vpx_subpixel_4t_intrin_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\vpx_subpixel_8t_intrin_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\vpx_subpixel_8t_intrin_ssse3.c" />
<ClCompile Include="..\vpx_mem\vpx_mem.c" />
<ClCompile Include="..\vpx_scale\generic\gen_scalers.c" />
<ClCompile Include="..\vpx_scale\generic\vpx_scale.c" />
<ClCompile Include="..\vpx_scale\generic\yv12config.c" />
<ClCompile Include="..\vpx_scale\generic\yv12extend.c" />
<ClCompile Include="..\vpx_scale\vpx_scale_rtcd.c" />
<ClCompile Include="..\vpx_util\vpx_thread.c" />
<ClCompile Include="..\vpx_util\vpx_write_yuv_frame.c" />
<ClCompile Include="dce_defs.c" />
<ClCompile Include="vpx_config.c">
<ObjectFileName>$(IntDir)\vpx_config_.obj</ObjectFileName>
</ClCompile>
</ItemGroup>
<ItemGroup>
<None Include="libvpx.def" />
</ItemGroup>
<ItemGroup>
<YASM Include="..\vp8\common\x86\dequantize_mmx.asm" />
<YASM Include="..\vp8\common\x86\idctllm_mmx.asm" />
<YASM Include="..\vp8\common\x86\idctllm_sse2.asm" />
<YASM Include="..\vp8\common\x86\iwalsh_sse2.asm" />
<YASM Include="..\vp8\common\x86\loopfilter_block_sse2_x86_64.asm">
<ExcludedFromBuild Condition="'$(Platform)'=='Win32'">true</ExcludedFromBuild>
</YASM>
<YASM Include="..\vp8\common\x86\loopfilter_sse2.asm" />
<YASM Include="..\vp8\common\x86\mfqe_sse2.asm" />
<YASM Include="..\vp8\common\x86\recon_mmx.asm" />
<YASM Include="..\vp8\common\x86\recon_sse2.asm" />
<YASM Include="..\vp8\common\x86\subpixel_mmx.asm" />
<YASM Include="..\vp8\common\x86\subpixel_sse2.asm" />
<YASM Include="..\vp8\common\x86\subpixel_ssse3.asm" />
<YASM Include="..\vp8\encoder\x86\block_error_sse2.asm" />
<YASM Include="..\vp8\encoder\x86\copy_sse2.asm" />
<YASM Include="..\vp8\encoder\x86\copy_sse3.asm" />
<YASM Include="..\vp8\encoder\x86\dct_sse2.asm" />
<YASM Include="..\vp8\encoder\x86\fwalsh_sse2.asm" />
<YASM Include="..\vp8\encoder\x86\temporal_filter_apply_sse2.asm" />
<YASM Include="..\vp9\common\x86\vp9_mfqe_sse2.asm" />
<YASM Include="..\vp9\encoder\x86\vp9_dct_sse2.asm" />
<YASM Include="..\vp9\encoder\x86\vp9_error_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\add_noise_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\avg_ssse3_x86_64.asm">
<ExcludedFromBuild Condition="'$(Platform)'=='Win32'">true</ExcludedFromBuild>
</YASM>
<YASM Include="..\vpx_dsp\x86\bitdepth_conversion_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\deblock_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\fwd_txfm_ssse3_x86_64.asm">
<ExcludedFromBuild Condition="'$(Platform)'=='Win32'">true</ExcludedFromBuild>
</YASM>
<YASM Include="..\vpx_dsp\x86\highbd_intrapred_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\highbd_sad4d_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\highbd_sad_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\highbd_subpel_variance_impl_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\highbd_variance_impl_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\intrapred_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\intrapred_ssse3.asm" />
<YASM Include="..\vpx_dsp\x86\inv_wht_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\sad4d_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\sad_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\ssim_opt_x86_64.asm">
<ExcludedFromBuild Condition="'$(Platform)'=='Win32'">true</ExcludedFromBuild>
</YASM>
<YASM Include="..\vpx_dsp\x86\subpel_variance_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\subtract_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\vpx_convolve_copy_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\vpx_high_subpixel_8t_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\vpx_high_subpixel_bilinear_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\vpx_subpixel_8t_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\vpx_subpixel_8t_ssse3.asm" />
<YASM Include="..\vpx_dsp\x86\vpx_subpixel_bilinear_sse2.asm" />
<YASM Include="..\vpx_dsp\x86\vpx_subpixel_bilinear_ssse3.asm" />
<YASM Include="..\vpx_ports\emms_mmx.asm" />
<YASM Include="..\vpx_ports\float_control_word.asm" />
</ItemGroup>
</Project>
@@ -0,0 +1,513 @@
<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" ToolsVersion="12.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<PropertyGroup Label="Globals">
<ProjectGuid>{A293418A-603A-4119-B7B4-1E6204606BA9}</ProjectGuid>
<RootNamespace>vpx</RootNamespace>
</PropertyGroup>
<ImportGroup Label="PropertySheets">
<Import Project="smp_winrt.props" />
<Import Project="libvpx_files.props" />
</ImportGroup>
<ImportGroup Label="ExtensionSettings">
<Import Project="$(VCTargetsPath)\BuildCustomizations\yasm.props" />
</ImportGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='DebugWinRT|Win32'">
<ClCompile>
<AdditionalIncludeDirectories>.\;..\;.\x86;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<DisableSpecificWarnings>4752;4703;%(DisableSpecificWarnings)</DisableSpecificWarnings>
</ClCompile>
<Lib>
<AdditionalOptions>/ignore:4221 %(AdditionalOptions)</AdditionalOptions>
</Lib>
<PostBuildEvent>
<Command>mkdir "$(OutDir)"\include
mkdir "$(OutDir)"\include\vpx
copy ..\vpx\*.h "$(OutDir)"\include\vpx
mkdir "$(OutDir)"\licenses
copy ..\LICENSE "$(OutDir)"\licenses\libvpx.txt</Command>
</PostBuildEvent>
<PreBuildEvent>
<Command>if exist ..\vpx_config.h (
del ..\vpx_config.h
)
if exist ..\vpx_config.c (
del ..\vpx_config.c
)
if exist ..\vpx_config.asm (
del ..\vpx_config.asm
)
if exist ..\vpx_version.h (
del ..\vpx_version.h
)
if exist ..\vpx_scale_rtcd.h (
del ..\vpx_scale_rtcd.h
)
if exist ..\vpx_dsp_rtcd.h (
del ..\vpx_dsp_rtcd.h
)
if exist ..\vp8_rtcd.h (
del ..\vp8_rtcd.h
)
if exist ..\vp9_rtcd.h (
del ..\vp9_rtcd.h
)
if exist "$(OutDir)"\include\vpx (
rd /s /q "$(OutDir)"\include\vpx
cd ../
cd $(ProjectDir)
)</Command>
</PreBuildEvent>
<YASM>
<IncludePaths>$(ProjectDir);$(ProjectDir)\..\;%(IncludePaths)</IncludePaths>
</YASM>
<CustomBuildStep>
<Message>Custom Clean Step</Message>
</CustomBuildStep>
<CustomBuildStep>
<Outputs>force_clean</Outputs>
<Command>if exist "$(OutDir)"\include\vpx (
rmdir /s /q "$(OutDir)"\include\vpx
)
if exist "$(OutDir)"\licenses\libvpx.txt (
del /f /q "$(OutDir)"\licenses\libvpx.txt
)</Command>
</CustomBuildStep>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='DebugWinRT|x64'">
<ClCompile>
<AdditionalIncludeDirectories>.\;..\;.\x86_64;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<DisableSpecificWarnings>4752;4703;%(DisableSpecificWarnings)</DisableSpecificWarnings>
</ClCompile>
<Lib>
<AdditionalOptions>/ignore:4221 %(AdditionalOptions)</AdditionalOptions>
</Lib>
<PostBuildEvent>
<Command>mkdir "$(OutDir)"\include
mkdir "$(OutDir)"\include\vpx
copy ..\vpx\*.h "$(OutDir)"\include\vpx
mkdir "$(OutDir)"\licenses
copy ..\LICENSE "$(OutDir)"\licenses\libvpx.txt</Command>
</PostBuildEvent>
<PreBuildEvent>
<Command>if exist ..\vpx_config.h (
del ..\vpx_config.h
)
if exist ..\vpx_config.c (
del ..\vpx_config.c
)
if exist ..\vpx_config.asm (
del ..\vpx_config.asm
)
if exist ..\vpx_version.h (
del ..\vpx_version.h
)
if exist ..\vpx_scale_rtcd.h (
del ..\vpx_scale_rtcd.h
)
if exist ..\vpx_dsp_rtcd.h (
del ..\vpx_dsp_rtcd.h
)
if exist ..\vp8_rtcd.h (
del ..\vp8_rtcd.h
)
if exist ..\vp9_rtcd.h (
del ..\vp9_rtcd.h
)
if exist "$(OutDir)"\include\vpx (
rd /s /q "$(OutDir)"\include\vpx
cd ../
cd $(ProjectDir)
)</Command>
</PreBuildEvent>
<YASM>
<IncludePaths>$(ProjectDir);$(ProjectDir)\..\;%(IncludePaths)</IncludePaths>
</YASM>
<CustomBuildStep>
<Message>Custom Clean Step</Message>
</CustomBuildStep>
<CustomBuildStep>
<Outputs>force_clean</Outputs>
<Command>if exist "$(OutDir)"\include\vpx (
rmdir /s /q "$(OutDir)"\include\vpx
)
if exist "$(OutDir)"\licenses\libvpx.txt (
del /f /q "$(OutDir)"\licenses\libvpx.txt
)</Command>
</CustomBuildStep>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='DebugDLLWinRT|Win32'">
<ClCompile>
<AdditionalIncludeDirectories>.\;..\;.\x86;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<DisableSpecificWarnings>4752;4703;%(DisableSpecificWarnings)</DisableSpecificWarnings>
</ClCompile>
<Link>
<ModuleDefinitionFile>.\libvpx.def</ModuleDefinitionFile>
</Link>
<PostBuildEvent>
<Command>mkdir "$(OutDir)"\include
mkdir "$(OutDir)"\include\vpx
copy ..\vpx\*.h "$(OutDir)"\include\vpx
mkdir "$(OutDir)"\licenses
copy ..\LICENSE "$(OutDir)"\licenses\libvpx.txt</Command>
</PostBuildEvent>
<PreBuildEvent>
<Command>if exist ..\vpx_config.h (
del ..\vpx_config.h
)
if exist ..\vpx_config.c (
del ..\vpx_config.c
)
if exist ..\vpx_config.asm (
del ..\vpx_config.asm
)
if exist ..\vpx_version.h (
del ..\vpx_version.h
)
if exist ..\vpx_scale_rtcd.h (
del ..\vpx_scale_rtcd.h
)
if exist ..\vpx_dsp_rtcd.h (
del ..\vpx_dsp_rtcd.h
)
if exist ..\vp8_rtcd.h (
del ..\vp8_rtcd.h
)
if exist ..\vp9_rtcd.h (
del ..\vp9_rtcd.h
)
if exist "$(OutDir)"\include\vpx (
rd /s /q "$(OutDir)"\include\vpx
cd ../
cd $(ProjectDir)
)</Command>
</PreBuildEvent>
<YASM>
<IncludePaths>$(ProjectDir);$(ProjectDir)\..\;%(IncludePaths)</IncludePaths>
</YASM>
<CustomBuildStep>
<Message>Custom Clean Step</Message>
</CustomBuildStep>
<CustomBuildStep>
<Outputs>force_clean</Outputs>
<Command>if exist "$(OutDir)"\include\vpx (
rmdir /s /q "$(OutDir)"\include\vpx
)
if exist "$(OutDir)"\licenses\libvpx.txt (
del /f /q "$(OutDir)"\licenses\libvpx.txt
)</Command>
</CustomBuildStep>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='DebugDLLWinRT|x64'">
<ClCompile>
<AdditionalIncludeDirectories>.\;..\;.\x86_64;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<DisableSpecificWarnings>4752;4703;%(DisableSpecificWarnings)</DisableSpecificWarnings>
</ClCompile>
<Link>
<ModuleDefinitionFile>.\libvpx.def</ModuleDefinitionFile>
</Link>
<PostBuildEvent>
<Command>mkdir "$(OutDir)"\include
mkdir "$(OutDir)"\include\vpx
copy ..\vpx\*.h "$(OutDir)"\include\vpx
mkdir "$(OutDir)"\licenses
copy ..\LICENSE "$(OutDir)"\licenses\libvpx.txt</Command>
</PostBuildEvent>
<PreBuildEvent>
<Command>if exist ..\vpx_config.h (
del ..\vpx_config.h
)
if exist ..\vpx_config.c (
del ..\vpx_config.c
)
if exist ..\vpx_config.asm (
del ..\vpx_config.asm
)
if exist ..\vpx_version.h (
del ..\vpx_version.h
)
if exist ..\vpx_scale_rtcd.h (
del ..\vpx_scale_rtcd.h
)
if exist ..\vpx_dsp_rtcd.h (
del ..\vpx_dsp_rtcd.h
)
if exist ..\vp8_rtcd.h (
del ..\vp8_rtcd.h
)
if exist ..\vp9_rtcd.h (
del ..\vp9_rtcd.h
)
if exist "$(OutDir)"\include\vpx (
rd /s /q "$(OutDir)"\include\vpx
cd ../
cd $(ProjectDir)
)</Command>
</PreBuildEvent>
<YASM>
<IncludePaths>$(ProjectDir);$(ProjectDir)\..\;%(IncludePaths)</IncludePaths>
</YASM>
<CustomBuildStep>
<Message>Custom Clean Step</Message>
</CustomBuildStep>
<CustomBuildStep>
<Outputs>force_clean</Outputs>
<Command>if exist "$(OutDir)"\include\vpx (
rmdir /s /q "$(OutDir)"\include\vpx
)
if exist "$(OutDir)"\licenses\libvpx.txt (
del /f /q "$(OutDir)"\licenses\libvpx.txt
)</Command>
</CustomBuildStep>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseWinRT|Win32'">
<ClCompile>
<AdditionalIncludeDirectories>.\;..\;.\x86;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<DisableSpecificWarnings>4752;4703;%(DisableSpecificWarnings)</DisableSpecificWarnings>
</ClCompile>
<Lib>
<AdditionalOptions>/ignore:4221 %(AdditionalOptions)</AdditionalOptions>
</Lib>
<PostBuildEvent>
<Command>mkdir "$(OutDir)"\include
mkdir "$(OutDir)"\include\vpx
copy ..\vpx\*.h "$(OutDir)"\include\vpx
mkdir "$(OutDir)"\licenses
copy ..\LICENSE "$(OutDir)"\licenses\libvpx.txt</Command>
</PostBuildEvent>
<PreBuildEvent>
<Command>if exist ..\vpx_config.h (
del ..\vpx_config.h
)
if exist ..\vpx_config.c (
del ..\vpx_config.c
)
if exist ..\vpx_config.asm (
del ..\vpx_config.asm
)
if exist ..\vpx_version.h (
del ..\vpx_version.h
)
if exist ..\vpx_scale_rtcd.h (
del ..\vpx_scale_rtcd.h
)
if exist ..\vpx_dsp_rtcd.h (
del ..\vpx_dsp_rtcd.h
)
if exist ..\vp8_rtcd.h (
del ..\vp8_rtcd.h
)
if exist ..\vp9_rtcd.h (
del ..\vp9_rtcd.h
)
if exist "$(OutDir)"\include\vpx (
rd /s /q "$(OutDir)"\include\vpx
cd ../
cd $(ProjectDir)
)</Command>
</PreBuildEvent>
<YASM>
<IncludePaths>$(ProjectDir);$(ProjectDir)\..\;%(IncludePaths)</IncludePaths>
</YASM>
<CustomBuildStep>
<Message>Custom Clean Step</Message>
</CustomBuildStep>
<CustomBuildStep>
<Outputs>force_clean</Outputs>
<Command>if exist "$(OutDir)"\include\vpx (
rmdir /s /q "$(OutDir)"\include\vpx
)
if exist "$(OutDir)"\licenses\libvpx.txt (
del /f /q "$(OutDir)"\licenses\libvpx.txt
)</Command>
</CustomBuildStep>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseWinRT|x64'">
<ClCompile>
<AdditionalIncludeDirectories>.\;..\;.\x86_64;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<DisableSpecificWarnings>4752;4703;%(DisableSpecificWarnings)</DisableSpecificWarnings>
</ClCompile>
<Lib>
<AdditionalOptions>/ignore:4221 %(AdditionalOptions)</AdditionalOptions>
</Lib>
<PostBuildEvent>
<Command>mkdir "$(OutDir)"\include
mkdir "$(OutDir)"\include\vpx
copy ..\vpx\*.h "$(OutDir)"\include\vpx
mkdir "$(OutDir)"\licenses
copy ..\LICENSE "$(OutDir)"\licenses\libvpx.txt</Command>
</PostBuildEvent>
<PreBuildEvent>
<Command>if exist ..\vpx_config.h (
del ..\vpx_config.h
)
if exist ..\vpx_config.c (
del ..\vpx_config.c
)
if exist ..\vpx_config.asm (
del ..\vpx_config.asm
)
if exist ..\vpx_version.h (
del ..\vpx_version.h
)
if exist ..\vpx_scale_rtcd.h (
del ..\vpx_scale_rtcd.h
)
if exist ..\vpx_dsp_rtcd.h (
del ..\vpx_dsp_rtcd.h
)
if exist ..\vp8_rtcd.h (
del ..\vp8_rtcd.h
)
if exist ..\vp9_rtcd.h (
del ..\vp9_rtcd.h
)
if exist "$(OutDir)"\include\vpx (
rd /s /q "$(OutDir)"\include\vpx
cd ../
cd $(ProjectDir)
)</Command>
</PreBuildEvent>
<YASM>
<IncludePaths>$(ProjectDir);$(ProjectDir)\..\;%(IncludePaths)</IncludePaths>
</YASM>
<CustomBuildStep>
<Message>Custom Clean Step</Message>
</CustomBuildStep>
<CustomBuildStep>
<Outputs>force_clean</Outputs>
<Command>if exist "$(OutDir)"\include\vpx (
rmdir /s /q "$(OutDir)"\include\vpx
)
if exist "$(OutDir)"\licenses\libvpx.txt (
del /f /q "$(OutDir)"\licenses\libvpx.txt
)</Command>
</CustomBuildStep>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseDLLWinRT|Win32'">
<ClCompile>
<AdditionalIncludeDirectories>.\;..\;.\x86;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<DisableSpecificWarnings>4752;4703;%(DisableSpecificWarnings)</DisableSpecificWarnings>
</ClCompile>
<Link>
<ModuleDefinitionFile>.\libvpx.def</ModuleDefinitionFile>
</Link>
<PostBuildEvent>
<Command>mkdir "$(OutDir)"\include
mkdir "$(OutDir)"\include\vpx
copy ..\vpx\*.h "$(OutDir)"\include\vpx
mkdir "$(OutDir)"\licenses
copy ..\LICENSE "$(OutDir)"\licenses\libvpx.txt</Command>
</PostBuildEvent>
<PreBuildEvent>
<Command>if exist ..\vpx_config.h (
del ..\vpx_config.h
)
if exist ..\vpx_config.c (
del ..\vpx_config.c
)
if exist ..\vpx_config.asm (
del ..\vpx_config.asm
)
if exist ..\vpx_version.h (
del ..\vpx_version.h
)
if exist ..\vpx_scale_rtcd.h (
del ..\vpx_scale_rtcd.h
)
if exist ..\vpx_dsp_rtcd.h (
del ..\vpx_dsp_rtcd.h
)
if exist ..\vp8_rtcd.h (
del ..\vp8_rtcd.h
)
if exist ..\vp9_rtcd.h (
del ..\vp9_rtcd.h
)
if exist "$(OutDir)"\include\vpx (
rd /s /q "$(OutDir)"\include\vpx
cd ../
cd $(ProjectDir)
)</Command>
</PreBuildEvent>
<YASM>
<IncludePaths>$(ProjectDir);$(ProjectDir)\..\;%(IncludePaths)</IncludePaths>
</YASM>
<CustomBuildStep>
<Message>Custom Clean Step</Message>
</CustomBuildStep>
<CustomBuildStep>
<Outputs>force_clean</Outputs>
<Command>if exist "$(OutDir)"\include\vpx (
rmdir /s /q "$(OutDir)"\include\vpx
)
if exist "$(OutDir)"\licenses\libvpx.txt (
del /f /q "$(OutDir)"\licenses\libvpx.txt
)</Command>
</CustomBuildStep>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseDLLWinRT|x64'">
<ClCompile>
<AdditionalIncludeDirectories>.\;..\;.\x86_64;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<DisableSpecificWarnings>4752;4703;%(DisableSpecificWarnings)</DisableSpecificWarnings>
</ClCompile>
<Link>
<ModuleDefinitionFile>.\libvpx.def</ModuleDefinitionFile>
</Link>
<PostBuildEvent>
<Command>mkdir "$(OutDir)"\include
mkdir "$(OutDir)"\include\vpx
copy ..\vpx\*.h "$(OutDir)"\include\vpx
mkdir "$(OutDir)"\licenses
copy ..\LICENSE "$(OutDir)"\licenses\libvpx.txt</Command>
</PostBuildEvent>
<PreBuildEvent>
<Command>if exist ..\vpx_config.h (
del ..\vpx_config.h
)
if exist ..\vpx_config.c (
del ..\vpx_config.c
)
if exist ..\vpx_config.asm (
del ..\vpx_config.asm
)
if exist ..\vpx_version.h (
del ..\vpx_version.h
)
if exist ..\vpx_scale_rtcd.h (
del ..\vpx_scale_rtcd.h
)
if exist ..\vpx_dsp_rtcd.h (
del ..\vpx_dsp_rtcd.h
)
if exist ..\vp8_rtcd.h (
del ..\vp8_rtcd.h
)
if exist ..\vp9_rtcd.h (
del ..\vp9_rtcd.h
)
if exist "$(OutDir)"\include\vpx (
rd /s /q "$(OutDir)"\include\vpx
cd ../
cd $(ProjectDir)
)</Command>
</PreBuildEvent>
<YASM>
<IncludePaths>$(ProjectDir);$(ProjectDir)\..\;%(IncludePaths)</IncludePaths>
</YASM>
<CustomBuildStep>
<Message>Custom Clean Step</Message>
</CustomBuildStep>
<CustomBuildStep>
<Outputs>force_clean</Outputs>
<Command>if exist "$(OutDir)"\include\vpx (
rmdir /s /q "$(OutDir)"\include\vpx
)
if exist "$(OutDir)"\licenses\libvpx.txt (
del /f /q "$(OutDir)"\licenses\libvpx.txt
)</Command>
</CustomBuildStep>
</ItemDefinitionGroup>
<ImportGroup Label="ExtensionTargets">
<Import Project="$(VCTargetsPath)\BuildCustomizations\yasm.targets" />
</ImportGroup>
</Project>
File diff suppressed because it is too large
@@ -0,0 +1,59 @@
@ECHO OFF
SET PROJECT=libvpx
@REM Detect the newest available Windows SDK
CALL :GetWindowsSdkVer
@REM Open the project
%PROJECT%.sln
EXIT /B 0
:GetWindowsSdkVer
SET WindowsTargetPlatformVersion=
IF "%WindowsTargetPlatformVersion%"=="" CALL :GetWin10SdkVer
IF "%WindowsTargetPlatformVersion%"=="" CALL :GetWin81SdkVer
EXIT /B 0
:GetWin10SdkVer
CALL :GetWin10SdkVerHelper HKLM\SOFTWARE\Wow6432Node > nul 2>&1
IF errorlevel 1 CALL :GetWin10SdkVerHelper HKCU\SOFTWARE\Wow6432Node > nul 2>&1
IF errorlevel 1 CALL :GetWin10SdkVerHelper HKLM\SOFTWARE > nul 2>&1
IF errorlevel 1 CALL :GetWin10SdkVerHelper HKCU\SOFTWARE > nul 2>&1
IF errorlevel 1 EXIT /B 1
EXIT /B 0
:GetWin10SdkVerHelper
@REM Get Windows 10 SDK installed folder
FOR /F "tokens=1,2*" %%i IN ('reg query "%1\Microsoft\Microsoft SDKs\Windows\v10.0" /v "InstallationFolder"') DO (
IF "%%i"=="InstallationFolder" (
SET WindowsSdkDir=%%~k
)
)
@REM Get Windows 10 SDK version number
SETLOCAL enableDelayedExpansion
IF NOT "%WindowsSdkDir%"=="" FOR /f %%i IN ('dir "%WindowsSdkDir%include\" /b /ad-h /on') DO (
@REM Skip if Windows.h is not found in %%i\um. This would indicate that only the UCRT MSIs were
@REM installed for this Windows SDK version.
IF EXIST "%WindowsSdkDir%include\%%i\um\Windows.h" (
SET result=%%i
IF "!result:~0,3!"=="10." (
SET SDK=!result!
IF "!result!"=="%VSCMD_ARG_WINSDK%" SET findSDK=1
)
)
)
IF "%findSDK%"=="1" SET SDK=%VSCMD_ARG_WINSDK%
ENDLOCAL & SET WindowsTargetPlatformVersion=%SDK%
IF "%WindowsTargetPlatformVersion%"=="" (
EXIT /B 1
)
EXIT /B 0
:GetWin81SdkVer
SET WindowsTargetPlatformVersion=8.1
EXIT /B 0
@@ -1,31 +1,31 @@
This is a small list of steps in order to build libvpx into a msvc DLL and lib file.
The project contains Release and Debug builds for static lib files (Debug/Release)
as well as dynamic shared dll files (DebugDLL/ReleaseDLL).
Choose whichever project configuration meets your requirements.
*** Building with YASM ***
In order to build libvpx using msvc you must first download and install YASM.
YASM is required to compile all libvpx assembly files.
1) Download yasm for Visual Studio from here:
http://yasm.tortall.net/Download.html
Currently only up to VS2010 is supported on the web page so just download that.
2) Follow the instructions found within the downloaded archive for installing YASM
Note: With newer versions of VS, the BuildCustomizations path should be the one specific to the VS version you are using.
For instance, the path for Visual Studio 2013 is:
C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V120\BuildCustomizations
and the path for Visual Studio 2015 would be:
C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V140\BuildCustomizations
and the path for Visual Studio 2017 would be:
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\IDE\VC\VCTargets\BuildCustomizations
The exact location can vary based on installation paths and can be found within Visual Studio in the $(VCTargetsPath) macro.
3) In order to use version 1.3.0 of vsyasm you will also have to fix an error in the distributed build customizations
a) Open vsyasm.props
b) Replace the 1 occurrence of $(Platform) with win$(PlatformArchitecture)
This is a short list of steps for building libvpx into an MSVC dll and/or lib file.
The project contains Release and Debug builds for static lib files (Debug/Release)
as well as dynamic shared dll files (DebugDLL/ReleaseDLL). In addition to the standard
Windows dll/lib configurations mentioned above, there are equivalent variants that
can be used to compile for WinRT/UWP (these configurations have a WinRT suffix).
Each configuration is available for both 32-bit (x86) and 64-bit (x64) compilation.
Choose whichever project configuration meets your requirements.
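As a rough illustration (not part of the project), the configuration names described above can be enumerated as the product of build type, library type, WinRT suffix and platform. The function name `configuration_names` is hypothetical; the generated names follow the pattern used in the project files, for example ReleaseDLLWinRT|x64.

```python
from itertools import product


def configuration_names(include_winrt=True):
    """Return every 'Configuration|Platform' pair described in the readme."""
    builds = ["Debug", "Release"]
    linkages = ["", "DLL"]  # static lib (no suffix) or dynamic dll
    targets = ["", "WinRT"] if include_winrt else [""]
    platforms = ["Win32", "x64"]  # 32-bit (x86) and 64-bit (x64)
    return [
        f"{build}{linkage}{target}|{platform}"
        for build, linkage, target, platform in product(
            builds, linkages, targets, platforms
        )
    ]


if __name__ == "__main__":
    for name in configuration_names():
        print(name)
```

Passing one of these names to Visual Studio's configuration selector picks the matching build; the sketch only lists the names and does not invoke any build tool.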
The project configurations can be built with several different Windows SDK versions.
By default they use the lowest SDK version that is available for Visual Studio
2013 and up (the 8.1 SDK). However, a batch file is also included
(libvpx_with_latest_sdk.bat) that auto-detects the newest SDK
installed on the host machine and then opens the project using that as the compilation SDK.
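The selection performed by the batch file can be sketched in Python as follows. This is only an approximation under stated assumptions: the function name `pick_sdk_version`, its `preferred` parameter (standing in for the VSCMD_ARG_WINSDK environment variable) and the sample version folders are hypothetical, and the registry lookup that locates the SDK install folder is left out.

```python
import os
import tempfile


def pick_sdk_version(include_dir, preferred=None):
    """Pick a Windows 10 SDK version folder under include_dir, else fall back to "8.1"."""
    candidates = []
    if os.path.isdir(include_dir):
        # Name order mirrors the batch file's sorted directory listing;
        # it is not a numeric version comparison.
        for name in sorted(os.listdir(include_dir)):
            header = os.path.join(include_dir, name, "um", "Windows.h")
            # A folder without um/Windows.h only holds the UCRT, so it is skipped.
            if name.startswith("10.") and os.path.isfile(header):
                candidates.append(name)
    if preferred in candidates:
        return preferred  # an explicitly requested SDK wins if it is installed
    return candidates[-1] if candidates else "8.1"


if __name__ == "__main__":
    # Hypothetical SDK layout: two complete installs and one UCRT-only folder.
    with tempfile.TemporaryDirectory() as root:
        for version, complete in [
            ("10.0.10240.0", True),
            ("10.0.19041.0", True),
            ("10.0.22621.0", False),
        ]:
            os.makedirs(os.path.join(root, version, "um"))
            if complete:
                open(os.path.join(root, version, "um", "Windows.h"), "w").close()
        print(pick_sdk_version(root))
```

The fallback to "8.1" corresponds to the default SDK mentioned above; a real lookup would also need the install folder from the registry, which this sketch takes as an argument instead.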
When using the WinRT/UWP project configurations, the projects automatically compile for
the default application target of the version of Visual Studio being used:
VS 2013: 8.1
VS 2015: 8.1
VS 2017+: 10.0.10240.0
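The table above can be expressed as a small lookup. The sketch below is illustrative only: the function name `default_winrt_target` is hypothetical, and it assumes the usual internal version numbers (12.0 for VS 2013, 14.0 for VS 2015, 15.0 and later for VS 2017+).

```python
def default_winrt_target(visual_studio_version):
    """Map a VisualStudioVersion string such as "15.0" to the default target listed above."""
    major = int(visual_studio_version.split(".")[0])
    if major < 12:
        # The readme only covers Visual Studio 2013 (12.0) and newer.
        raise ValueError("unsupported Visual Studio version: " + visual_studio_version)
    # 12.0 (VS 2013) and 14.0 (VS 2015) target 8.1; 15.0 (VS 2017) and later
    # use the Windows 10 target from the table.
    return "8.1" if major <= 14 else "10.0.10240.0"


if __name__ == "__main__":
    for version in ("12.0", "14.0", "15.0", "17.0"):
        print(version, "->", default_winrt_target(version))
```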
*** Building with YASM ***
In order to build libvpx using msvc you must first download and install YASM.
YASM is required to compile all assembly files.
1) Visual Studio YASM integration can be downloaded from https://github.com/ShiftMediaProject/VSYASM/releases/latest
2) Once downloaded simply follow the install instructions included in the download.
@@ -0,0 +1,350 @@
<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<ItemGroup Label="ProjectConfigurations">
<ProjectConfiguration Include="DebugDLL|Win32">
<Configuration>DebugDLL</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="DebugDLL|x64">
<Configuration>DebugDLL</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Debug|Win32">
<Configuration>Debug</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Debug|x64">
<Configuration>Debug</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="ReleaseDLL|Win32">
<Configuration>ReleaseDLL</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="ReleaseDLL|x64">
<Configuration>ReleaseDLL</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Release|Win32">
<Configuration>Release</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Release|x64">
<Configuration>Release</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
</ItemGroup>
<PropertyGroup Label="Globals">
<PlatformToolset Condition="'$(VisualStudioVersion)' == '17.0'">v143</PlatformToolset>
<PlatformToolset Condition="'$(VisualStudioVersion)' == '16.0'">v142</PlatformToolset>
<PlatformToolset Condition="'$(VisualStudioVersion)' == '15.0'">v141</PlatformToolset>
<PlatformToolset Condition="'$(VisualStudioVersion)' == '14.0'">v140</PlatformToolset>
<PlatformToolset Condition="'$(VisualStudioVersion)' == '12.0'">v120</PlatformToolset>
<CharacterSet>Unicode</CharacterSet>
<WindowsTargetPlatformVersion Condition="'$(WindowsTargetPlatformVersion)' != ''">$(WindowsTargetPlatformVersion)</WindowsTargetPlatformVersion>
<WindowsTargetPlatformVersion Condition="'$(VisualStudioVersion)'&gt;= '16.0'">10.0</WindowsTargetPlatformVersion>
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.Default.props" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'" Label="Configuration">
<ConfigurationType>StaticLibrary</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'" Label="Configuration">
<ConfigurationType>StaticLibrary</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='DebugDLL|Win32'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='DebugDLL|x64'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'" Label="Configuration">
<ConfigurationType>StaticLibrary</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="Configuration">
<ConfigurationType>StaticLibrary</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseDLL|Win32'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<WholeProgramOptimization>true</WholeProgramOptimization>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseDLL|x64'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<WholeProgramOptimization>true</WholeProgramOptimization>
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.props" />
<ImportGroup Label="ExtensionSettings">
</ImportGroup>
<ImportGroup Label="PropertySheets">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<PropertyGroup Label="UserMacros" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<TargetName>lib$(RootNamespace)d</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
<TargetName>lib$(RootNamespace)d</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='DebugDLL|Win32'">
<TargetName>$(RootNamespace)d</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='DebugDLL|x64'">
<TargetName>$(RootNamespace)d</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<TargetName>lib$(RootNamespace)</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
<TargetName>lib$(RootNamespace)</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseDLL|Win32'">
<TargetName>$(RootNamespace)</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseDLL|x64'">
<TargetName>$(RootNamespace)</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>Disabled</Optimization>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<PreprocessorDefinitions>_WINDOWS;WIN32;_WIN32_WINNT=0x0601;_DEBUG;_LIB;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<ProgramDataBaseFileName>$(OutDir)\lib\x86\$(TargetName).pdb</ProgramDataBaseFileName>
<MinimalRebuild>false</MinimalRebuild>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Lib>
<OutputFile>$(OutDir)\lib\x86\$(TargetName)$(TargetExt)</OutputFile>
<TargetMachine>MachineX86</TargetMachine>
<SubSystem>Console</SubSystem>
<AdditionalLibraryDirectories>$(OutDir)\lib\x86\;$(ProjectDir)\..\..\prebuilt\lib\x86\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
</Lib>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>Disabled</Optimization>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<PreprocessorDefinitions>_WINDOWS;WIN32;_WIN32_WINNT=0x0601;_DEBUG;_LIB;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<ProgramDataBaseFileName>$(OutDir)\lib\x64\$(TargetName).pdb</ProgramDataBaseFileName>
<MinimalRebuild>false</MinimalRebuild>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Lib>
<OutputFile>$(OutDir)\lib\x64\$(TargetName)$(TargetExt)</OutputFile>
<TargetMachine>MachineX64</TargetMachine>
<SubSystem>Console</SubSystem>
<AdditionalLibraryDirectories>$(OutDir)\lib\x64\;$(ProjectDir)\..\..\prebuilt\lib\x64\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
</Lib>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='DebugDLL|Win32'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>Disabled</Optimization>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<PreprocessorDefinitions>_WINDOWS;WIN32;_WIN32_WINNT=0x0601;_DEBUG;_USRDLL;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<BufferSecurityCheck>true</BufferSecurityCheck>
<ProgramDataBaseFileName>$(IntDir)$(TargetName).pdb</ProgramDataBaseFileName>
<MinimalRebuild>false</MinimalRebuild>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Link>
<OutputFile>$(OutDir)\bin\x86\$(TargetName)$(TargetExt)</OutputFile>
<ProgramDatabaseFile>$(OutDir)\lib\x86\$(TargetName).pdb</ProgramDatabaseFile>
<SubSystem>Console</SubSystem>
<ImportLibrary>$(OutDir)\lib\x86\$(TargetName).lib</ImportLibrary>
<ProfileGuidedDatabase>$(IntDir)\$(TargetName).pgd</ProfileGuidedDatabase>
<LargeAddressAware>true</LargeAddressAware>
<AdditionalLibraryDirectories>$(OutDir)\lib\x86\;$(ProjectDir)\..\..\prebuilt\lib\x86\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
<GenerateDebugInformation>true</GenerateDebugInformation>
<MinimumRequiredVersion>6.1</MinimumRequiredVersion>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='DebugDLL|x64'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>Disabled</Optimization>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<PreprocessorDefinitions>_WINDOWS;WIN32;_WIN32_WINNT=0x0601;_DEBUG;_USRDLL;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<ProgramDataBaseFileName>$(IntDir)$(TargetName).pdb</ProgramDataBaseFileName>
<MinimalRebuild>false</MinimalRebuild>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Link>
<OutputFile>$(OutDir)\bin\x64\$(TargetName)$(TargetExt)</OutputFile>
<ProgramDatabaseFile>$(OutDir)\lib\x64\$(TargetName).pdb</ProgramDatabaseFile>
<SubSystem>Console</SubSystem>
<ImportLibrary>$(OutDir)\lib\x64\$(TargetName).lib</ImportLibrary>
<ProfileGuidedDatabase>$(IntDir)\$(TargetName).pgd</ProfileGuidedDatabase>
<AdditionalLibraryDirectories>$(OutDir)\lib\x64\;$(ProjectDir)\..\..\prebuilt\lib\x64\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
<GenerateDebugInformation>true</GenerateDebugInformation>
<MinimumRequiredVersion>6.1</MinimumRequiredVersion>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>MaxSpeed</Optimization>
<FunctionLevelLinking>true</FunctionLevelLinking>
<IntrinsicFunctions>true</IntrinsicFunctions>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<InlineFunctionExpansion>AnySuitable</InlineFunctionExpansion>
<FavorSizeOrSpeed>Speed</FavorSizeOrSpeed>
<EnableFiberSafeOptimizations>true</EnableFiberSafeOptimizations>
<OmitFramePointers>true</OmitFramePointers>
<StringPooling>true</StringPooling>
<PreprocessorDefinitions>_WINDOWS;WIN32;_WIN32_WINNT=0x0601;NDEBUG;_LIB;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<InterproceduralOptimization>SingleFile</InterproceduralOptimization>
<ProgramDataBaseFileName>$(OutDir)\lib\x86\$(TargetName).pdb</ProgramDataBaseFileName>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Lib>
<OutputFile>$(OutDir)\lib\x86\$(TargetName)$(TargetExt)</OutputFile>
<TargetMachine>MachineX86</TargetMachine>
<SubSystem>Console</SubSystem>
<AdditionalLibraryDirectories>$(OutDir)\lib\x86\;$(ProjectDir)\..\..\prebuilt\lib\x86\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
</Lib>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>MaxSpeed</Optimization>
<FunctionLevelLinking>true</FunctionLevelLinking>
<IntrinsicFunctions>true</IntrinsicFunctions>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<InlineFunctionExpansion>AnySuitable</InlineFunctionExpansion>
<FavorSizeOrSpeed>Speed</FavorSizeOrSpeed>
<EnableFiberSafeOptimizations>true</EnableFiberSafeOptimizations>
<OmitFramePointers>true</OmitFramePointers>
<StringPooling>true</StringPooling>
<PreprocessorDefinitions>_WINDOWS;WIN32;_WIN32_WINNT=0x0601;NDEBUG;_LIB;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<InterproceduralOptimization>SingleFile</InterproceduralOptimization>
<ProgramDataBaseFileName>$(OutDir)\lib\x64\$(TargetName).pdb</ProgramDataBaseFileName>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Lib>
<OutputFile>$(OutDir)\lib\x64\$(TargetName)$(TargetExt)</OutputFile>
<TargetMachine>MachineX64</TargetMachine>
<SubSystem>Console</SubSystem>
<AdditionalLibraryDirectories>$(OutDir)\lib\x64\;$(ProjectDir)\..\..\prebuilt\lib\x64\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
</Lib>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseDLL|Win32'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>MaxSpeed</Optimization>
<FunctionLevelLinking>true</FunctionLevelLinking>
<IntrinsicFunctions>true</IntrinsicFunctions>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<InlineFunctionExpansion>AnySuitable</InlineFunctionExpansion>
<FavorSizeOrSpeed>Speed</FavorSizeOrSpeed>
<EnableFiberSafeOptimizations>true</EnableFiberSafeOptimizations>
<OmitFramePointers>true</OmitFramePointers>
<StringPooling>true</StringPooling>
<PreprocessorDefinitions>_WINDOWS;WIN32;_WIN32_WINNT=0x0601;NDEBUG;_USRDLL;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<ProgramDataBaseFileName>$(IntDir)$(TargetName).pdb</ProgramDataBaseFileName>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Link>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<OptimizeReferences>true</OptimizeReferences>
<OutputFile>$(OutDir)\bin\x86\$(TargetName)$(TargetExt)</OutputFile>
<ProgramDatabaseFile>$(OutDir)\lib\x86\$(TargetName).pdb</ProgramDatabaseFile>
<SubSystem>Console</SubSystem>
<ImportLibrary>$(OutDir)\lib\x86\$(TargetName).lib</ImportLibrary>
<ProfileGuidedDatabase>$(IntDir)\$(TargetName).pgd</ProfileGuidedDatabase>
<LargeAddressAware>true</LargeAddressAware>
<AdditionalLibraryDirectories>$(OutDir)\lib\x86\;$(ProjectDir)\..\..\prebuilt\lib\x86\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
<GenerateDebugInformation>true</GenerateDebugInformation>
<MinimumRequiredVersion>6.1</MinimumRequiredVersion>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseDLL|x64'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>MaxSpeed</Optimization>
<FunctionLevelLinking>true</FunctionLevelLinking>
<IntrinsicFunctions>true</IntrinsicFunctions>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<InlineFunctionExpansion>AnySuitable</InlineFunctionExpansion>
<FavorSizeOrSpeed>Speed</FavorSizeOrSpeed>
<EnableFiberSafeOptimizations>true</EnableFiberSafeOptimizations>
<OmitFramePointers>true</OmitFramePointers>
<StringPooling>true</StringPooling>
<PreprocessorDefinitions>_WINDOWS;WIN32;_WIN32_WINNT=0x0601;NDEBUG;_USRDLL;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<ProgramDataBaseFileName>$(IntDir)$(TargetName).pdb</ProgramDataBaseFileName>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Link>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<OptimizeReferences>true</OptimizeReferences>
<OutputFile>$(OutDir)\bin\x64\$(TargetName)$(TargetExt)</OutputFile>
<ProgramDatabaseFile>$(OutDir)\lib\x64\$(TargetName).pdb</ProgramDatabaseFile>
<SubSystem>Console</SubSystem>
<ImportLibrary>$(OutDir)\lib\x64\$(TargetName).lib</ImportLibrary>
<ProfileGuidedDatabase>$(IntDir)\$(TargetName).pgd</ProfileGuidedDatabase>
<AdditionalLibraryDirectories>$(OutDir)\lib\x64\;$(ProjectDir)\..\..\prebuilt\lib\x64\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
<GenerateDebugInformation>true</GenerateDebugInformation>
<MinimumRequiredVersion>6.1</MinimumRequiredVersion>
</Link>
</ItemDefinitionGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
<ImportGroup Label="ExtensionTargets">
</ImportGroup>
<ItemGroup />
</Project>
@@ -0,0 +1,392 @@
<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<ItemGroup Label="ProjectConfigurations">
<ProjectConfiguration Include="DebugDLLWinRT|Win32">
<Configuration>DebugDLLWinRT</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="DebugDLLWinRT|x64">
<Configuration>DebugDLLWinRT</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="DebugWinRT|Win32">
<Configuration>DebugWinRT</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="DebugWinRT|x64">
<Configuration>DebugWinRT</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="ReleaseDLLWinRT|Win32">
<Configuration>ReleaseDLLWinRT</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="ReleaseDLLWinRT|x64">
<Configuration>ReleaseDLLWinRT</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="ReleaseWinRT|Win32">
<Configuration>ReleaseWinRT</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="ReleaseWinRT|x64">
<Configuration>ReleaseWinRT</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
</ItemGroup>
<PropertyGroup Label="Globals">
<WindowsTargetPlatformVersion Condition="'$(WindowsTargetPlatformVersion)' != ''">$(WindowsTargetPlatformVersion)</WindowsTargetPlatformVersion>
<WindowsTargetPlatformVersion Condition="'$(VisualStudioVersion)'&gt;= '16.0'">10.0</WindowsTargetPlatformVersion>
<PlatformToolset Condition="'$(VisualStudioVersion)' == '17.0'">v143</PlatformToolset>
<PlatformToolset Condition="'$(VisualStudioVersion)' == '16.0'">v142</PlatformToolset>
<PlatformToolset Condition="'$(VisualStudioVersion)' == '15.0'">v141</PlatformToolset>
<PlatformToolset Condition="'$(VisualStudioVersion)' == '14.0'">v140</PlatformToolset>
<PlatformToolset Condition="'$(VisualStudioVersion)' == '12.0'">v120</PlatformToolset>
<CharacterSet>Unicode</CharacterSet>
<DefaultLanguage>en-US</DefaultLanguage>
<WindowsAppContainer>true</WindowsAppContainer>
<AppContainerApplication>true</AppContainerApplication>
<MinimumVisualStudioVersion>12.0</MinimumVisualStudioVersion>
<ApplicationType>Windows Store</ApplicationType>
<ApplicationTypeRevision Condition="'$(VisualStudioVersion)' == '17.0'">10.0</ApplicationTypeRevision>
<ApplicationTypeRevision Condition="'$(VisualStudioVersion)' == '16.0'">10.0</ApplicationTypeRevision>
<ApplicationTypeRevision Condition="'$(VisualStudioVersion)' == '15.0'">10.0</ApplicationTypeRevision>
<ApplicationTypeRevision Condition="'$(VisualStudioVersion)' == '14.0'">8.1</ApplicationTypeRevision>
<ApplicationTypeRevision Condition="'$(VisualStudioVersion)' == '12.0'">8.1</ApplicationTypeRevision>
<WindowsTargetPlatformVersion Condition="'$(ApplicationTypeRevision)|$(WindowsTargetPlatformVersion)' == '10.0|'">10.0.10240.0</WindowsTargetPlatformVersion>
<WindowsTargetPlatformMinVersion Condition="'$(ApplicationTypeRevision)' == '10.0'">10.0.10240.0</WindowsTargetPlatformMinVersion>
<TargetPlatformMinVersion Condition="'$(ApplicationTypeRevision)' == '10.0'">10.0.10240.0</TargetPlatformMinVersion>
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.Default.props" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='DebugWinRT|Win32'" Label="Configuration">
<ConfigurationType>StaticLibrary</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='DebugWinRT|x64'" Label="Configuration">
<ConfigurationType>StaticLibrary</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='DebugDLLWinRT|Win32'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='DebugDLLWinRT|x64'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseWinRT|Win32'" Label="Configuration">
<ConfigurationType>StaticLibrary</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseWinRT|x64'" Label="Configuration">
<ConfigurationType>StaticLibrary</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseDLLWinRT|Win32'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<WholeProgramOptimization>true</WholeProgramOptimization>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseDLLWinRT|x64'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<WholeProgramOptimization>true</WholeProgramOptimization>
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.props" />
<ImportGroup Label="ExtensionSettings">
</ImportGroup>
<ImportGroup Label="PropertySheets">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<PropertyGroup Label="UserMacros" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='DebugWinRT|Win32'">
<TargetName>lib$(RootNamespace)d_winrt</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='DebugWinRT|x64'">
<TargetName>lib$(RootNamespace)d_winrt</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='DebugDLLWinRT|Win32'">
<TargetName>$(RootNamespace)d_winrt</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='DebugDLLWinRT|x64'">
<TargetName>$(RootNamespace)d_winrt</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseWinRT|Win32'">
<TargetName>lib$(RootNamespace)_winrt</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseWinRT|x64'">
<TargetName>lib$(RootNamespace)_winrt</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseDLLWinRT|Win32'">
<TargetName>$(RootNamespace)_winrt</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseDLLWinRT|x64'">
<TargetName>$(RootNamespace)_winrt</TargetName>
<OutDir>$(ProjectDir)..\..\..\msvc\</OutDir>
<IntDir>$(ProjectDir)obj\$(Configuration)\$(Platform)\$(ProjectName)\</IntDir>
<GeneratedFilesDir>$(ProjectDir)obj\Generated</GeneratedFilesDir>
<CustomBuildAfterTargets>Clean</CustomBuildAfterTargets>
<MSBuildWarningsAsMessages>MSB8012</MSBuildWarningsAsMessages>
</PropertyGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='DebugWinRT|Win32'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>Disabled</Optimization>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<PreprocessorDefinitions>WIN32;_WINDOWS;_DEBUG;_LIB;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<ProgramDataBaseFileName>$(OutDir)\lib\x86\$(TargetName).pdb</ProgramDataBaseFileName>
<MinimalRebuild>false</MinimalRebuild>
<PrecompiledHeader>NotUsing</PrecompiledHeader>
<CompileAsWinRT>false</CompileAsWinRT>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Lib>
<OutputFile>$(OutDir)\lib\x86\$(TargetName)$(TargetExt)</OutputFile>
<TargetMachine>MachineX86</TargetMachine>
<SubSystem>Console</SubSystem>
<AdditionalLibraryDirectories>$(OutDir)\lib\x86\;$(ProjectDir)\..\..\prebuilt\lib\x86\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
</Lib>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='DebugWinRT|x64'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>Disabled</Optimization>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<PreprocessorDefinitions>WIN32;_WINDOWS;_DEBUG;_LIB;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<ProgramDataBaseFileName>$(OutDir)\lib\x64\$(TargetName).pdb</ProgramDataBaseFileName>
<MinimalRebuild>false</MinimalRebuild>
<PrecompiledHeader>NotUsing</PrecompiledHeader>
<CompileAsWinRT>false</CompileAsWinRT>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Lib>
<OutputFile>$(OutDir)\lib\x64\$(TargetName)$(TargetExt)</OutputFile>
<TargetMachine>MachineX64</TargetMachine>
<SubSystem>Console</SubSystem>
<AdditionalLibraryDirectories>$(OutDir)\lib\x64\;$(ProjectDir)\..\..\prebuilt\lib\x64\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
</Lib>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='DebugDLLWinRT|Win32'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>Disabled</Optimization>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<PreprocessorDefinitions>WIN32;_WINDOWS;_DEBUG;_USRDLL;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<BufferSecurityCheck>true</BufferSecurityCheck>
<ProgramDataBaseFileName>$(IntDir)$(TargetName).pdb</ProgramDataBaseFileName>
<MinimalRebuild>false</MinimalRebuild>
<PrecompiledHeader>NotUsing</PrecompiledHeader>
<CompileAsWinRT>false</CompileAsWinRT>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Link>
<OutputFile>$(OutDir)\bin\x86\$(TargetName)$(TargetExt)</OutputFile>
<ProgramDatabaseFile>$(OutDir)\lib\x86\$(TargetName).pdb</ProgramDatabaseFile>
<SubSystem>Console</SubSystem>
<ImportLibrary>$(OutDir)\lib\x86\$(TargetName).lib</ImportLibrary>
<ProfileGuidedDatabase>$(IntDir)\$(TargetName).pgd</ProfileGuidedDatabase>
<LargeAddressAware>true</LargeAddressAware>
<AdditionalLibraryDirectories>$(OutDir)\lib\x86\;$(ProjectDir)\..\..\prebuilt\lib\x86\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
<GenerateDebugInformation>true</GenerateDebugInformation>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '10.0'">10.0</MinimumRequiredVersion>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '8.1'">8.1</MinimumRequiredVersion>
<GenerateWindowsMetadata>false</GenerateWindowsMetadata>
<WindowsMetadataFile>$(OutDir)\lib\x86\$(RootNamespace).winmd</WindowsMetadataFile>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='DebugDLLWinRT|x64'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>Disabled</Optimization>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<PreprocessorDefinitions>WIN32;_WINDOWS;_DEBUG;_USRDLL;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<ProgramDataBaseFileName>$(IntDir)$(TargetName).pdb</ProgramDataBaseFileName>
<MinimalRebuild>false</MinimalRebuild>
<PrecompiledHeader>NotUsing</PrecompiledHeader>
<CompileAsWinRT>false</CompileAsWinRT>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Link>
<OutputFile>$(OutDir)\bin\x64\$(TargetName)$(TargetExt)</OutputFile>
<ProgramDatabaseFile>$(OutDir)\lib\x64\$(TargetName).pdb</ProgramDatabaseFile>
<SubSystem>Console</SubSystem>
<ImportLibrary>$(OutDir)\lib\x64\$(TargetName).lib</ImportLibrary>
<ProfileGuidedDatabase>$(IntDir)\$(TargetName).pgd</ProfileGuidedDatabase>
<AdditionalLibraryDirectories>$(OutDir)\lib\x64\;$(ProjectDir)\..\..\prebuilt\lib\x64\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
<GenerateDebugInformation>true</GenerateDebugInformation>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '10.0'">10.0</MinimumRequiredVersion>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '8.1'">8.1</MinimumRequiredVersion>
<GenerateWindowsMetadata>false</GenerateWindowsMetadata>
<WindowsMetadataFile>$(OutDir)\lib\x64\$(RootNamespace).winmd</WindowsMetadataFile>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseWinRT|Win32'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>MaxSpeed</Optimization>
<FunctionLevelLinking>true</FunctionLevelLinking>
<IntrinsicFunctions>true</IntrinsicFunctions>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<InlineFunctionExpansion>AnySuitable</InlineFunctionExpansion>
<FavorSizeOrSpeed>Speed</FavorSizeOrSpeed>
<EnableFiberSafeOptimizations>true</EnableFiberSafeOptimizations>
<OmitFramePointers>true</OmitFramePointers>
<StringPooling>true</StringPooling>
<PreprocessorDefinitions>WIN32;_WINDOWS;NDEBUG;_LIB;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<InterproceduralOptimization>SingleFile</InterproceduralOptimization>
<ProgramDataBaseFileName>$(OutDir)\lib\x86\$(TargetName).pdb</ProgramDataBaseFileName>
<PrecompiledHeader>NotUsing</PrecompiledHeader>
<CompileAsWinRT>false</CompileAsWinRT>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Lib>
<OutputFile>$(OutDir)\lib\x86\$(TargetName)$(TargetExt)</OutputFile>
<TargetMachine>MachineX86</TargetMachine>
<SubSystem>Console</SubSystem>
<AdditionalLibraryDirectories>$(OutDir)\lib\x86\;$(ProjectDir)\..\..\prebuilt\lib\x86\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
</Lib>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseWinRT|x64'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>MaxSpeed</Optimization>
<FunctionLevelLinking>true</FunctionLevelLinking>
<IntrinsicFunctions>true</IntrinsicFunctions>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<InlineFunctionExpansion>AnySuitable</InlineFunctionExpansion>
<FavorSizeOrSpeed>Speed</FavorSizeOrSpeed>
<EnableFiberSafeOptimizations>true</EnableFiberSafeOptimizations>
<OmitFramePointers>true</OmitFramePointers>
<StringPooling>true</StringPooling>
<PreprocessorDefinitions>WIN32;_WINDOWS;NDEBUG;_LIB;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<InterproceduralOptimization>SingleFile</InterproceduralOptimization>
<ProgramDataBaseFileName>$(OutDir)\lib\x64\$(TargetName).pdb</ProgramDataBaseFileName>
<AdditionalLibraryDirectories>$(OutDir)\lib\x64\;$(ProjectDir)\..\..\prebuilt\lib\x64\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
<PrecompiledHeader>NotUsing</PrecompiledHeader>
<CompileAsWinRT>false</CompileAsWinRT>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Lib>
<OutputFile>$(OutDir)\lib\x64\$(TargetName)$(TargetExt)</OutputFile>
<TargetMachine>MachineX64</TargetMachine>
<SubSystem>Console</SubSystem>
<AdditionalLibraryDirectories>$(OutDir)\lib\x64\;$(ProjectDir)\..\..\prebuilt\lib\x64\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
</Lib>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseDLLWinRT|Win32'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>MaxSpeed</Optimization>
<FunctionLevelLinking>true</FunctionLevelLinking>
<IntrinsicFunctions>true</IntrinsicFunctions>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<InlineFunctionExpansion>AnySuitable</InlineFunctionExpansion>
<FavorSizeOrSpeed>Speed</FavorSizeOrSpeed>
<EnableFiberSafeOptimizations>true</EnableFiberSafeOptimizations>
<OmitFramePointers>true</OmitFramePointers>
<StringPooling>true</StringPooling>
<PreprocessorDefinitions>WIN32;_WINDOWS;NDEBUG;_USRDLL;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<ProgramDataBaseFileName>$(IntDir)$(TargetName).pdb</ProgramDataBaseFileName>
<PrecompiledHeader>NotUsing</PrecompiledHeader>
<CompileAsWinRT>false</CompileAsWinRT>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Link>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<OptimizeReferences>true</OptimizeReferences>
<OutputFile>$(OutDir)\bin\x86\$(TargetName)$(TargetExt)</OutputFile>
<ProgramDatabaseFile>$(OutDir)\lib\x86\$(TargetName).pdb</ProgramDatabaseFile>
<SubSystem>Console</SubSystem>
<ImportLibrary>$(OutDir)\lib\x86\$(TargetName).lib</ImportLibrary>
<ProfileGuidedDatabase>$(IntDir)\$(TargetName).pgd</ProfileGuidedDatabase>
<LargeAddressAware>true</LargeAddressAware>
<AdditionalLibraryDirectories>$(OutDir)\lib\x86\;$(ProjectDir)\..\..\prebuilt\lib\x86\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
<GenerateDebugInformation>true</GenerateDebugInformation>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '10.0'">10.0</MinimumRequiredVersion>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '8.1'">8.1</MinimumRequiredVersion>
<GenerateWindowsMetadata>false</GenerateWindowsMetadata>
<WindowsMetadataFile>$(OutDir)\lib\x86\$(RootNamespace).winmd</WindowsMetadataFile>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='ReleaseDLLWinRT|x64'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>MaxSpeed</Optimization>
<FunctionLevelLinking>true</FunctionLevelLinking>
<IntrinsicFunctions>true</IntrinsicFunctions>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<InlineFunctionExpansion>AnySuitable</InlineFunctionExpansion>
<FavorSizeOrSpeed>Speed</FavorSizeOrSpeed>
<EnableFiberSafeOptimizations>true</EnableFiberSafeOptimizations>
<OmitFramePointers>true</OmitFramePointers>
<StringPooling>true</StringPooling>
<PreprocessorDefinitions>WIN32;_WINDOWS;NDEBUG;_USRDLL;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;_CRT_SECURE_NO_WARNINGS;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<AdditionalIncludeDirectories>$(OutDir)\include;$(ProjectDir)\..\..\prebuilt\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<ProgramDataBaseFileName>$(IntDir)$(TargetName).pdb</ProgramDataBaseFileName>
<PrecompiledHeader>NotUsing</PrecompiledHeader>
<CompileAsWinRT>false</CompileAsWinRT>
<TreatSpecificWarningsAsErrors>4113;%(TreatSpecificWarningsAsErrors)</TreatSpecificWarningsAsErrors>
</ClCompile>
<Link>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<OptimizeReferences>true</OptimizeReferences>
<OutputFile>$(OutDir)\bin\x64\$(TargetName)$(TargetExt)</OutputFile>
<ProgramDatabaseFile>$(OutDir)\lib\x64\$(TargetName).pdb</ProgramDatabaseFile>
<SubSystem>Console</SubSystem>
<ImportLibrary>$(OutDir)\lib\x64\$(TargetName).lib</ImportLibrary>
<ProfileGuidedDatabase>$(IntDir)\$(TargetName).pgd</ProfileGuidedDatabase>
<AdditionalLibraryDirectories>$(OutDir)\lib\x64\;$(ProjectDir)\..\..\prebuilt\lib\x64\;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
<GenerateDebugInformation>true</GenerateDebugInformation>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '10.0'">10.0</MinimumRequiredVersion>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '8.1'">8.1</MinimumRequiredVersion>
<GenerateWindowsMetadata>false</GenerateWindowsMetadata>
<WindowsMetadataFile>$(OutDir)\lib\x64\$(RootNamespace).winmd</WindowsMetadataFile>
</Link>
</ItemDefinitionGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
<ImportGroup Label="ExtensionTargets">
</ImportGroup>
<ItemGroup />
</Project>
+28 -12
@@ -1,17 +1,24 @@
ARCH_ARM equ 0
ARCH_MIPS equ 0
VPX_ARCH_ARM equ 0
VPX_ARCH_AARCH64 equ 0
VPX_ARCH_MIPS equ 0
%ifidn __OUTPUT_FORMAT__,win64
ARCH_X86 equ 0
ARCH_X86_64 equ 1
VPX_ARCH_X86 equ 0
VPX_ARCH_X86_64 equ 1
%elifidn __OUTPUT_FORMAT__,x64
ARCH_X86 equ 0
ARCH_X86_64 equ 1
VPX_ARCH_X86 equ 0
VPX_ARCH_X86_64 equ 1
%elifidn __OUTPUT_FORMAT__,win32
ARCH_X86 equ 1
ARCH_X86_64 equ 0
VPX_ARCH_X86 equ 1
VPX_ARCH_X86_64 equ 0
%endif
HAVE_NEON equ 0
VPX_ARCH_PPC equ 0
VPX_ARCH_LOONGARCH equ 0
HAVE_NEON_ASM equ 0
HAVE_NEON equ 0
HAVE_NEON_DOTPROD equ 0
HAVE_NEON_I8MM equ 0
HAVE_SVE equ 0
HAVE_SVE2 equ 0
HAVE_MIPS32 equ 0
HAVE_DSPR2 equ 0
HAVE_MSA equ 0
@@ -24,6 +31,11 @@ HAVE_SSSE3 equ 1
HAVE_SSE4_1 equ 1
HAVE_AVX equ 1
HAVE_AVX2 equ 1
HAVE_AVX512 equ 1
HAVE_VSX equ 0
HAVE_MMI equ 0
HAVE_LSX equ 0
HAVE_LASX equ 0
HAVE_VPX_PORTS equ 1
HAVE_PTHREAD_H equ 0
HAVE_UNISTD_H equ 0
@@ -39,7 +51,7 @@ CONFIG_GCOV equ 0
CONFIG_RVCT equ 0
CONFIG_GCC equ 0
CONFIG_MSVS equ 1
CONFIG_PIC equ 0
CONFIG_PIC equ 1
CONFIG_BIG_ENDIAN equ 0
CONFIG_CODEC_SRCS equ 0
CONFIG_DEBUG_LIBS equ 0
@@ -81,7 +93,11 @@ CONFIG_VP9_HIGHBITDEPTH equ 1
CONFIG_BETTER_HW_COMPATIBILITY equ 0
CONFIG_EXPERIMENTAL equ 0
CONFIG_SIZE_LIMIT equ 0
CONFIG_SPATIAL_SVC equ 0
CONFIG_ALWAYS_ADJUST_BPM equ 0
CONFIG_BITSTREAM_DEBUG equ 0
CONFIG_MISMATCH_DEBUG equ 0
CONFIG_FP_MB_STATS equ 0
CONFIG_EMULATE_HARDWARE equ 0
CONFIG_MISC_FIXES equ 0
CONFIG_NON_GREEDY_MV equ 0
CONFIG_RATE_CTRL equ 0
CONFIG_COLLECT_COMPONENT_TIMING equ 0
+1 -1
@@ -6,5 +6,5 @@
/* in the file PATENTS. All contributing project authors may */
/* be found in the AUTHORS file in the root of the source tree. */
#include "vpx/vpx_codec.h"
static const char* const cfg = "--target=x86-win32-vs14 --disable-unit-tests --disable-docs --disable-install-bins --disable-shared --enable-static --enable-vp8 --enable-vp9 --enable-static-msvcrt --enable-vp9-highbitdepth";
static const char* const cfg = "--target=x86-win32-vs15 --disable-unit-tests --disable-docs --disable-install-bins --disable-shared --enable-static --enable-vp8 --enable-vp9 --enable-static-msvcrt --enable-vp9-highbitdepth";
const char *vpx_codec_build_config(void) {return cfg;}
+34 -14
@@ -9,18 +9,25 @@
#ifndef VPX_CONFIG_H
#define VPX_CONFIG_H
#define RESTRICT
#define INLINE __forceinline
#define ARCH_ARM 0
#define ARCH_MIPS 0
#if defined( __x86_64 ) || defined( _M_X64 )
#define ARCH_X86 0
#define ARCH_X86_64 1
#define INLINE __inline
#define VPX_ARCH_ARM 0
#define VPX_ARCH_AARCH64 0
#define VPX_ARCH_MIPS 0
#if defined(__x86_64) || defined(_M_X64)
#define VPX_ARCH_X86 0
#define VPX_ARCH_X86_64 1
#else
#define ARCH_X86 1
#define ARCH_X86_64 0
#define VPX_ARCH_X86 1
#define VPX_ARCH_X86_64 0
#endif
#define HAVE_NEON 0
#define VPX_ARCH_PPC 0
#define VPX_ARCH_LOONGARCH 0
#define HAVE_NEON_ASM 0
#define HAVE_NEON 0
#define HAVE_NEON_DOTPROD 0
#define HAVE_NEON_I8MM 0
#define HAVE_SVE 0
#define HAVE_SVE2 0
#define HAVE_MIPS32 0
#define HAVE_DSPR2 0
#define HAVE_MSA 0
@@ -33,6 +40,15 @@
#define HAVE_SSE4_1 1
#define HAVE_AVX 1
#define HAVE_AVX2 1
#if _MSC_VER >= 1910
#define HAVE_AVX512 1
#else
#define HAVE_AVX512 0
#endif
#define HAVE_VSX 0
#define HAVE_MMI 0
#define HAVE_LSX 0
#define HAVE_LASX 0
#define HAVE_VPX_PORTS 1
#define HAVE_PTHREAD_H 0
#define HAVE_UNISTD_H 0
@@ -48,7 +64,7 @@
#define CONFIG_RVCT 0
#define CONFIG_GCC 0
#define CONFIG_MSVS 1
#define CONFIG_PIC 0
#define CONFIG_PIC 1
#define CONFIG_BIG_ENDIAN 0
#define CONFIG_CODEC_SRCS 0
#define CONFIG_DEBUG_LIBS 0
@@ -68,9 +84,9 @@
#define CONFIG_ENCODERS 1
#define CONFIG_DECODERS 1
#if _DLL
#define CONFIG_STATIC_MSVCRT 0
#else
#define CONFIG_STATIC_MSVCRT 1
#else
#define CONFIG_STATIC_MSVCRT 0
#endif
#define CONFIG_SPATIAL_RESAMPLING 1
#define CONFIG_REALTIME_ONLY 0
@@ -99,8 +115,12 @@
#define CONFIG_BETTER_HW_COMPATIBILITY 0
#define CONFIG_EXPERIMENTAL 0
#define CONFIG_SIZE_LIMIT 0
#define CONFIG_SPATIAL_SVC 0
#define CONFIG_ALWAYS_ADJUST_BPM 0
#define CONFIG_BITSTREAM_DEBUG 0
#define CONFIG_MISMATCH_DEBUG 0
#define CONFIG_FP_MB_STATS 0
#define CONFIG_EMULATE_HARDWARE 0
#define CONFIG_MISC_FIXES 0
#define CONFIG_NON_GREEDY_MV 0
#define CONFIG_RATE_CTRL 0
#define CONFIG_COLLECT_COMPONENT_TIMING 0
#endif /* VPX_CONFIG_H */
+7 -3
@@ -1,7 +1,11 @@
// This file is generated. Do not edit.
#ifndef VPX_VERSION_H_
#define VPX_VERSION_H_
#define VERSION_MAJOR 1
#define VERSION_MINOR 6
#define VERSION_MINOR 15
#define VERSION_PATCH 1
#define VERSION_EXTRA ""
#define VERSION_PACKED ((VERSION_MAJOR<<16)|(VERSION_MINOR<<8)|(VERSION_PATCH))
#define VERSION_STRING_NOSP "v1.6.1"
#define VERSION_STRING " v1.6.1"
#define VERSION_STRING_NOSP "v1.15.1"
#define VERSION_STRING " v1.15.1"
#endif // VPX_VERSION_H_
+100 -106
@@ -1,3 +1,14 @@
/*
* Copyright (c) 2025 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef VP8_RTCD_H_
#define VP8_RTCD_H_
@@ -26,57 +37,48 @@ struct yv12_buffer_config;
extern "C" {
#endif
void vp8_bilinear_predict16x16_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict16x16_sse2(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict16x16_ssse3(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_bilinear_predict16x16)(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict16x16_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict16x16_sse2(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict16x16_ssse3(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_bilinear_predict16x16)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict4x4_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict4x4_mmx(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_bilinear_predict4x4)(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict4x4_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict4x4_sse2(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_bilinear_predict4x4)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict8x4_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict8x4_mmx(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_bilinear_predict8x4)(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict8x4_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict8x4_sse2(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_bilinear_predict8x4)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict8x8_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict8x8_sse2(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict8x8_ssse3(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_bilinear_predict8x8)(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_blend_b_c(unsigned char *y, unsigned char *u, unsigned char *v, int y1, int u1, int v1, int alpha, int stride);
#define vp8_blend_b vp8_blend_b_c
void vp8_blend_mb_inner_c(unsigned char *y, unsigned char *u, unsigned char *v, int y1, int u1, int v1, int alpha, int stride);
#define vp8_blend_mb_inner vp8_blend_mb_inner_c
void vp8_blend_mb_outer_c(unsigned char *y, unsigned char *u, unsigned char *v, int y1, int u1, int v1, int alpha, int stride);
#define vp8_blend_mb_outer vp8_blend_mb_outer_c
void vp8_bilinear_predict8x8_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict8x8_sse2(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict8x8_ssse3(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_bilinear_predict8x8)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
int vp8_block_error_c(short *coeff, short *dqcoeff);
int vp8_block_error_sse2(short *coeff, short *dqcoeff);
RTCD_EXTERN int (*vp8_block_error)(short *coeff, short *dqcoeff);
void vp8_copy32xn_c(const unsigned char *src_ptr, int source_stride, unsigned char *dst_ptr, int dst_stride, int n);
void vp8_copy32xn_sse2(const unsigned char *src_ptr, int source_stride, unsigned char *dst_ptr, int dst_stride, int n);
void vp8_copy32xn_sse3(const unsigned char *src_ptr, int source_stride, unsigned char *dst_ptr, int dst_stride, int n);
RTCD_EXTERN void (*vp8_copy32xn)(const unsigned char *src_ptr, int source_stride, unsigned char *dst_ptr, int dst_stride, int n);
void vp8_copy32xn_c(const unsigned char *src_ptr, int src_stride, unsigned char *dst_ptr, int dst_stride, int height);
void vp8_copy32xn_sse2(const unsigned char *src_ptr, int src_stride, unsigned char *dst_ptr, int dst_stride, int height);
void vp8_copy32xn_sse3(const unsigned char *src_ptr, int src_stride, unsigned char *dst_ptr, int dst_stride, int height);
RTCD_EXTERN void (*vp8_copy32xn)(const unsigned char *src_ptr, int src_stride, unsigned char *dst_ptr, int dst_stride, int height);
void vp8_copy_mem16x16_c(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch);
void vp8_copy_mem16x16_sse2(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_copy_mem16x16)(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch);
void vp8_copy_mem16x16_c(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride);
void vp8_copy_mem16x16_sse2(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride);
RTCD_EXTERN void (*vp8_copy_mem16x16)(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride);
void vp8_copy_mem8x4_c(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch);
void vp8_copy_mem8x4_mmx(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_copy_mem8x4)(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch);
void vp8_copy_mem8x4_c(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride);
void vp8_copy_mem8x4_mmx(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride);
RTCD_EXTERN void (*vp8_copy_mem8x4)(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride);
void vp8_copy_mem8x8_c(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch);
void vp8_copy_mem8x8_mmx(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_copy_mem8x8)(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch);
void vp8_copy_mem8x8_c(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride);
void vp8_copy_mem8x8_mmx(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride);
RTCD_EXTERN void (*vp8_copy_mem8x8)(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride);
void vp8_dc_only_idct_add_c(short input, unsigned char *pred, int pred_stride, unsigned char *dst, int dst_stride);
void vp8_dc_only_idct_add_mmx(short input, unsigned char *pred, int pred_stride, unsigned char *dst, int dst_stride);
RTCD_EXTERN void (*vp8_dc_only_idct_add)(short input, unsigned char *pred, int pred_stride, unsigned char *dst, int dst_stride);
void vp8_dc_only_idct_add_c(short input_dc, unsigned char *pred_ptr, int pred_stride, unsigned char *dst_ptr, int dst_stride);
void vp8_dc_only_idct_add_mmx(short input_dc, unsigned char *pred_ptr, int pred_stride, unsigned char *dst_ptr, int dst_stride);
RTCD_EXTERN void (*vp8_dc_only_idct_add)(short input_dc, unsigned char *pred_ptr, int pred_stride, unsigned char *dst_ptr, int dst_stride);
int vp8_denoiser_filter_c(unsigned char *mc_running_avg_y, int mc_avg_y_stride, unsigned char *running_avg_y, int avg_y_stride, unsigned char *sig, int sig_stride, unsigned int motion_magnitude, int increase_denoising);
int vp8_denoiser_filter_sse2(unsigned char *mc_running_avg_y, int mc_avg_y_stride, unsigned char *running_avg_y, int avg_y_stride, unsigned char *sig, int sig_stride, unsigned int motion_magnitude, int increase_denoising);
@@ -86,9 +88,9 @@ int vp8_denoiser_filter_uv_c(unsigned char *mc_running_avg, int mc_avg_stride, u
int vp8_denoiser_filter_uv_sse2(unsigned char *mc_running_avg, int mc_avg_stride, unsigned char *running_avg, int avg_stride, unsigned char *sig, int sig_stride, unsigned int motion_magnitude, int increase_denoising);
RTCD_EXTERN int (*vp8_denoiser_filter_uv)(unsigned char *mc_running_avg, int mc_avg_stride, unsigned char *running_avg, int avg_stride, unsigned char *sig, int sig_stride, unsigned int motion_magnitude, int increase_denoising);
void vp8_dequant_idct_add_c(short *input, short *dq, unsigned char *output, int stride);
void vp8_dequant_idct_add_mmx(short *input, short *dq, unsigned char *output, int stride);
RTCD_EXTERN void (*vp8_dequant_idct_add)(short *input, short *dq, unsigned char *output, int stride);
void vp8_dequant_idct_add_c(short *input, short *dq, unsigned char *dest, int stride);
void vp8_dequant_idct_add_mmx(short *input, short *dq, unsigned char *dest, int stride);
RTCD_EXTERN void (*vp8_dequant_idct_add)(short *input, short *dq, unsigned char *dest, int stride);
void vp8_dequant_idct_add_uv_block_c(short *q, short *dq, unsigned char *dst_u, unsigned char *dst_v, int stride, char *eobs);
void vp8_dequant_idct_add_uv_block_sse2(short *q, short *dq, unsigned char *dst_u, unsigned char *dst_v, int stride, char *eobs);
@@ -98,9 +100,9 @@ void vp8_dequant_idct_add_y_block_c(short *q, short *dq, unsigned char *dst, int
void vp8_dequant_idct_add_y_block_sse2(short *q, short *dq, unsigned char *dst, int stride, char *eobs);
RTCD_EXTERN void (*vp8_dequant_idct_add_y_block)(short *q, short *dq, unsigned char *dst, int stride, char *eobs);
void vp8_dequantize_b_c(struct blockd*, short *dqc);
void vp8_dequantize_b_mmx(struct blockd*, short *dqc);
RTCD_EXTERN void (*vp8_dequantize_b)(struct blockd*, short *dqc);
void vp8_dequantize_b_c(struct blockd*, short *DQC);
void vp8_dequantize_b_mmx(struct blockd*, short *DQC);
RTCD_EXTERN void (*vp8_dequantize_b)(struct blockd*, short *DQC);
int vp8_diamond_search_sad_c(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, union int_mv *best_mv, int search_param, int sad_per_bit, int *num00, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
int vp8_diamond_search_sadx4(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, union int_mv *best_mv, int search_param, int sad_per_bit, int *num00, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
@@ -122,42 +124,37 @@ void vp8_filter_by_weight8x8_c(unsigned char *src, int src_stride, unsigned char
void vp8_filter_by_weight8x8_sse2(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride, int src_weight);
RTCD_EXTERN void (*vp8_filter_by_weight8x8)(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride, int src_weight);
int vp8_full_search_sad_c(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int sad_per_bit, int distance, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
int vp8_full_search_sadx3(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int sad_per_bit, int distance, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
int vp8_full_search_sadx8(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int sad_per_bit, int distance, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
RTCD_EXTERN int (*vp8_full_search_sad)(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int sad_per_bit, int distance, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
void vp8_loop_filter_bh_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_bh_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
RTCD_EXTERN void (*vp8_loop_filter_bh)(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_bh_c(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_bh_sse2(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
RTCD_EXTERN void (*vp8_loop_filter_bh)(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_bv_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_bv_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
RTCD_EXTERN void (*vp8_loop_filter_bv)(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_bv_c(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_bv_sse2(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
RTCD_EXTERN void (*vp8_loop_filter_bv)(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_mbh_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_mbh_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
RTCD_EXTERN void (*vp8_loop_filter_mbh)(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_mbh_c(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_mbh_sse2(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
RTCD_EXTERN void (*vp8_loop_filter_mbh)(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_mbv_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_mbv_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
RTCD_EXTERN void (*vp8_loop_filter_mbv)(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_mbv_c(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_mbv_sse2(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
RTCD_EXTERN void (*vp8_loop_filter_mbv)(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_bhs_c(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
void vp8_loop_filter_bhs_sse2(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
RTCD_EXTERN void (*vp8_loop_filter_simple_bh)(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
void vp8_loop_filter_bhs_c(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_bhs_sse2(unsigned char *y, int ystride, const unsigned char *blimit);
RTCD_EXTERN void (*vp8_loop_filter_simple_bh)(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_bvs_c(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
void vp8_loop_filter_bvs_sse2(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
RTCD_EXTERN void (*vp8_loop_filter_simple_bv)(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
void vp8_loop_filter_bvs_c(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_bvs_sse2(unsigned char *y, int ystride, const unsigned char *blimit);
RTCD_EXTERN void (*vp8_loop_filter_simple_bv)(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_simple_horizontal_edge_c(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
void vp8_loop_filter_simple_horizontal_edge_sse2(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
RTCD_EXTERN void (*vp8_loop_filter_simple_mbh)(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
void vp8_loop_filter_simple_horizontal_edge_c(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_simple_horizontal_edge_sse2(unsigned char *y, int ystride, const unsigned char *blimit);
RTCD_EXTERN void (*vp8_loop_filter_simple_mbh)(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_simple_vertical_edge_c(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_simple_vertical_edge_sse2(unsigned char *y, int ystride, const unsigned char *blimit);
RTCD_EXTERN void (*vp8_loop_filter_simple_mbv)(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_simple_vertical_edge_c(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
void vp8_loop_filter_simple_vertical_edge_sse2(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
RTCD_EXTERN void (*vp8_loop_filter_simple_mbv)(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
int vp8_mbblock_error_c(struct macroblock *mb, int dc);
int vp8_mbblock_error_sse2(struct macroblock *mb, int dc);
@@ -167,9 +164,9 @@ int vp8_mbuverror_c(struct macroblock *mb);
int vp8_mbuverror_sse2(struct macroblock *mb);
RTCD_EXTERN int (*vp8_mbuverror)(struct macroblock *mb);
int vp8_refining_search_sad_c(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int sad_per_bit, int distance, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
int vp8_refining_search_sadx4(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int sad_per_bit, int distance, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
RTCD_EXTERN int (*vp8_refining_search_sad)(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int sad_per_bit, int distance, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
int vp8_refining_search_sad_c(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int error_per_bit, int search_range, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
int vp8_refining_search_sadx4(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int error_per_bit, int search_range, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
RTCD_EXTERN int (*vp8_refining_search_sad)(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int error_per_bit, int search_range, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
void vp8_regular_quantize_b_c(struct block *, struct blockd *);
void vp8_regular_quantize_b_sse2(struct block *, struct blockd *);
@@ -184,40 +181,40 @@ void vp8_short_fdct8x4_c(short *input, short *output, int pitch);
void vp8_short_fdct8x4_sse2(short *input, short *output, int pitch);
RTCD_EXTERN void (*vp8_short_fdct8x4)(short *input, short *output, int pitch);
void vp8_short_idct4x4llm_c(short *input, unsigned char *pred, int pitch, unsigned char *dst, int dst_stride);
void vp8_short_idct4x4llm_mmx(short *input, unsigned char *pred, int pitch, unsigned char *dst, int dst_stride);
RTCD_EXTERN void (*vp8_short_idct4x4llm)(short *input, unsigned char *pred, int pitch, unsigned char *dst, int dst_stride);
void vp8_short_idct4x4llm_c(short *input, unsigned char *pred_ptr, int pred_stride, unsigned char *dst_ptr, int dst_stride);
void vp8_short_idct4x4llm_mmx(short *input, unsigned char *pred_ptr, int pred_stride, unsigned char *dst_ptr, int dst_stride);
RTCD_EXTERN void (*vp8_short_idct4x4llm)(short *input, unsigned char *pred_ptr, int pred_stride, unsigned char *dst_ptr, int dst_stride);
void vp8_short_inv_walsh4x4_c(short *input, short *output);
void vp8_short_inv_walsh4x4_sse2(short *input, short *output);
RTCD_EXTERN void (*vp8_short_inv_walsh4x4)(short *input, short *output);
void vp8_short_inv_walsh4x4_c(short *input, short *mb_dqcoeff);
void vp8_short_inv_walsh4x4_sse2(short *input, short *mb_dqcoeff);
RTCD_EXTERN void (*vp8_short_inv_walsh4x4)(short *input, short *mb_dqcoeff);
void vp8_short_inv_walsh4x4_1_c(short *input, short *output);
void vp8_short_inv_walsh4x4_1_c(short *input, short *mb_dqcoeff);
#define vp8_short_inv_walsh4x4_1 vp8_short_inv_walsh4x4_1_c
void vp8_short_walsh4x4_c(short *input, short *output, int pitch);
void vp8_short_walsh4x4_sse2(short *input, short *output, int pitch);
RTCD_EXTERN void (*vp8_short_walsh4x4)(short *input, short *output, int pitch);
void vp8_sixtap_predict16x16_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict16x16_sse2(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict16x16_ssse3(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict16x16)(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict16x16_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict16x16_sse2(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict16x16_ssse3(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict16x16)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict4x4_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict4x4_mmx(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict4x4_ssse3(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict4x4)(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict4x4_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict4x4_mmx(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict4x4_ssse3(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict4x4)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict8x4_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict8x4_sse2(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict8x4_ssse3(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict8x4)(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict8x4_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict8x4_sse2(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict8x4_ssse3(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict8x4)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict8x8_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict8x8_sse2(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict8x8_ssse3(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict8x8)(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict8x8_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict8x8_sse2(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict8x8_ssse3(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict8x8)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_temporal_filter_apply_c(unsigned char *frame1, unsigned int stride, unsigned char *frame2, unsigned int block_size, int strength, int filter_weight, unsigned int *accumulator, unsigned short *count);
void vp8_temporal_filter_apply_sse2(unsigned char *frame1, unsigned int stride, unsigned char *frame2, unsigned int block_size, int strength, int filter_weight, unsigned int *accumulator, unsigned short *count);
@@ -237,9 +234,9 @@ static void setup_rtcd_internal(void)
if (flags & HAS_SSE2) vp8_bilinear_predict16x16 = vp8_bilinear_predict16x16_sse2;
if (flags & HAS_SSSE3) vp8_bilinear_predict16x16 = vp8_bilinear_predict16x16_ssse3;
vp8_bilinear_predict4x4 = vp8_bilinear_predict4x4_c;
if (flags & HAS_MMX) vp8_bilinear_predict4x4 = vp8_bilinear_predict4x4_mmx;
if (flags & HAS_SSE2) vp8_bilinear_predict4x4 = vp8_bilinear_predict4x4_sse2;
vp8_bilinear_predict8x4 = vp8_bilinear_predict8x4_c;
if (flags & HAS_MMX) vp8_bilinear_predict8x4 = vp8_bilinear_predict8x4_mmx;
if (flags & HAS_SSE2) vp8_bilinear_predict8x4 = vp8_bilinear_predict8x4_sse2;
vp8_bilinear_predict8x8 = vp8_bilinear_predict8x8_c;
if (flags & HAS_SSE2) vp8_bilinear_predict8x8 = vp8_bilinear_predict8x8_sse2;
if (flags & HAS_SSSE3) vp8_bilinear_predict8x8 = vp8_bilinear_predict8x8_ssse3;
@@ -277,9 +274,6 @@ static void setup_rtcd_internal(void)
if (flags & HAS_SSE2) vp8_filter_by_weight16x16 = vp8_filter_by_weight16x16_sse2;
vp8_filter_by_weight8x8 = vp8_filter_by_weight8x8_c;
if (flags & HAS_SSE2) vp8_filter_by_weight8x8 = vp8_filter_by_weight8x8_sse2;
vp8_full_search_sad = vp8_full_search_sad_c;
if (flags & HAS_SSE3) vp8_full_search_sad = vp8_full_search_sadx3;
if (flags & HAS_SSE4_1) vp8_full_search_sad = vp8_full_search_sadx8;
vp8_loop_filter_bh = vp8_loop_filter_bh_c;
if (flags & HAS_SSE2) vp8_loop_filter_bh = vp8_loop_filter_bh_sse2;
vp8_loop_filter_bv = vp8_loop_filter_bv_c;
@@ -336,4 +330,4 @@ static void setup_rtcd_internal(void)
} // extern "C"
#endif
#endif
#endif // VP8_RTCD_H_
+93 -52
@@ -1,3 +1,14 @@
/*
* Copyright (c) 2025 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef VP9_RTCD_H_
#define VP9_RTCD_H_
@@ -14,12 +25,15 @@
#include "vpx/vpx_integer.h"
#include "vp9/common/vp9_common.h"
#include "vp9/common/vp9_enums.h"
#include "vp9/common/vp9_filter.h"
struct macroblockd;
/* Encoder forward decls */
struct macroblock;
struct vp9_variance_vtable;
struct macroblock_plane;
struct vp9_sad_table;
struct ScanOrder;
struct search_site_config;
struct mv;
union int_mv;
@@ -29,16 +43,22 @@ struct yv12_buffer_config;
extern "C" {
#endif
void vp9_apply_temporal_filter_c(const uint8_t *y_src, int y_src_stride, const uint8_t *y_pre, int y_pre_stride, const uint8_t *u_src, const uint8_t *v_src, int uv_src_stride, const uint8_t *u_pre, const uint8_t *v_pre, int uv_pre_stride, unsigned int block_width, unsigned int block_height, int ss_x, int ss_y, int strength, const int *const blk_fw, int use_32x32, uint32_t *y_accumulator, uint16_t *y_count, uint32_t *u_accumulator, uint16_t *u_count, uint32_t *v_accumulator, uint16_t *v_count);
void vp9_apply_temporal_filter_sse4_1(const uint8_t *y_src, int y_src_stride, const uint8_t *y_pre, int y_pre_stride, const uint8_t *u_src, const uint8_t *v_src, int uv_src_stride, const uint8_t *u_pre, const uint8_t *v_pre, int uv_pre_stride, unsigned int block_width, unsigned int block_height, int ss_x, int ss_y, int strength, const int *const blk_fw, int use_32x32, uint32_t *y_accumulator, uint16_t *y_count, uint32_t *u_accumulator, uint16_t *u_count, uint32_t *v_accumulator, uint16_t *v_count);
RTCD_EXTERN void (*vp9_apply_temporal_filter)(const uint8_t *y_src, int y_src_stride, const uint8_t *y_pre, int y_pre_stride, const uint8_t *u_src, const uint8_t *v_src, int uv_src_stride, const uint8_t *u_pre, const uint8_t *v_pre, int uv_pre_stride, unsigned int block_width, unsigned int block_height, int ss_x, int ss_y, int strength, const int *const blk_fw, int use_32x32, uint32_t *y_accumulator, uint16_t *y_count, uint32_t *u_accumulator, uint16_t *u_count, uint32_t *v_accumulator, uint16_t *v_count);
int64_t vp9_block_error_c(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
#define vp9_block_error vp9_block_error_c
int64_t vp9_block_error_sse2(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
int64_t vp9_block_error_avx2(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
RTCD_EXTERN int64_t (*vp9_block_error)(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
int vp9_diamond_search_sad_c(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv);
int vp9_diamond_search_sad_avx(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv);
RTCD_EXTERN int (*vp9_diamond_search_sad)(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv);
int64_t vp9_block_error_fp_c(const tran_low_t *coeff, const tran_low_t *dqcoeff, int block_size);
int64_t vp9_block_error_fp_sse2(const tran_low_t *coeff, const tran_low_t *dqcoeff, int block_size);
int64_t vp9_block_error_fp_avx2(const tran_low_t *coeff, const tran_low_t *dqcoeff, int block_size);
RTCD_EXTERN int64_t (*vp9_block_error_fp)(const tran_low_t *coeff, const tran_low_t *dqcoeff, int block_size);
void vp9_fdct8x8_quant_c(const int16_t *input, int stride, tran_low_t *coeff_ptr, intptr_t n_coeffs, int skip_block, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_fdct8x8_quant_ssse3(const int16_t *input, int stride, tran_low_t *coeff_ptr, intptr_t n_coeffs, int skip_block, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vp9_fdct8x8_quant)(const int16_t *input, int stride, tran_low_t *coeff_ptr, intptr_t n_coeffs, int skip_block, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
int vp9_diamond_search_sad_c(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, uint32_t start_mv_sad, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_sad_table *sad_fn_ptr, const struct mv *center_mv);
#define vp9_diamond_search_sad vp9_diamond_search_sad_c
void vp9_fht16x16_c(const int16_t *input, tran_low_t *output, int stride, int tx_type);
void vp9_fht16x16_sse2(const int16_t *input, tran_low_t *output, int stride, int tx_type);
@@ -52,24 +72,18 @@ void vp9_fht8x8_c(const int16_t *input, tran_low_t *output, int stride, int tx_t
void vp9_fht8x8_sse2(const int16_t *input, tran_low_t *output, int stride, int tx_type);
RTCD_EXTERN void (*vp9_fht8x8)(const int16_t *input, tran_low_t *output, int stride, int tx_type);
int vp9_full_search_sad_c(const struct macroblock *x, const struct mv *ref_mv, int sad_per_bit, int distance, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv, struct mv *best_mv);
int vp9_full_search_sadx3(const struct macroblock *x, const struct mv *ref_mv, int sad_per_bit, int distance, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv, struct mv *best_mv);
int vp9_full_search_sadx8(const struct macroblock *x, const struct mv *ref_mv, int sad_per_bit, int distance, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv, struct mv *best_mv);
RTCD_EXTERN int (*vp9_full_search_sad)(const struct macroblock *x, const struct mv *ref_mv, int sad_per_bit, int distance, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv, struct mv *best_mv);
void vp9_fwht4x4_c(const int16_t *input, tran_low_t *output, int stride);
void vp9_fwht4x4_sse2(const int16_t *input, tran_low_t *output, int stride);
RTCD_EXTERN void (*vp9_fwht4x4)(const int16_t *input, tran_low_t *output, int stride);
void vp9_highbd_apply_temporal_filter_c(const uint16_t *y_src, int y_src_stride, const uint16_t *y_pre, int y_pre_stride, const uint16_t *u_src, const uint16_t *v_src, int uv_src_stride, const uint16_t *u_pre, const uint16_t *v_pre, int uv_pre_stride, unsigned int block_width, unsigned int block_height, int ss_x, int ss_y, int strength, const int *const blk_fw, int use_32x32, uint32_t *y_accum, uint16_t *y_count, uint32_t *u_accum, uint16_t *u_count, uint32_t *v_accum, uint16_t *v_count);
void vp9_highbd_apply_temporal_filter_sse4_1(const uint16_t *y_src, int y_src_stride, const uint16_t *y_pre, int y_pre_stride, const uint16_t *u_src, const uint16_t *v_src, int uv_src_stride, const uint16_t *u_pre, const uint16_t *v_pre, int uv_pre_stride, unsigned int block_width, unsigned int block_height, int ss_x, int ss_y, int strength, const int *const blk_fw, int use_32x32, uint32_t *y_accum, uint16_t *y_count, uint32_t *u_accum, uint16_t *u_count, uint32_t *v_accum, uint16_t *v_count);
RTCD_EXTERN void (*vp9_highbd_apply_temporal_filter)(const uint16_t *y_src, int y_src_stride, const uint16_t *y_pre, int y_pre_stride, const uint16_t *u_src, const uint16_t *v_src, int uv_src_stride, const uint16_t *u_pre, const uint16_t *v_pre, int uv_pre_stride, unsigned int block_width, unsigned int block_height, int ss_x, int ss_y, int strength, const int *const blk_fw, int use_32x32, uint32_t *y_accum, uint16_t *y_count, uint32_t *u_accum, uint16_t *u_count, uint32_t *v_accum, uint16_t *v_count);
int64_t vp9_highbd_block_error_c(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz, int bd);
int64_t vp9_highbd_block_error_sse2(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz, int bd);
RTCD_EXTERN int64_t (*vp9_highbd_block_error)(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz, int bd);
int64_t vp9_highbd_block_error_8bit_c(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
int64_t vp9_highbd_block_error_8bit_sse2(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
int64_t vp9_highbd_block_error_8bit_avx(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
RTCD_EXTERN int64_t (*vp9_highbd_block_error_8bit)(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
void vp9_highbd_fht16x16_c(const int16_t *input, tran_low_t *output, int stride, int tx_type);
#define vp9_highbd_fht16x16 vp9_highbd_fht16x16_c
@@ -82,27 +96,32 @@ void vp9_highbd_fht8x8_c(const int16_t *input, tran_low_t *output, int stride, i
void vp9_highbd_fwht4x4_c(const int16_t *input, tran_low_t *output, int stride);
#define vp9_highbd_fwht4x4 vp9_highbd_fwht4x4_c
void vp9_highbd_iht16x16_256_add_c(const tran_low_t *input, uint8_t *output, int pitch, int tx_type, int bd);
#define vp9_highbd_iht16x16_256_add vp9_highbd_iht16x16_256_add_c
void vp9_highbd_iht16x16_256_add_c(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
void vp9_highbd_iht16x16_256_add_sse4_1(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
RTCD_EXTERN void (*vp9_highbd_iht16x16_256_add)(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
void vp9_highbd_iht4x4_16_add_c(const tran_low_t *input, uint8_t *dest, int stride, int tx_type, int bd);
#define vp9_highbd_iht4x4_16_add vp9_highbd_iht4x4_16_add_c
void vp9_highbd_iht4x4_16_add_c(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
void vp9_highbd_iht4x4_16_add_sse4_1(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
RTCD_EXTERN void (*vp9_highbd_iht4x4_16_add)(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
void vp9_highbd_iht8x8_64_add_c(const tran_low_t *input, uint8_t *dest, int stride, int tx_type, int bd);
#define vp9_highbd_iht8x8_64_add vp9_highbd_iht8x8_64_add_c
void vp9_highbd_iht8x8_64_add_c(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
void vp9_highbd_iht8x8_64_add_sse4_1(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
RTCD_EXTERN void (*vp9_highbd_iht8x8_64_add)(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
void vp9_highbd_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, int skip_block, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
#define vp9_highbd_quantize_fp vp9_highbd_quantize_fp_c
void vp9_highbd_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_quantize_fp_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_highbd_quantize_fp)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, int skip_block, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
#define vp9_highbd_quantize_fp_32x32 vp9_highbd_quantize_fp_32x32_c
void vp9_highbd_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_quantize_fp_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_highbd_quantize_fp_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_temporal_filter_apply_c(uint8_t *frame1, unsigned int stride, uint8_t *frame2, unsigned int block_width, unsigned int block_height, int strength, int filter_weight, unsigned int *accumulator, uint16_t *count);
void vp9_highbd_temporal_filter_apply_c(const uint8_t *frame1, unsigned int stride, const uint8_t *frame2, unsigned int block_width, unsigned int block_height, int strength, int *blk_fw, int use_32x32, uint32_t *accumulator, uint16_t *count);
#define vp9_highbd_temporal_filter_apply vp9_highbd_temporal_filter_apply_c
void vp9_iht16x16_256_add_c(const tran_low_t *input, uint8_t *output, int pitch, int tx_type);
void vp9_iht16x16_256_add_sse2(const tran_low_t *input, uint8_t *output, int pitch, int tx_type);
RTCD_EXTERN void (*vp9_iht16x16_256_add)(const tran_low_t *input, uint8_t *output, int pitch, int tx_type);
void vp9_iht16x16_256_add_c(const tran_low_t *input, uint8_t *dest, int stride, int tx_type);
void vp9_iht16x16_256_add_sse2(const tran_low_t *input, uint8_t *dest, int stride, int tx_type);
RTCD_EXTERN void (*vp9_iht16x16_256_add)(const tran_low_t *input, uint8_t *dest, int stride, int tx_type);
void vp9_iht4x4_16_add_c(const tran_low_t *input, uint8_t *dest, int stride, int tx_type);
void vp9_iht4x4_16_add_sse2(const tran_low_t *input, uint8_t *dest, int stride, int tx_type);
@@ -112,15 +131,20 @@ void vp9_iht8x8_64_add_c(const tran_low_t *input, uint8_t *dest, int stride, int
void vp9_iht8x8_64_add_sse2(const tran_low_t *input, uint8_t *dest, int stride, int tx_type);
RTCD_EXTERN void (*vp9_iht8x8_64_add)(const tran_low_t *input, uint8_t *dest, int stride, int tx_type);
void vp9_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, int skip_block, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
#define vp9_quantize_fp vp9_quantize_fp_c
void vp9_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_quantize_fp)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, int skip_block, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
#define vp9_quantize_fp_32x32 vp9_quantize_fp_32x32_c
void vp9_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_32x32_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_quantize_fp_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_temporal_filter_apply_c(uint8_t *frame1, unsigned int stride, uint8_t *frame2, unsigned int block_width, unsigned int block_height, int strength, int filter_weight, unsigned int *accumulator, uint16_t *count);
void vp9_temporal_filter_apply_sse2(uint8_t *frame1, unsigned int stride, uint8_t *frame2, unsigned int block_width, unsigned int block_height, int strength, int filter_weight, unsigned int *accumulator, uint16_t *count);
RTCD_EXTERN void (*vp9_temporal_filter_apply)(uint8_t *frame1, unsigned int stride, uint8_t *frame2, unsigned int block_width, unsigned int block_height, int strength, int filter_weight, unsigned int *accumulator, uint16_t *count);
void vp9_scale_and_extend_frame_c(const struct yv12_buffer_config *src, struct yv12_buffer_config *dst, INTERP_FILTER filter_type, int phase_scaler);
void vp9_scale_and_extend_frame_ssse3(const struct yv12_buffer_config *src, struct yv12_buffer_config *dst, INTERP_FILTER filter_type, int phase_scaler);
RTCD_EXTERN void (*vp9_scale_and_extend_frame)(const struct yv12_buffer_config *src, struct yv12_buffer_config *dst, INTERP_FILTER filter_type, int phase_scaler);
void vp9_rtcd(void);
@@ -132,34 +156,51 @@ static void setup_rtcd_internal(void)
(void)flags;
vp9_diamond_search_sad = vp9_diamond_search_sad_c;
if (flags & HAS_AVX) vp9_diamond_search_sad = vp9_diamond_search_sad_avx;
vp9_fdct8x8_quant = vp9_fdct8x8_quant_c;
if (flags & HAS_SSSE3) vp9_fdct8x8_quant = vp9_fdct8x8_quant_ssse3;
vp9_apply_temporal_filter = vp9_apply_temporal_filter_c;
if (flags & HAS_SSE4_1) vp9_apply_temporal_filter = vp9_apply_temporal_filter_sse4_1;
vp9_block_error = vp9_block_error_c;
if (flags & HAS_SSE2) vp9_block_error = vp9_block_error_sse2;
if (flags & HAS_AVX2) vp9_block_error = vp9_block_error_avx2;
vp9_block_error_fp = vp9_block_error_fp_c;
if (flags & HAS_SSE2) vp9_block_error_fp = vp9_block_error_fp_sse2;
if (flags & HAS_AVX2) vp9_block_error_fp = vp9_block_error_fp_avx2;
vp9_fht16x16 = vp9_fht16x16_c;
if (flags & HAS_SSE2) vp9_fht16x16 = vp9_fht16x16_sse2;
vp9_fht4x4 = vp9_fht4x4_c;
if (flags & HAS_SSE2) vp9_fht4x4 = vp9_fht4x4_sse2;
vp9_fht8x8 = vp9_fht8x8_c;
if (flags & HAS_SSE2) vp9_fht8x8 = vp9_fht8x8_sse2;
vp9_full_search_sad = vp9_full_search_sad_c;
if (flags & HAS_SSE3) vp9_full_search_sad = vp9_full_search_sadx3;
if (flags & HAS_SSE4_1) vp9_full_search_sad = vp9_full_search_sadx8;
vp9_fwht4x4 = vp9_fwht4x4_c;
if (flags & HAS_SSE2) vp9_fwht4x4 = vp9_fwht4x4_sse2;
vp9_highbd_apply_temporal_filter = vp9_highbd_apply_temporal_filter_c;
if (flags & HAS_SSE4_1) vp9_highbd_apply_temporal_filter = vp9_highbd_apply_temporal_filter_sse4_1;
vp9_highbd_block_error = vp9_highbd_block_error_c;
if (flags & HAS_SSE2) vp9_highbd_block_error = vp9_highbd_block_error_sse2;
vp9_highbd_block_error_8bit = vp9_highbd_block_error_8bit_c;
if (flags & HAS_SSE2) vp9_highbd_block_error_8bit = vp9_highbd_block_error_8bit_sse2;
if (flags & HAS_AVX) vp9_highbd_block_error_8bit = vp9_highbd_block_error_8bit_avx;
vp9_highbd_iht16x16_256_add = vp9_highbd_iht16x16_256_add_c;
if (flags & HAS_SSE4_1) vp9_highbd_iht16x16_256_add = vp9_highbd_iht16x16_256_add_sse4_1;
vp9_highbd_iht4x4_16_add = vp9_highbd_iht4x4_16_add_c;
if (flags & HAS_SSE4_1) vp9_highbd_iht4x4_16_add = vp9_highbd_iht4x4_16_add_sse4_1;
vp9_highbd_iht8x8_64_add = vp9_highbd_iht8x8_64_add_c;
if (flags & HAS_SSE4_1) vp9_highbd_iht8x8_64_add = vp9_highbd_iht8x8_64_add_sse4_1;
vp9_highbd_quantize_fp = vp9_highbd_quantize_fp_c;
if (flags & HAS_AVX2) vp9_highbd_quantize_fp = vp9_highbd_quantize_fp_avx2;
vp9_highbd_quantize_fp_32x32 = vp9_highbd_quantize_fp_32x32_c;
if (flags & HAS_AVX2) vp9_highbd_quantize_fp_32x32 = vp9_highbd_quantize_fp_32x32_avx2;
vp9_iht16x16_256_add = vp9_iht16x16_256_add_c;
if (flags & HAS_SSE2) vp9_iht16x16_256_add = vp9_iht16x16_256_add_sse2;
vp9_iht4x4_16_add = vp9_iht4x4_16_add_c;
if (flags & HAS_SSE2) vp9_iht4x4_16_add = vp9_iht4x4_16_add_sse2;
vp9_iht8x8_64_add = vp9_iht8x8_64_add_c;
if (flags & HAS_SSE2) vp9_iht8x8_64_add = vp9_iht8x8_64_add_sse2;
vp9_temporal_filter_apply = vp9_temporal_filter_apply_c;
if (flags & HAS_SSE2) vp9_temporal_filter_apply = vp9_temporal_filter_apply_sse2;
vp9_quantize_fp = vp9_quantize_fp_c;
if (flags & HAS_SSE2) vp9_quantize_fp = vp9_quantize_fp_sse2;
if (flags & HAS_SSSE3) vp9_quantize_fp = vp9_quantize_fp_ssse3;
if (flags & HAS_AVX2) vp9_quantize_fp = vp9_quantize_fp_avx2;
vp9_quantize_fp_32x32 = vp9_quantize_fp_32x32_c;
if (flags & HAS_SSSE3) vp9_quantize_fp_32x32 = vp9_quantize_fp_32x32_ssse3;
if (flags & HAS_AVX2) vp9_quantize_fp_32x32 = vp9_quantize_fp_32x32_avx2;
vp9_scale_and_extend_frame = vp9_scale_and_extend_frame_c;
if (flags & HAS_SSSE3) vp9_scale_and_extend_frame = vp9_scale_and_extend_frame_ssse3;
}
#endif
@@ -167,4 +208,4 @@ static void setup_rtcd_internal(void)
} // extern "C"
#endif
#endif
#endif // VP9_RTCD_H_
+1716 -1181
File diff suppressed because it is too large
+15 -1
@@ -1,3 +1,14 @@
/*
* Copyright (c) 2025 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef VPX_SCALE_RTCD_H_
#define VPX_SCALE_RTCD_H_
@@ -46,6 +57,9 @@ void vpx_extend_frame_borders_c(struct yv12_buffer_config *ybf);
void vpx_extend_frame_inner_borders_c(struct yv12_buffer_config *ybf);
#define vpx_extend_frame_inner_borders vpx_extend_frame_inner_borders_c
void vpx_yv12_copy_frame_c(const struct yv12_buffer_config *src_ybc, struct yv12_buffer_config *dst_ybc);
#define vpx_yv12_copy_frame vpx_yv12_copy_frame_c
void vpx_yv12_copy_y_c(const struct yv12_buffer_config *src_ybc, struct yv12_buffer_config *dst_ybc);
#define vpx_yv12_copy_y vpx_yv12_copy_y_c
@@ -66,4 +80,4 @@ static void setup_rtcd_internal(void)
} // extern "C"
#endif
#endif
#endif // VPX_SCALE_RTCD_H_
+81 -87
@@ -1,3 +1,14 @@
/*
* Copyright (c) 2025 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef VP8_RTCD_H_
#define VP8_RTCD_H_
@@ -26,56 +37,47 @@ struct yv12_buffer_config;
extern "C" {
#endif
void vp8_bilinear_predict16x16_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict16x16_sse2(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict16x16_ssse3(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_bilinear_predict16x16)(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict16x16_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict16x16_sse2(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict16x16_ssse3(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_bilinear_predict16x16)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict4x4_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict4x4_mmx(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
#define vp8_bilinear_predict4x4 vp8_bilinear_predict4x4_mmx
void vp8_bilinear_predict4x4_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict4x4_sse2(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
#define vp8_bilinear_predict4x4 vp8_bilinear_predict4x4_sse2
void vp8_bilinear_predict8x4_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict8x4_mmx(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
#define vp8_bilinear_predict8x4 vp8_bilinear_predict8x4_mmx
void vp8_bilinear_predict8x4_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict8x4_sse2(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
#define vp8_bilinear_predict8x4 vp8_bilinear_predict8x4_sse2
void vp8_bilinear_predict8x8_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict8x8_sse2(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_bilinear_predict8x8_ssse3(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_bilinear_predict8x8)(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_blend_b_c(unsigned char *y, unsigned char *u, unsigned char *v, int y1, int u1, int v1, int alpha, int stride);
#define vp8_blend_b vp8_blend_b_c
void vp8_blend_mb_inner_c(unsigned char *y, unsigned char *u, unsigned char *v, int y1, int u1, int v1, int alpha, int stride);
#define vp8_blend_mb_inner vp8_blend_mb_inner_c
void vp8_blend_mb_outer_c(unsigned char *y, unsigned char *u, unsigned char *v, int y1, int u1, int v1, int alpha, int stride);
#define vp8_blend_mb_outer vp8_blend_mb_outer_c
void vp8_bilinear_predict8x8_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict8x8_sse2(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_bilinear_predict8x8_ssse3(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_bilinear_predict8x8)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
int vp8_block_error_c(short *coeff, short *dqcoeff);
int vp8_block_error_sse2(short *coeff, short *dqcoeff);
#define vp8_block_error vp8_block_error_sse2
void vp8_copy32xn_c(const unsigned char *src_ptr, int source_stride, unsigned char *dst_ptr, int dst_stride, int n);
void vp8_copy32xn_sse2(const unsigned char *src_ptr, int source_stride, unsigned char *dst_ptr, int dst_stride, int n);
void vp8_copy32xn_sse3(const unsigned char *src_ptr, int source_stride, unsigned char *dst_ptr, int dst_stride, int n);
RTCD_EXTERN void (*vp8_copy32xn)(const unsigned char *src_ptr, int source_stride, unsigned char *dst_ptr, int dst_stride, int n);
void vp8_copy32xn_c(const unsigned char *src_ptr, int src_stride, unsigned char *dst_ptr, int dst_stride, int height);
void vp8_copy32xn_sse2(const unsigned char *src_ptr, int src_stride, unsigned char *dst_ptr, int dst_stride, int height);
void vp8_copy32xn_sse3(const unsigned char *src_ptr, int src_stride, unsigned char *dst_ptr, int dst_stride, int height);
RTCD_EXTERN void (*vp8_copy32xn)(const unsigned char *src_ptr, int src_stride, unsigned char *dst_ptr, int dst_stride, int height);
void vp8_copy_mem16x16_c(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch);
void vp8_copy_mem16x16_sse2(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch);
void vp8_copy_mem16x16_c(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride);
void vp8_copy_mem16x16_sse2(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride);
#define vp8_copy_mem16x16 vp8_copy_mem16x16_sse2
void vp8_copy_mem8x4_c(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch);
void vp8_copy_mem8x4_mmx(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch);
void vp8_copy_mem8x4_c(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride);
void vp8_copy_mem8x4_mmx(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride);
#define vp8_copy_mem8x4 vp8_copy_mem8x4_mmx
void vp8_copy_mem8x8_c(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch);
void vp8_copy_mem8x8_mmx(unsigned char *src, int src_pitch, unsigned char *dst, int dst_pitch);
void vp8_copy_mem8x8_c(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride);
void vp8_copy_mem8x8_mmx(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride);
#define vp8_copy_mem8x8 vp8_copy_mem8x8_mmx
void vp8_dc_only_idct_add_c(short input, unsigned char *pred, int pred_stride, unsigned char *dst, int dst_stride);
void vp8_dc_only_idct_add_mmx(short input, unsigned char *pred, int pred_stride, unsigned char *dst, int dst_stride);
void vp8_dc_only_idct_add_c(short input_dc, unsigned char *pred_ptr, int pred_stride, unsigned char *dst_ptr, int dst_stride);
void vp8_dc_only_idct_add_mmx(short input_dc, unsigned char *pred_ptr, int pred_stride, unsigned char *dst_ptr, int dst_stride);
#define vp8_dc_only_idct_add vp8_dc_only_idct_add_mmx
int vp8_denoiser_filter_c(unsigned char *mc_running_avg_y, int mc_avg_y_stride, unsigned char *running_avg_y, int avg_y_stride, unsigned char *sig, int sig_stride, unsigned int motion_magnitude, int increase_denoising);
@@ -86,8 +88,8 @@ int vp8_denoiser_filter_uv_c(unsigned char *mc_running_avg, int mc_avg_stride, u
int vp8_denoiser_filter_uv_sse2(unsigned char *mc_running_avg, int mc_avg_stride, unsigned char *running_avg, int avg_stride, unsigned char *sig, int sig_stride, unsigned int motion_magnitude, int increase_denoising);
#define vp8_denoiser_filter_uv vp8_denoiser_filter_uv_sse2
void vp8_dequant_idct_add_c(short *input, short *dq, unsigned char *output, int stride);
void vp8_dequant_idct_add_mmx(short *input, short *dq, unsigned char *output, int stride);
void vp8_dequant_idct_add_c(short *input, short *dq, unsigned char *dest, int stride);
void vp8_dequant_idct_add_mmx(short *input, short *dq, unsigned char *dest, int stride);
#define vp8_dequant_idct_add vp8_dequant_idct_add_mmx
void vp8_dequant_idct_add_uv_block_c(short *q, short *dq, unsigned char *dst_u, unsigned char *dst_v, int stride, char *eobs);
@@ -98,8 +100,8 @@ void vp8_dequant_idct_add_y_block_c(short *q, short *dq, unsigned char *dst, int
void vp8_dequant_idct_add_y_block_sse2(short *q, short *dq, unsigned char *dst, int stride, char *eobs);
#define vp8_dequant_idct_add_y_block vp8_dequant_idct_add_y_block_sse2
void vp8_dequantize_b_c(struct blockd*, short *dqc);
void vp8_dequantize_b_mmx(struct blockd*, short *dqc);
void vp8_dequantize_b_c(struct blockd*, short *DQC);
void vp8_dequantize_b_mmx(struct blockd*, short *DQC);
#define vp8_dequantize_b vp8_dequantize_b_mmx
int vp8_diamond_search_sad_c(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, union int_mv *best_mv, int search_param, int sad_per_bit, int *num00, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
@@ -122,41 +124,36 @@ void vp8_filter_by_weight8x8_c(unsigned char *src, int src_stride, unsigned char
void vp8_filter_by_weight8x8_sse2(unsigned char *src, int src_stride, unsigned char *dst, int dst_stride, int src_weight);
#define vp8_filter_by_weight8x8 vp8_filter_by_weight8x8_sse2
int vp8_full_search_sad_c(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int sad_per_bit, int distance, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
int vp8_full_search_sadx3(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int sad_per_bit, int distance, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
int vp8_full_search_sadx8(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int sad_per_bit, int distance, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
RTCD_EXTERN int (*vp8_full_search_sad)(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int sad_per_bit, int distance, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
void vp8_loop_filter_bh_c(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_bh_sse2(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_bh_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_bh_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
#define vp8_loop_filter_bh vp8_loop_filter_bh_sse2
void vp8_loop_filter_bv_c(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_bv_sse2(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_bv_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_bv_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
#define vp8_loop_filter_bv vp8_loop_filter_bv_sse2
void vp8_loop_filter_mbh_c(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_mbh_sse2(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_mbh_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_mbh_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
#define vp8_loop_filter_mbh vp8_loop_filter_mbh_sse2
void vp8_loop_filter_mbv_c(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_mbv_sse2(unsigned char *y, unsigned char *u, unsigned char *v, int ystride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_mbv_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
void vp8_loop_filter_mbv_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr, int y_stride, int uv_stride, struct loop_filter_info *lfi);
#define vp8_loop_filter_mbv vp8_loop_filter_mbv_sse2
void vp8_loop_filter_bhs_c(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_bhs_sse2(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_bhs_c(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
void vp8_loop_filter_bhs_sse2(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
#define vp8_loop_filter_simple_bh vp8_loop_filter_bhs_sse2
void vp8_loop_filter_bvs_c(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_bvs_sse2(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_bvs_c(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
void vp8_loop_filter_bvs_sse2(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
#define vp8_loop_filter_simple_bv vp8_loop_filter_bvs_sse2
void vp8_loop_filter_simple_horizontal_edge_c(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_simple_horizontal_edge_sse2(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_simple_horizontal_edge_c(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
void vp8_loop_filter_simple_horizontal_edge_sse2(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
#define vp8_loop_filter_simple_mbh vp8_loop_filter_simple_horizontal_edge_sse2
void vp8_loop_filter_simple_vertical_edge_c(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_simple_vertical_edge_sse2(unsigned char *y, int ystride, const unsigned char *blimit);
void vp8_loop_filter_simple_vertical_edge_c(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
void vp8_loop_filter_simple_vertical_edge_sse2(unsigned char *y_ptr, int y_stride, const unsigned char *blimit);
#define vp8_loop_filter_simple_mbv vp8_loop_filter_simple_vertical_edge_sse2
int vp8_mbblock_error_c(struct macroblock *mb, int dc);
@@ -167,8 +164,8 @@ int vp8_mbuverror_c(struct macroblock *mb);
int vp8_mbuverror_sse2(struct macroblock *mb);
#define vp8_mbuverror vp8_mbuverror_sse2
int vp8_refining_search_sad_c(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int sad_per_bit, int distance, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
int vp8_refining_search_sadx4(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int sad_per_bit, int distance, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
int vp8_refining_search_sad_c(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int error_per_bit, int search_range, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
int vp8_refining_search_sadx4(struct macroblock *x, struct block *b, struct blockd *d, union int_mv *ref_mv, int error_per_bit, int search_range, struct variance_vtable *fn_ptr, int *mvcost[2], union int_mv *center_mv);
#define vp8_refining_search_sad vp8_refining_search_sadx4
void vp8_regular_quantize_b_c(struct block *, struct blockd *);
@@ -184,40 +181,40 @@ void vp8_short_fdct8x4_c(short *input, short *output, int pitch);
void vp8_short_fdct8x4_sse2(short *input, short *output, int pitch);
#define vp8_short_fdct8x4 vp8_short_fdct8x4_sse2
void vp8_short_idct4x4llm_c(short *input, unsigned char *pred, int pitch, unsigned char *dst, int dst_stride);
void vp8_short_idct4x4llm_mmx(short *input, unsigned char *pred, int pitch, unsigned char *dst, int dst_stride);
void vp8_short_idct4x4llm_c(short *input, unsigned char *pred_ptr, int pred_stride, unsigned char *dst_ptr, int dst_stride);
void vp8_short_idct4x4llm_mmx(short *input, unsigned char *pred_ptr, int pred_stride, unsigned char *dst_ptr, int dst_stride);
#define vp8_short_idct4x4llm vp8_short_idct4x4llm_mmx
void vp8_short_inv_walsh4x4_c(short *input, short *output);
void vp8_short_inv_walsh4x4_sse2(short *input, short *output);
void vp8_short_inv_walsh4x4_c(short *input, short *mb_dqcoeff);
void vp8_short_inv_walsh4x4_sse2(short *input, short *mb_dqcoeff);
#define vp8_short_inv_walsh4x4 vp8_short_inv_walsh4x4_sse2
void vp8_short_inv_walsh4x4_1_c(short *input, short *output);
void vp8_short_inv_walsh4x4_1_c(short *input, short *mb_dqcoeff);
#define vp8_short_inv_walsh4x4_1 vp8_short_inv_walsh4x4_1_c
void vp8_short_walsh4x4_c(short *input, short *output, int pitch);
void vp8_short_walsh4x4_sse2(short *input, short *output, int pitch);
#define vp8_short_walsh4x4 vp8_short_walsh4x4_sse2
void vp8_sixtap_predict16x16_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict16x16_sse2(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict16x16_ssse3(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict16x16)(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict16x16_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict16x16_sse2(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict16x16_ssse3(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict16x16)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict4x4_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict4x4_mmx(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict4x4_ssse3(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict4x4)(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict4x4_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict4x4_mmx(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict4x4_ssse3(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict4x4)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict8x4_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict8x4_sse2(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict8x4_ssse3(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict8x4)(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict8x4_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict8x4_sse2(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict8x4_ssse3(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict8x4)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict8x8_c(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict8x8_sse2(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict8x8_ssse3(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict8x8)(unsigned char *src, int src_pitch, int xofst, int yofst, unsigned char *dst, int dst_pitch);
void vp8_sixtap_predict8x8_c(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict8x8_sse2(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_sixtap_predict8x8_ssse3(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_sixtap_predict8x8)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_temporal_filter_apply_c(unsigned char *frame1, unsigned int stride, unsigned char *frame2, unsigned int block_size, int strength, int filter_weight, unsigned int *accumulator, unsigned short *count);
void vp8_temporal_filter_apply_sse2(unsigned char *frame1, unsigned int stride, unsigned char *frame2, unsigned int block_size, int strength, int filter_weight, unsigned int *accumulator, unsigned short *count);
@@ -241,9 +238,6 @@ static void setup_rtcd_internal(void)
if (flags & HAS_SSE3) vp8_copy32xn = vp8_copy32xn_sse3;
vp8_fast_quantize_b = vp8_fast_quantize_b_sse2;
if (flags & HAS_SSSE3) vp8_fast_quantize_b = vp8_fast_quantize_b_ssse3;
vp8_full_search_sad = vp8_full_search_sad_c;
if (flags & HAS_SSE3) vp8_full_search_sad = vp8_full_search_sadx3;
if (flags & HAS_SSE4_1) vp8_full_search_sad = vp8_full_search_sadx8;
vp8_regular_quantize_b = vp8_regular_quantize_b_sse2;
if (flags & HAS_SSE4_1) vp8_regular_quantize_b = vp8_regular_quantize_b_sse4_1;
vp8_sixtap_predict16x16 = vp8_sixtap_predict16x16_sse2;
@@ -261,4 +255,4 @@ static void setup_rtcd_internal(void)
} // extern "C"
#endif
#endif
#endif // VP8_RTCD_H_
+89 -48
@@ -1,3 +1,14 @@
/*
* Copyright (c) 2025 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef VP9_RTCD_H_
#define VP9_RTCD_H_
@@ -14,12 +25,15 @@
#include "vpx/vpx_integer.h"
#include "vp9/common/vp9_common.h"
#include "vp9/common/vp9_enums.h"
#include "vp9/common/vp9_filter.h"
struct macroblockd;
/* Encoder forward decls */
struct macroblock;
struct vp9_variance_vtable;
struct macroblock_plane;
struct vp9_sad_table;
struct ScanOrder;
struct search_site_config;
struct mv;
union int_mv;
@@ -29,16 +43,22 @@ struct yv12_buffer_config;
extern "C" {
#endif
void vp9_apply_temporal_filter_c(const uint8_t *y_src, int y_src_stride, const uint8_t *y_pre, int y_pre_stride, const uint8_t *u_src, const uint8_t *v_src, int uv_src_stride, const uint8_t *u_pre, const uint8_t *v_pre, int uv_pre_stride, unsigned int block_width, unsigned int block_height, int ss_x, int ss_y, int strength, const int *const blk_fw, int use_32x32, uint32_t *y_accumulator, uint16_t *y_count, uint32_t *u_accumulator, uint16_t *u_count, uint32_t *v_accumulator, uint16_t *v_count);
void vp9_apply_temporal_filter_sse4_1(const uint8_t *y_src, int y_src_stride, const uint8_t *y_pre, int y_pre_stride, const uint8_t *u_src, const uint8_t *v_src, int uv_src_stride, const uint8_t *u_pre, const uint8_t *v_pre, int uv_pre_stride, unsigned int block_width, unsigned int block_height, int ss_x, int ss_y, int strength, const int *const blk_fw, int use_32x32, uint32_t *y_accumulator, uint16_t *y_count, uint32_t *u_accumulator, uint16_t *u_count, uint32_t *v_accumulator, uint16_t *v_count);
RTCD_EXTERN void (*vp9_apply_temporal_filter)(const uint8_t *y_src, int y_src_stride, const uint8_t *y_pre, int y_pre_stride, const uint8_t *u_src, const uint8_t *v_src, int uv_src_stride, const uint8_t *u_pre, const uint8_t *v_pre, int uv_pre_stride, unsigned int block_width, unsigned int block_height, int ss_x, int ss_y, int strength, const int *const blk_fw, int use_32x32, uint32_t *y_accumulator, uint16_t *y_count, uint32_t *u_accumulator, uint16_t *u_count, uint32_t *v_accumulator, uint16_t *v_count);
int64_t vp9_block_error_c(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
#define vp9_block_error vp9_block_error_c
int64_t vp9_block_error_sse2(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
int64_t vp9_block_error_avx2(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
RTCD_EXTERN int64_t (*vp9_block_error)(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
int vp9_diamond_search_sad_c(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv);
int vp9_diamond_search_sad_avx(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv);
RTCD_EXTERN int (*vp9_diamond_search_sad)(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv);
int64_t vp9_block_error_fp_c(const tran_low_t *coeff, const tran_low_t *dqcoeff, int block_size);
int64_t vp9_block_error_fp_sse2(const tran_low_t *coeff, const tran_low_t *dqcoeff, int block_size);
int64_t vp9_block_error_fp_avx2(const tran_low_t *coeff, const tran_low_t *dqcoeff, int block_size);
RTCD_EXTERN int64_t (*vp9_block_error_fp)(const tran_low_t *coeff, const tran_low_t *dqcoeff, int block_size);
void vp9_fdct8x8_quant_c(const int16_t *input, int stride, tran_low_t *coeff_ptr, intptr_t n_coeffs, int skip_block, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_fdct8x8_quant_ssse3(const int16_t *input, int stride, tran_low_t *coeff_ptr, intptr_t n_coeffs, int skip_block, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vp9_fdct8x8_quant)(const int16_t *input, int stride, tran_low_t *coeff_ptr, intptr_t n_coeffs, int skip_block, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
int vp9_diamond_search_sad_c(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, uint32_t start_mv_sad, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_sad_table *sad_fn_ptr, const struct mv *center_mv);
#define vp9_diamond_search_sad vp9_diamond_search_sad_c
void vp9_fht16x16_c(const int16_t *input, tran_low_t *output, int stride, int tx_type);
void vp9_fht16x16_sse2(const int16_t *input, tran_low_t *output, int stride, int tx_type);
@@ -52,24 +72,18 @@ void vp9_fht8x8_c(const int16_t *input, tran_low_t *output, int stride, int tx_t
void vp9_fht8x8_sse2(const int16_t *input, tran_low_t *output, int stride, int tx_type);
#define vp9_fht8x8 vp9_fht8x8_sse2
int vp9_full_search_sad_c(const struct macroblock *x, const struct mv *ref_mv, int sad_per_bit, int distance, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv, struct mv *best_mv);
int vp9_full_search_sadx3(const struct macroblock *x, const struct mv *ref_mv, int sad_per_bit, int distance, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv, struct mv *best_mv);
int vp9_full_search_sadx8(const struct macroblock *x, const struct mv *ref_mv, int sad_per_bit, int distance, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv, struct mv *best_mv);
RTCD_EXTERN int (*vp9_full_search_sad)(const struct macroblock *x, const struct mv *ref_mv, int sad_per_bit, int distance, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv, struct mv *best_mv);
void vp9_fwht4x4_c(const int16_t *input, tran_low_t *output, int stride);
void vp9_fwht4x4_sse2(const int16_t *input, tran_low_t *output, int stride);
#define vp9_fwht4x4 vp9_fwht4x4_sse2
void vp9_highbd_apply_temporal_filter_c(const uint16_t *y_src, int y_src_stride, const uint16_t *y_pre, int y_pre_stride, const uint16_t *u_src, const uint16_t *v_src, int uv_src_stride, const uint16_t *u_pre, const uint16_t *v_pre, int uv_pre_stride, unsigned int block_width, unsigned int block_height, int ss_x, int ss_y, int strength, const int *const blk_fw, int use_32x32, uint32_t *y_accum, uint16_t *y_count, uint32_t *u_accum, uint16_t *u_count, uint32_t *v_accum, uint16_t *v_count);
void vp9_highbd_apply_temporal_filter_sse4_1(const uint16_t *y_src, int y_src_stride, const uint16_t *y_pre, int y_pre_stride, const uint16_t *u_src, const uint16_t *v_src, int uv_src_stride, const uint16_t *u_pre, const uint16_t *v_pre, int uv_pre_stride, unsigned int block_width, unsigned int block_height, int ss_x, int ss_y, int strength, const int *const blk_fw, int use_32x32, uint32_t *y_accum, uint16_t *y_count, uint32_t *u_accum, uint16_t *u_count, uint32_t *v_accum, uint16_t *v_count);
RTCD_EXTERN void (*vp9_highbd_apply_temporal_filter)(const uint16_t *y_src, int y_src_stride, const uint16_t *y_pre, int y_pre_stride, const uint16_t *u_src, const uint16_t *v_src, int uv_src_stride, const uint16_t *u_pre, const uint16_t *v_pre, int uv_pre_stride, unsigned int block_width, unsigned int block_height, int ss_x, int ss_y, int strength, const int *const blk_fw, int use_32x32, uint32_t *y_accum, uint16_t *y_count, uint32_t *u_accum, uint16_t *u_count, uint32_t *v_accum, uint16_t *v_count);
int64_t vp9_highbd_block_error_c(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz, int bd);
int64_t vp9_highbd_block_error_sse2(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz, int bd);
#define vp9_highbd_block_error vp9_highbd_block_error_sse2
int64_t vp9_highbd_block_error_8bit_c(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
int64_t vp9_highbd_block_error_8bit_sse2(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
int64_t vp9_highbd_block_error_8bit_avx(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
RTCD_EXTERN int64_t (*vp9_highbd_block_error_8bit)(const tran_low_t *coeff, const tran_low_t *dqcoeff, intptr_t block_size, int64_t *ssz);
void vp9_highbd_fht16x16_c(const int16_t *input, tran_low_t *output, int stride, int tx_type);
#define vp9_highbd_fht16x16 vp9_highbd_fht16x16_c
@@ -82,26 +96,31 @@ void vp9_highbd_fht8x8_c(const int16_t *input, tran_low_t *output, int stride, i
void vp9_highbd_fwht4x4_c(const int16_t *input, tran_low_t *output, int stride);
#define vp9_highbd_fwht4x4 vp9_highbd_fwht4x4_c
void vp9_highbd_iht16x16_256_add_c(const tran_low_t *input, uint8_t *output, int pitch, int tx_type, int bd);
#define vp9_highbd_iht16x16_256_add vp9_highbd_iht16x16_256_add_c
void vp9_highbd_iht16x16_256_add_c(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
void vp9_highbd_iht16x16_256_add_sse4_1(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
RTCD_EXTERN void (*vp9_highbd_iht16x16_256_add)(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
void vp9_highbd_iht4x4_16_add_c(const tran_low_t *input, uint8_t *dest, int stride, int tx_type, int bd);
#define vp9_highbd_iht4x4_16_add vp9_highbd_iht4x4_16_add_c
void vp9_highbd_iht4x4_16_add_c(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
void vp9_highbd_iht4x4_16_add_sse4_1(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
RTCD_EXTERN void (*vp9_highbd_iht4x4_16_add)(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
void vp9_highbd_iht8x8_64_add_c(const tran_low_t *input, uint8_t *dest, int stride, int tx_type, int bd);
#define vp9_highbd_iht8x8_64_add vp9_highbd_iht8x8_64_add_c
void vp9_highbd_iht8x8_64_add_c(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
void vp9_highbd_iht8x8_64_add_sse4_1(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
RTCD_EXTERN void (*vp9_highbd_iht8x8_64_add)(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
void vp9_highbd_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, int skip_block, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
#define vp9_highbd_quantize_fp vp9_highbd_quantize_fp_c
void vp9_highbd_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_quantize_fp_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_highbd_quantize_fp)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, int skip_block, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
#define vp9_highbd_quantize_fp_32x32 vp9_highbd_quantize_fp_32x32_c
void vp9_highbd_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_quantize_fp_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_highbd_quantize_fp_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_temporal_filter_apply_c(uint8_t *frame1, unsigned int stride, uint8_t *frame2, unsigned int block_width, unsigned int block_height, int strength, int filter_weight, unsigned int *accumulator, uint16_t *count);
void vp9_highbd_temporal_filter_apply_c(const uint8_t *frame1, unsigned int stride, const uint8_t *frame2, unsigned int block_width, unsigned int block_height, int strength, int *blk_fw, int use_32x32, uint32_t *accumulator, uint16_t *count);
#define vp9_highbd_temporal_filter_apply vp9_highbd_temporal_filter_apply_c
void vp9_iht16x16_256_add_c(const tran_low_t *input, uint8_t *output, int pitch, int tx_type);
void vp9_iht16x16_256_add_sse2(const tran_low_t *input, uint8_t *output, int pitch, int tx_type);
void vp9_iht16x16_256_add_c(const tran_low_t *input, uint8_t *dest, int stride, int tx_type);
void vp9_iht16x16_256_add_sse2(const tran_low_t *input, uint8_t *dest, int stride, int tx_type);
#define vp9_iht16x16_256_add vp9_iht16x16_256_add_sse2
void vp9_iht4x4_16_add_c(const tran_low_t *input, uint8_t *dest, int stride, int tx_type);
@@ -112,15 +131,20 @@ void vp9_iht8x8_64_add_c(const tran_low_t *input, uint8_t *dest, int stride, int
void vp9_iht8x8_64_add_sse2(const tran_low_t *input, uint8_t *dest, int stride, int tx_type);
#define vp9_iht8x8_64_add vp9_iht8x8_64_add_sse2
void vp9_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, int skip_block, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
#define vp9_quantize_fp vp9_quantize_fp_c
void vp9_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_quantize_fp)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, int skip_block, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
#define vp9_quantize_fp_32x32 vp9_quantize_fp_32x32_c
void vp9_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_32x32_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_quantize_fp_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_temporal_filter_apply_c(uint8_t *frame1, unsigned int stride, uint8_t *frame2, unsigned int block_width, unsigned int block_height, int strength, int filter_weight, unsigned int *accumulator, uint16_t *count);
void vp9_temporal_filter_apply_sse2(uint8_t *frame1, unsigned int stride, uint8_t *frame2, unsigned int block_width, unsigned int block_height, int strength, int filter_weight, unsigned int *accumulator, uint16_t *count);
#define vp9_temporal_filter_apply vp9_temporal_filter_apply_sse2
void vp9_scale_and_extend_frame_c(const struct yv12_buffer_config *src, struct yv12_buffer_config *dst, INTERP_FILTER filter_type, int phase_scaler);
void vp9_scale_and_extend_frame_ssse3(const struct yv12_buffer_config *src, struct yv12_buffer_config *dst, INTERP_FILTER filter_type, int phase_scaler);
RTCD_EXTERN void (*vp9_scale_and_extend_frame)(const struct yv12_buffer_config *src, struct yv12_buffer_config *dst, INTERP_FILTER filter_type, int phase_scaler);
void vp9_rtcd(void);
@@ -132,15 +156,32 @@ static void setup_rtcd_internal(void)
(void)flags;
vp9_diamond_search_sad = vp9_diamond_search_sad_c;
if (flags & HAS_AVX) vp9_diamond_search_sad = vp9_diamond_search_sad_avx;
vp9_fdct8x8_quant = vp9_fdct8x8_quant_c;
if (flags & HAS_SSSE3) vp9_fdct8x8_quant = vp9_fdct8x8_quant_ssse3;
vp9_full_search_sad = vp9_full_search_sad_c;
if (flags & HAS_SSE3) vp9_full_search_sad = vp9_full_search_sadx3;
if (flags & HAS_SSE4_1) vp9_full_search_sad = vp9_full_search_sadx8;
vp9_highbd_block_error_8bit = vp9_highbd_block_error_8bit_sse2;
if (flags & HAS_AVX) vp9_highbd_block_error_8bit = vp9_highbd_block_error_8bit_avx;
vp9_apply_temporal_filter = vp9_apply_temporal_filter_c;
if (flags & HAS_SSE4_1) vp9_apply_temporal_filter = vp9_apply_temporal_filter_sse4_1;
vp9_block_error = vp9_block_error_sse2;
if (flags & HAS_AVX2) vp9_block_error = vp9_block_error_avx2;
vp9_block_error_fp = vp9_block_error_fp_sse2;
if (flags & HAS_AVX2) vp9_block_error_fp = vp9_block_error_fp_avx2;
vp9_highbd_apply_temporal_filter = vp9_highbd_apply_temporal_filter_c;
if (flags & HAS_SSE4_1) vp9_highbd_apply_temporal_filter = vp9_highbd_apply_temporal_filter_sse4_1;
vp9_highbd_iht16x16_256_add = vp9_highbd_iht16x16_256_add_c;
if (flags & HAS_SSE4_1) vp9_highbd_iht16x16_256_add = vp9_highbd_iht16x16_256_add_sse4_1;
vp9_highbd_iht4x4_16_add = vp9_highbd_iht4x4_16_add_c;
if (flags & HAS_SSE4_1) vp9_highbd_iht4x4_16_add = vp9_highbd_iht4x4_16_add_sse4_1;
vp9_highbd_iht8x8_64_add = vp9_highbd_iht8x8_64_add_c;
if (flags & HAS_SSE4_1) vp9_highbd_iht8x8_64_add = vp9_highbd_iht8x8_64_add_sse4_1;
vp9_highbd_quantize_fp = vp9_highbd_quantize_fp_c;
if (flags & HAS_AVX2) vp9_highbd_quantize_fp = vp9_highbd_quantize_fp_avx2;
vp9_highbd_quantize_fp_32x32 = vp9_highbd_quantize_fp_32x32_c;
if (flags & HAS_AVX2) vp9_highbd_quantize_fp_32x32 = vp9_highbd_quantize_fp_32x32_avx2;
vp9_quantize_fp = vp9_quantize_fp_sse2;
if (flags & HAS_SSSE3) vp9_quantize_fp = vp9_quantize_fp_ssse3;
if (flags & HAS_AVX2) vp9_quantize_fp = vp9_quantize_fp_avx2;
vp9_quantize_fp_32x32 = vp9_quantize_fp_32x32_c;
if (flags & HAS_SSSE3) vp9_quantize_fp_32x32 = vp9_quantize_fp_32x32_ssse3;
if (flags & HAS_AVX2) vp9_quantize_fp_32x32 = vp9_quantize_fp_32x32_avx2;
vp9_scale_and_extend_frame = vp9_scale_and_extend_frame_c;
if (flags & HAS_SSSE3) vp9_scale_and_extend_frame = vp9_scale_and_extend_frame_ssse3;
}
#endif
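The setup_rtcd_internal() hunk above assigns each function pointer a baseline implementation and then overwrites it for every CPU feature flag that is set, so the last matching check wins. The following is a minimal sketch of that pattern only. All names (sketch_block_error*, SKETCH_HAS_*, sketch_setup_rtcd) are hypothetical, the "SIMD" variants simply forward to the C code, and the flags are passed in by the caller rather than detected from the CPU.

```c
#include <stdint.h>

/* Hypothetical feature bits. libvpx's real HAS_* flags come from its own CPU
 * detection code, which is not reproduced here. */
#define SKETCH_HAS_SSSE3 0x1
#define SKETCH_HAS_AVX2 0x2

/* Baseline: sum of squared differences between two coefficient arrays. */
static int64_t sketch_block_error_c(const int32_t *coeff,
                                    const int32_t *dqcoeff, intptr_t n) {
  int64_t err = 0;
  intptr_t i;
  for (i = 0; i < n; ++i) {
    const int64_t d = (int64_t)coeff[i] - dqcoeff[i];
    err += d * d;
  }
  return err;
}

/* Stand-ins for optimized variants. They forward to the C version so the
 * sketch stays portable; a real build would use intrinsics or assembly. */
static int64_t sketch_block_error_ssse3(const int32_t *coeff,
                                        const int32_t *dqcoeff, intptr_t n) {
  return sketch_block_error_c(coeff, dqcoeff, n);
}

static int64_t sketch_block_error_avx2(const int32_t *coeff,
                                       const int32_t *dqcoeff, intptr_t n) {
  return sketch_block_error_c(coeff, dqcoeff, n);
}

/* The dispatch pointer that callers use. */
static int64_t (*sketch_block_error)(const int32_t *coeff,
                                     const int32_t *dqcoeff, intptr_t n);

/* Baseline first, then each more capable variant in increasing order, so the
 * most capable supported implementation is the one left in the pointer. */
static void sketch_setup_rtcd(int flags) {
  sketch_block_error = sketch_block_error_c;
  if (flags & SKETCH_HAS_SSSE3) sketch_block_error = sketch_block_error_ssse3;
  if (flags & SKETCH_HAS_AVX2) sketch_block_error = sketch_block_error_avx2;
}
```

Callers invoke sketch_block_error(...) after sketch_setup_rtcd() has run once. The order of the if statements matters: swapping the two checks would leave the SSSE3 variant selected on a machine that also supports AVX2.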
@@ -148,4 +189,4 @@ static void setup_rtcd_internal(void)
} // extern "C"
#endif
#endif
#endif // VP9_RTCD_H_
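The vp9_quantize_fp declarations in this file appear in two shapes: one taking separate zbin_ptr, round_ptr, quant_ptr, quant_shift_ptr, scan and iscan arguments, and one taking a struct macroblock_plane pointer and a struct ScanOrder pointer. The sketch below illustrates that kind of parameter consolidation under loud assumptions: the struct layouts (sketch_plane, sketch_scan_order), the function names, and the simplified arithmetic are all hypothetical and are not libvpx's actual definitions or quantizer.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the per-plane tables and the scan order. */
struct sketch_plane {
  const int16_t *round; /* [0] for the first coefficient, [1] for the rest */
  const int16_t *quant; /* Q16 multipliers, same indexing as round */
};

struct sketch_scan_order {
  const int16_t *scan; /* coefficient visiting order */
};

/* Old shape: every table is its own argument. Returns the end-of-block
 * position (index after the last non-zero quantized coefficient). */
static uint16_t sketch_quantize_old(const int32_t *coeff, intptr_t n,
                                    const int16_t *round, const int16_t *quant,
                                    int32_t *qcoeff, const int16_t *scan) {
  uint16_t eob = 0;
  intptr_t i;
  for (i = 0; i < n; ++i) {
    const int rc = scan[i];
    const int idx = rc != 0;
    const int64_t abs_c = coeff[rc] < 0 ? -(int64_t)coeff[rc] : coeff[rc];
    const int32_t q = (int32_t)(((abs_c + round[idx]) * quant[idx]) >> 16);
    qcoeff[rc] = coeff[rc] < 0 ? -q : q;
    if (q) eob = (uint16_t)(i + 1);
  }
  return eob;
}

/* New shape: the tables travel inside two structs, so the signature stays
 * short and stable even if more tables are added later. */
static uint16_t sketch_quantize_new(const int32_t *coeff, intptr_t n,
                                    const struct sketch_plane *plane,
                                    int32_t *qcoeff,
                                    const struct sketch_scan_order *order) {
  return sketch_quantize_old(coeff, n, plane->round, plane->quant, qcoeff,
                             order->scan);
}
```

Both entry points compute the same result; only the way the tables reach the function differs. One consequence of such a change is that every implementation and the dispatch pointer must switch signatures together, which is consistent with the C, SSE2, SSSE3 and AVX2 declarations all carrying the new parameter list.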
+1472 -1047
File diff suppressed because it is too large
+15 -1
@@ -1,3 +1,14 @@
/*
* Copyright (c) 2025 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef VPX_SCALE_RTCD_H_
#define VPX_SCALE_RTCD_H_
@@ -46,6 +57,9 @@ void vpx_extend_frame_borders_c(struct yv12_buffer_config *ybf);
void vpx_extend_frame_inner_borders_c(struct yv12_buffer_config *ybf);
#define vpx_extend_frame_inner_borders vpx_extend_frame_inner_borders_c
void vpx_yv12_copy_frame_c(const struct yv12_buffer_config *src_ybc, struct yv12_buffer_config *dst_ybc);
#define vpx_yv12_copy_frame vpx_yv12_copy_frame_c
void vpx_yv12_copy_y_c(const struct yv12_buffer_config *src_ybc, struct yv12_buffer_config *dst_ybc);
#define vpx_yv12_copy_y vpx_yv12_copy_y_c
@@ -66,4 +80,4 @@ static void setup_rtcd_internal(void)
} // extern "C"
#endif
#endif
#endif // VPX_SCALE_RTCD_H_
+6 -6
@@ -8,16 +8,18 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include "args.h"
#include "vpx/vpx_integer.h"
#include "vpx_ports/msvc.h"
#if defined(__GNUC__) && __GNUC__
extern void die(const char *fmt, ...) __attribute__((noreturn));
#if defined(__GNUC__)
__attribute__((noreturn)) extern void die(const char *fmt, ...);
#elif defined(_MSC_VER)
__declspec(noreturn) extern void die(const char *fmt, ...);
#else
extern void die(const char *fmt, ...);
#endif
@@ -81,6 +83,7 @@ const char *arg_next(struct arg *arg) {
char **argv_dup(int argc, const char **argv) {
char **new_argv = malloc((argc + 1) * sizeof(*argv));
if (!new_argv) return NULL;
memcpy(new_argv, argv, argc * sizeof(*argv));
new_argv[argc] = NULL;
@@ -132,7 +135,6 @@ unsigned int arg_parse_uint(const struct arg *arg) {
}
die("Option %s: Invalid character '%c'\n", arg->name, *endptr);
return 0;
}
int arg_parse_int(const struct arg *arg) {
@@ -149,7 +151,6 @@ int arg_parse_int(const struct arg *arg) {
}
die("Option %s: Invalid character '%c'\n", arg->name, *endptr);
return 0;
}
struct vpx_rational {
@@ -206,7 +207,6 @@ int arg_parse_enum(const struct arg *arg) {
if (!strcmp(arg->val, listptr->name)) return listptr->val;
die("Option %s: Invalid value '%s'\n", arg->name, arg->val);
return 0;
}
int arg_parse_enum_or_int(const struct arg *arg) {
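Two ideas recur in the args.c hunks above: die() is declared as non-returning for both GCC-style and MSVC compilers, with `return 0;` lines appearing after the die() calls in the parse functions, and argv_dup() checks the malloc() result before copying. A minimal sketch of both ideas follows. The SKETCH_NORETURN macro and all sketch_* names are hypothetical, and the parsing logic is simplified rather than copied from libvpx.

```c
#include <limits.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Pick the compiler-specific spelling of "this function never returns". */
#if defined(__GNUC__)
#define SKETCH_NORETURN __attribute__((noreturn))
#elif defined(_MSC_VER)
#define SKETCH_NORETURN __declspec(noreturn)
#else
#define SKETCH_NORETURN
#endif

/* Print a formatted message to stderr and terminate the process. */
SKETCH_NORETURN static void sketch_die(const char *fmt, ...) {
  va_list ap;
  va_start(ap, fmt);
  vfprintf(stderr, fmt, ap);
  va_end(ap);
  exit(EXIT_FAILURE);
}

/* Parse a decimal unsigned value. Because sketch_die() is marked noreturn,
 * compilers that honor the annotation do not expect a return statement
 * after the call. */
static unsigned int sketch_parse_uint(const char *name, const char *val) {
  char *endptr;
  const unsigned long v = strtoul(val, &endptr, 10);
  if (endptr != val && *endptr == '\0' && v <= UINT_MAX) {
    return (unsigned int)v;
  }
  sketch_die("Option %s: Invalid value '%s'\n", name, val);
}

/* Shallow-copy an argv array, returning NULL if the allocation fails
 * instead of passing a null pointer to memcpy(). */
static char **sketch_argv_dup(int argc, const char **argv) {
  char **new_argv = malloc((argc + 1) * sizeof(*argv));
  if (!new_argv) return NULL;
  memcpy(new_argv, argv, argc * sizeof(*argv));
  new_argv[argc] = NULL;
  return new_argv;
}
```

The copy is shallow: the returned array is owned by the caller and released with free(), while the strings it points to still belong to the original argv.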
+3 -3
@@ -8,8 +8,8 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef ARGS_H_
#define ARGS_H_
#ifndef VPX_ARGS_H_
#define VPX_ARGS_H_
#include <stdio.h>
#ifdef __cplusplus
@@ -60,4 +60,4 @@ int arg_parse_enum_or_int(const struct arg *arg);
} // extern "C"
#endif
#endif // ARGS_H_
#endif // VPX_ARGS_H_
-2
@@ -1,2 +0,0 @@
*-vs8/*.rules -crlf
*-msvs/*.rules -crlf
-1
@@ -1 +0,0 @@
x86*-win32-vs*
+21 -33
@@ -8,17 +8,15 @@
## be found in the AUTHORS file in the root of the source tree.
##
# Ignore this file during non-NDK builds.
ifdef NDK_ROOT
#
# This file is to be used for compiling libvpx for Android using the NDK.
# In an Android project place a libvpx checkout in the jni directory.
# Run the configure script from the jni directory. Base libvpx
# encoder/decoder configuration will look similar to:
# ./libvpx/configure --target=armv7-android-gcc --disable-examples \
# --sdk-path=/opt/android-ndk-r6b/
#
# When targeting Android, realtime-only is enabled by default. This can
# be overridden by adding the command line flag:
# --disable-realtime-only
# ./libvpx/configure --target=arm64-android-gcc --disable-examples \
# --enable-external-build
#
# This will create .mk files that contain variables that contain the
# source files to compile.
@@ -29,37 +27,23 @@
# include $(CLEAR_VARS)
# include jni/libvpx/build/make/Android.mk
#
# By default libvpx will detect at runtime the existance of NEON extension.
# For this we import the 'cpufeatures' module from the NDK sources.
# libvpx can also be configured without this runtime detection method.
# Configuring with --disable-runtime-cpu-detect will assume presence of NEON.
# Configuring with --disable-runtime-cpu-detect --disable-neon \
# --disable-neon-asm
# will remove any NEON dependency.
# By default libvpx will use the 'cpufeatures' module from the NDK. This allows
# the library to be built with all available optimizations (SSE2->AVX512 for
# x86, NEON for arm, DSPr2 for mips). This can be disabled with
# --disable-runtime-cpu-detect
# but the resulting library *must* be run on devices supporting all of the
# enabled extensions. They can be disabled individually with
# --disable-{sse2, sse3, ssse3, sse4_1, avx, avx2, avx512}
# --disable-neon{, -asm, -neon-dotprod, -neon-i8mm}
# --disable-sve
# --disable-{dspr2, msa}
#
# Running ndk-build will build libvpx and include it in your project.
# Running ndk-build will build libvpx and include it in your project. Set
# APP_ABI to match the --target passed to configure:
# https://developer.android.com/ndk/guides/application_mk#app_abi.
#
# Alternatively, building the examples and unit tests can be accomplished in the
# following way:
#
# Create a standalone toolchain from the NDK:
# https://developer.android.com/ndk/guides/standalone_toolchain.html
#
# For example - to test on arm64 devices with clang:
# $NDK/build/tools/make_standalone_toolchain.py \
# --arch arm64 --install-dir=/tmp/my-android-toolchain
# export PATH=/tmp/my-android-toolchain/bin:$PATH
# CROSS=aarch64-linux-android- CC=clang CXX=clang++ /path/to/libvpx/configure \
# --target=arm64-android-gcc
#
# Push the resulting binaries to a device and run them:
# adb push test_libvpx /data/tmp/test_libvpx
# adb shell /data/tmp/test_libvpx --gtest_filter=\*Sixtap\*
#
# Make sure to push the test data as well and set LIBVPX_TEST_DATA
CONFIG_DIR := $(LOCAL_PATH)/
LIBVPX_PATH := $(LOCAL_PATH)/libvpx
ASM_CNV_PATH_LOCAL := $(TARGET_ARCH_ABI)/ads2gas
@@ -183,6 +167,9 @@ LOCAL_CFLAGS += \
-I$(ASM_CNV_PATH)/libvpx
LOCAL_MODULE := libvpx
LOCAL_LICENSE_KINDS := SPDX-license-identifier-BSD
LOCAL_LICENSE_CONDITIONS := notice
LOCAL_NOTICE_FILE := $(LOCAL_PATH)/../../LICENSE $(LOCAL_PATH)/../../PATENTS
ifeq ($(CONFIG_RUNTIME_CPU_DETECT),yes)
LOCAL_STATIC_LIBRARIES := cpufeatures
@@ -226,3 +213,4 @@ endif
ifeq ($(CONFIG_RUNTIME_CPU_DETECT),yes)
$(call import-module,android/cpufeatures)
endif
endif # NDK_ROOT
+54 -12
@@ -21,9 +21,9 @@ all: .DEFAULT
clean:: .DEFAULT
exampletest: .DEFAULT
install:: .DEFAULT
test:: .DEFAULT
test-no-data-check:: .DEFAULT
testdata:: .DEFAULT
test: .DEFAULT
test-no-data-check: .DEFAULT
testdata: .DEFAULT
utiltest: .DEFAULT
exampletest-no-data-check utiltest-no-data-check: .DEFAULT
test_%: .DEFAULT ;
@@ -99,6 +99,7 @@ distclean: clean
rm -f Makefile; \
rm -f config.log config.mk; \
rm -f vpx_config.[hc] vpx_config.asm; \
rm -f arm_neon.h; \
else \
rm -f $(target)-$(TOOLCHAIN).mk; \
fi
@@ -110,13 +111,13 @@ exampletest:
.PHONY: install
install::
.PHONY: test
test::
test:
.PHONY: testdata
testdata::
testdata:
.PHONY: utiltest
utiltest:
.PHONY: test-no-data-check exampletest-no-data-check utiltest-no-data-check
test-no-data-check::
test-no-data-check:
exampletest-no-data-check utiltest-no-data-check:
# Force to realign stack always on OS/2
@@ -124,6 +125,7 @@ ifeq ($(TOOLCHAIN), x86-os2-gcc)
CFLAGS += -mstackrealign
endif
# x86[_64]
$(BUILD_PFX)%_mmx.c.d: CFLAGS += -mmmx
$(BUILD_PFX)%_mmx.c.o: CFLAGS += -mmmx
$(BUILD_PFX)%_sse2.c.d: CFLAGS += -msse2
@@ -138,6 +140,32 @@ $(BUILD_PFX)%_avx.c.d: CFLAGS += -mavx
$(BUILD_PFX)%_avx.c.o: CFLAGS += -mavx
$(BUILD_PFX)%_avx2.c.d: CFLAGS += -mavx2
$(BUILD_PFX)%_avx2.c.o: CFLAGS += -mavx2
$(BUILD_PFX)%_avx512.c.d: CFLAGS += -mavx512f -mavx512cd -mavx512bw -mavx512dq -mavx512vl
$(BUILD_PFX)%_avx512.c.o: CFLAGS += -mavx512f -mavx512cd -mavx512bw -mavx512dq -mavx512vl
# AARCH64
$(BUILD_PFX)%_neon_dotprod.c.d: CFLAGS += -march=armv8.2-a+dotprod
$(BUILD_PFX)%_neon_dotprod.c.o: CFLAGS += -march=armv8.2-a+dotprod
$(BUILD_PFX)%_neon_i8mm.c.d: CFLAGS += -march=armv8.2-a+dotprod+i8mm
$(BUILD_PFX)%_neon_i8mm.c.o: CFLAGS += -march=armv8.2-a+dotprod+i8mm
$(BUILD_PFX)%_sve.c.d: CFLAGS += -march=armv8.2-a+dotprod+i8mm+sve
$(BUILD_PFX)%_sve.c.o: CFLAGS += -march=armv8.2-a+dotprod+i8mm+sve
$(BUILD_PFX)%_sve2.c.d: CFLAGS += -march=armv9-a+sve2
$(BUILD_PFX)%_sve2.c.o: CFLAGS += -march=armv9-a+sve2
# POWER
$(BUILD_PFX)%_vsx.c.d: CFLAGS += -maltivec -mvsx
$(BUILD_PFX)%_vsx.c.o: CFLAGS += -maltivec -mvsx
# MIPS
$(BUILD_PFX)%_msa.c.d: CFLAGS += -mmsa
$(BUILD_PFX)%_msa.c.o: CFLAGS += -mmsa
# LOONGARCH
$(BUILD_PFX)%_lsx.c.d: CFLAGS += -mlsx
$(BUILD_PFX)%_lsx.c.o: CFLAGS += -mlsx
$(BUILD_PFX)%_lasx.c.d: CFLAGS += -mlasx
$(BUILD_PFX)%_lasx.c.o: CFLAGS += -mlasx
$(BUILD_PFX)%.c.d: %.c
$(if $(quiet),@echo " [DEP] $@")
@@ -286,6 +314,19 @@ $(1):
$(qexec)$$(AR) $$(ARFLAGS) $$@ $$^
endef
# Don't use -Wl,-z,defs with Clang's sanitizers.
#
# Clang's AddressSanitizer documentation says "When linking shared libraries,
# the AddressSanitizer run-time is not linked, so -Wl,-z,defs may cause link
# errors (don't use it with AddressSanitizer)." See
# https://clang.llvm.org/docs/AddressSanitizer.html#usage.
NO_UNDEFINED := -Wl,-z,defs
ifeq ($(findstring clang,$(CC)),clang)
ifneq ($(filter -fsanitize=%,$(LDFLAGS)),)
NO_UNDEFINED :=
endif
endif
define so_template
# Not using a pattern rule here because we don't want to generate empty
# archives when they are listed as a dependency in files not responsible
@@ -295,7 +336,8 @@ define so_template
$(1):
$(if $(quiet),@echo " [LD] $$@")
$(qexec)$$(LD) -shared $$(LDFLAGS) \
-Wl,--no-undefined -Wl,-soname,$$(SONAME) \
$(NO_UNDEFINED) \
-Wl,-soname,$$(SONAME) \
-Wl,--version-script,$$(EXPORTS_FILE) -o $$@ \
$$(filter %.o,$$^) $$(extralibs)
endef
@@ -422,10 +464,10 @@ ifneq ($(call enabled,DIST-SRCS),)
DIST-SRCS-$(CONFIG_MSVS) += build/make/gen_msvs_vcxproj.sh
DIST-SRCS-$(CONFIG_MSVS) += build/make/msvs_common.sh
DIST-SRCS-$(CONFIG_RVCT) += build/make/armlink_adapter.sh
DIST-SRCS-$(ARCH_ARM) += build/make/ads2gas.pl
DIST-SRCS-$(ARCH_ARM) += build/make/ads2gas_apple.pl
DIST-SRCS-$(ARCH_ARM) += build/make/ads2armasm_ms.pl
DIST-SRCS-$(ARCH_ARM) += build/make/thumb.pm
DIST-SRCS-$(VPX_ARCH_ARM) += build/make/ads2gas.pl
DIST-SRCS-$(VPX_ARCH_ARM) += build/make/ads2gas_apple.pl
DIST-SRCS-$(VPX_ARCH_ARM) += build/make/ads2armasm_ms.pl
DIST-SRCS-$(VPX_ARCH_ARM) += build/make/thumb.pm
DIST-SRCS-yes += $(target:-$(TOOLCHAIN)=).mk
endif
INSTALL-SRCS := $(call cond_enabled,CONFIG_INSTALL_SRCS,INSTALL-SRCS)
@@ -447,6 +489,6 @@ INSTALL_TARGETS += .install-docs .install-srcs .install-libs .install-bins
all: $(BUILD_TARGETS)
install:: $(INSTALL_TARGETS)
dist: $(INSTALL_TARGETS)
test::
test:
.SUFFIXES: # Delete default suffix rules
+1 -1
@@ -28,7 +28,7 @@ while (<STDIN>)
s/qsubaddx/qsax/i;
s/qaddsubx/qasx/i;
thumb::FixThumbInstructions($_, 1);
thumb::FixThumbInstructions($_);
s/ldrneb/ldrbne/i;
s/ldrneh/ldrhne/i;
+45 -116
@@ -23,16 +23,17 @@ use lib $FindBin::Bin;
use thumb;
my $thumb = 0;
my $elf = 1;
foreach my $arg (@ARGV) {
$thumb = 1 if ($arg eq "-thumb");
$elf = 0 if ($arg eq "-noelf");
}
print "@ This file was created from a .asm file\n";
print "@ using the ads2gas.pl script.\n";
print "\t.equ DO1STROUNDING, 0\n";
print ".syntax unified\n";
if ($thumb) {
print "\t.syntax unified\n";
print "\t.thumb\n";
}
@@ -41,39 +42,11 @@ if ($thumb) {
while (<STDIN>)
{
undef $comment;
undef $line;
$comment_char = ";";
$comment_sub = "@";
# Handle comments.
if (/$comment_char/)
{
$comment = "";
($line, $comment) = /(.*?)$comment_char(.*)/;
$_ = $line;
}
# Load and store alignment
s/@/,:/g;
# Hexadecimal constants prefaced by 0x
s/#&/#0x/g;
# Convert :OR: to |
s/:OR:/ | /g;
# Convert :AND: to &
s/:AND:/ & /g;
# Convert :NOT: to ~
s/:NOT:/ ~ /g;
# Convert :SHL: to <<
s/:SHL:/ << /g;
# Convert :SHR: to >>
s/:SHR:/ >> /g;
# Comment character
s/;/@/;
# Convert ELSE to .else
s/\bELSE\b/.else/g;
@@ -81,128 +54,83 @@ while (<STDIN>)
# Convert ENDIF to .endif
s/\bENDIF\b/.endif/g;
# Convert ELSEIF to .elseif
s/\bELSEIF\b/.elseif/g;
# Convert LTORG to .ltorg
s/\bLTORG\b/.ltorg/g;
# Convert endfunc to nothing.
s/\bendfunc\b//ig;
# Convert FUNCTION to nothing.
s/\bFUNCTION\b//g;
s/\bfunction\b//g;
s/\bENTRY\b//g;
s/\bMSARMASM\b/0/g;
s/^\s+end\s+$//g;
# Convert IF :DEF:to .if
# gcc doesn't have the ability to do a conditional
# if defined variable that is set by IF :DEF: on
# armasm, so convert it to a normal .if and then
# make sure to define a value elsewhere
if (s/\bIF :DEF:\b/.if /g)
{
s/=/==/g;
}
# Convert IF to .if
if (s/\bIF\b/.if/g)
{
if (s/\bIF\b/.if/g) {
s/=+/==/g;
}
# Convert INCLUDE to .INCLUDE "file"
s/INCLUDE(\s*)(.*)$/.include $1\"$2\"/;
# Code directive (ARM vs Thumb)
s/CODE([0-9][0-9])/.code $1/;
s/INCLUDE\s?(.*)$/.include \"$1\"/;
# No AREA required
# But ALIGNs in AREA must be obeyed
s/^\s*AREA.*ALIGN=([0-9])$/.text\n.p2align $1/;
s/^(\s*)\bAREA\b.*ALIGN=([0-9])$/$1.text\n$1.p2align $2/;
# If no ALIGN, strip the AREA and align to 4 bytes
s/^\s*AREA.*$/.text\n.p2align 2/;
s/^(\s*)\bAREA\b.*$/$1.text\n$1.p2align 2/;
# DCD to .word
# This one is for incoming symbols
s/DCD\s+\|(\w*)\|/.long $1/;
# Make function visible to linker.
if ($elf) {
s/(\s*)EXPORT\s+\|([\$\w]*)\|/$1.global $2\n$1.type $2, function/;
} else {
s/(\s*)EXPORT\s+\|([\$\w]*)\|/$1.global $2/;
}
# DCW to .short
s/DCW\s+\|(\w*)\|/.short $1/;
s/DCW(.*)/.short $1/;
# Constants defined in scope
s/DCD(.*)/.long $1/;
s/DCB(.*)/.byte $1/;
# Make function visible to linker, and make additional symbol with
# prepended underscore
s/EXPORT\s+\|([\$\w]*)\|/.global $1 \n\t.type $1, function/;
s/IMPORT\s+\|([\$\w]*)\|/.global $1/;
s/EXPORT\s+([\$\w]*)/.global $1/;
s/export\s+([\$\w]*)/.global $1/;
# No vertical bars required; make additional symbol with prepended
# underscore
s/^\|(\$?\w+)\|/_$1\n\t$1:/g;
# No vertical bars on function names
s/^\|(\$?\w+)\|/$1/g;
# Labels need trailing colon
# s/^(\w+)/$1:/ if !/EQU/;
# put the colon at the end of the line in the macro
s/^([a-zA-Z_0-9\$]+)/$1:/ if !/EQU/;
# ALIGN directive
s/\bALIGN\b/.balign/g;
if ($thumb) {
# ARM code - we force everything to thumb with the declaration in the header
s/\sARM//g;
# ARM code - we force everything to thumb with the declaration in the
# header
s/\bARM\b//g;
} else {
# ARM code
s/\sARM/.arm/g;
s/\bARM\b/.arm/g;
}
# push/pop
s/(push\s+)(r\d+)/stmdb sp\!, \{$2\}/g;
s/(pop\s+)(r\d+)/ldmia sp\!, \{$2\}/g;
# NEON code
s/(vld1.\d+\s+)(q\d+)/$1\{$2\}/g;
s/(vtbl.\d+\s+[^,]+),([^,]+)/$1,\{$2\}/g;
if ($thumb) {
thumb::FixThumbInstructions($_, 0);
thumb::FixThumbInstructions($_);
}
# eabi_attributes numerical equivalents can be found in the
# "ARM IHI 0045C" document.
# REQUIRE8 Stack is required to be 8-byte aligned
s/\sREQUIRE8/.eabi_attribute 24, 1 \@Tag_ABI_align_needed/g;
if ($elf) {
# REQUIRE8 Stack is required to be 8-byte aligned
s/\bREQUIRE8\b/.eabi_attribute 24, 1 \@Tag_ABI_align_needed/g;
# PRESERVE8 Stack 8-byte align is preserved
s/\sPRESERVE8/.eabi_attribute 25, 1 \@Tag_ABI_align_preserved/g;
# PRESERVE8 Stack 8-byte align is preserved
s/\bPRESERVE8\b/.eabi_attribute 25, 1 \@Tag_ABI_align_preserved/g;
} else {
s/\bREQUIRE8\b//;
s/\bPRESERVE8\b//;
}
# Use PROC and ENDP to give the symbols a .size directive.
# This makes them show up properly in debugging tools like gdb and valgrind.
if (/\bPROC\b/)
{
if (/\bPROC\b/) {
my $proc;
/^_([\.0-9A-Z_a-z]\w+)\b/;
# Match the function name so it can be stored in $proc
/^([\.0-9A-Z_a-z]\w+)\b/;
$proc = $1;
push(@proc_stack, $proc) if ($proc);
s/\bPROC\b/@ $&/;
}
if (/\bENDP\b/)
{
if (/\bENDP\b/) {
my $proc;
s/\bENDP\b/@ $&/;
$proc = pop(@proc_stack);
$_ = "\t.size $proc, .-$proc".$_ if ($proc);
$_ = ".size $proc, .-$proc".$_ if ($proc and $elf);
}
# EQU directive
@@ -210,19 +138,20 @@ while (<STDIN>)
# Begin macro definition
if (/\bMACRO\b/) {
# Process next line down, which will be the macro definition
$_ = <STDIN>;
s/^/.macro/;
s/\$//g; # remove formal param reference
s/;/@/g; # change comment characters
s/\$//g; # Remove $ from the variables in the declaration
}
# For macros, use \ to reference formal params
s/\$/\\/g; # End macro definition
s/\bMEND\b/.endm/; # No need to tell it where to stop assembling
s/\$/\\/g; # Use \ to reference formal parameters
# End macro definition
s/\bMEND\b/.endm/; # No need to tell it where to stop assembling
next if /^\s*END\s*$/;
s/[ \t]+$//;
print;
print "$comment_sub$comment\n" if defined $comment;
}
# Mark that this object doesn't need an executable stack.
printf ("\t.section\t.note.GNU-stack,\"\",\%\%progbits\n");
printf (" .section .note.GNU-stack,\"\",\%\%progbits\n") if $elf;
+28 -118
@@ -20,19 +20,14 @@
print "@ This file was created from a .asm file\n";
print "@ using the ads2gas_apple.pl script.\n\n";
print "\t.set WIDE_REFERENCE, 0\n";
print "\t.set ARCHITECTURE, 5\n";
print "\t.set DO1STROUNDING, 0\n";
print ".syntax unified\n";
my %register_aliases;
my %macro_aliases;
my @mapping_list = ("\$0", "\$1", "\$2", "\$3", "\$4", "\$5", "\$6", "\$7", "\$8", "\$9");
my @incoming_array;
my @imported_functions;
# Perl trim function to remove whitespace from the start and end of the string
sub trim($)
{
@@ -48,25 +43,7 @@ while (<STDIN>)
s/@/,:/g;
# Comment character
s/;/ @/g;
# Hexadecimal constants prefaced by 0x
s/#&/#0x/g;
# Convert :OR: to |
s/:OR:/ | /g;
# Convert :AND: to &
s/:AND:/ & /g;
# Convert :NOT: to ~
s/:NOT:/ ~ /g;
# Convert :SHL: to <<
s/:SHL:/ << /g;
# Convert :SHR: to >>
s/:SHR:/ >> /g;
s/;/@/;
# Convert ELSE to .else
s/\bELSE\b/.else/g;
@@ -74,131 +51,64 @@ while (<STDIN>)
# Convert ENDIF to .endif
s/\bENDIF\b/.endif/g;
# Convert ELSEIF to .elseif
s/\bELSEIF\b/.elseif/g;
# Convert LTORG to .ltorg
s/\bLTORG\b/.ltorg/g;
# Convert IF :DEF:to .if
# gcc doesn't have the ability to do a conditional
# if defined variable that is set by IF :DEF: on
# armasm, so convert it to a normal .if and then
# make sure to define a value elsewhere
if (s/\bIF :DEF:\b/.if /g)
{
s/=/==/g;
}
# Convert IF to .if
if (s/\bIF\b/.if/g)
{
s/=/==/g;
if (s/\bIF\b/.if/g) {
s/=+/==/g;
}
# Convert INCLUDE to .INCLUDE "file"
s/INCLUDE(\s*)(.*)$/.include $1\"$2\"/;
# Code directive (ARM vs Thumb)
s/CODE([0-9][0-9])/.code $1/;
s/INCLUDE\s?(.*)$/.include \"$1\"/;
# No AREA required
# But ALIGNs in AREA must be obeyed
s/^\s*AREA.*ALIGN=([0-9])$/.text\n.p2align $1/;
s/^(\s*)\bAREA\b.*ALIGN=([0-9])$/$1.text\n$1.p2align $2/;
# If no ALIGN, strip the AREA and align to 4 bytes
s/^\s*AREA.*$/.text\n.p2align 2/;
s/^(\s*)\bAREA\b.*$/$1.text\n$1.p2align 2/;
# DCD to .word
# This one is for incoming symbols
s/DCD\s+\|(\w*)\|/.long $1/;
# Make function visible to linker.
s/EXPORT\s+\|([\$\w]*)\|/.globl _$1/;
# DCW to .short
s/DCW\s+\|(\w*)\|/.short $1/;
s/DCW(.*)/.short $1/;
# No vertical bars on function names
s/^\|(\$?\w+)\|/$1/g;
# Constants defined in scope
s/DCD(.*)/.long $1/;
s/DCB(.*)/.byte $1/;
# Labels and functions need a leading underscore and trailing colon
s/^([a-zA-Z_0-9\$]+)/_$1:/ if !/EQU/;
# Make function visible to linker, and make additional symbol with
# prepended underscore
s/EXPORT\s+\|([\$\w]*)\|/.globl _$1\n\t.globl $1/;
# Prepend imported functions with _
if (s/IMPORT\s+\|([\$\w]*)\|/.globl $1/)
{
$function = trim($1);
push(@imported_functions, $function);
}
foreach $function (@imported_functions)
{
s/$function/_$function/;
}
# No vertical bars required; make additional symbol with prepended
# underscore
s/^\|(\$?\w+)\|/_$1\n\t$1:/g;
# Labels need trailing colon
# s/^(\w+)/$1:/ if !/EQU/;
# put the colon at the end of the line in the macro
s/^([a-zA-Z_0-9\$]+)/$1:/ if !/EQU/;
# Branches need to call the correct, underscored, function
s/^(\s+b[egln]?[teq]?\s+)([a-zA-Z_0-9\$]+)/$1 _$2/ if !/EQU/;
# ALIGN directive
s/\bALIGN\b/.balign/g;
# Strip ARM
s/\sARM/@ ARM/g;
s/\s+ARM//;
# Strip REQUIRE8
#s/\sREQUIRE8/@ REQUIRE8/g;
s/\sREQUIRE8/@ /g;
s/\s+REQUIRE8//;
# Strip PRESERVE8
s/\sPRESERVE8/@ PRESERVE8/g;
s/\s+PRESERVE8//;
# Strip PROC and ENDPROC
s/\bPROC\b/@/g;
s/\bENDP\b/@/g;
s/\bPROC\b//g;
s/\bENDP\b//g;
# EQU directive
s/(.*)EQU(.*)/.set $1, $2/;
s/(\S+\s+)EQU(\s+\S+)/.equ $1, $2/;
# Begin macro definition
if (/\bMACRO\b/)
{
if (/\bMACRO\b/) {
# Process next line down, which will be the macro definition
$_ = <STDIN>;
$trimmed = trim($_);
# remove commas that are separating list
$trimmed =~ s/,//g;
# string to array
@incoming_array = split(/\s+/, $trimmed);
print ".macro @incoming_array[0]\n";
# remove the first element, as that is the name of the macro
shift (@incoming_array);
@macro_aliases{@incoming_array} = @mapping_list;
next;
s/^/.macro/;
s/\$//g; # Remove $ from the variables in the declaration
}
while (($key, $value) = each(%macro_aliases))
{
$key =~ s/\$/\\\$/;
s/$key\b/$value/g;
}
s/\$/\\/g; # Use \ to reference formal parameters
# End macro definition
# For macros, use \ to reference formal params
# s/\$/\\/g; # End macro definition
s/\bMEND\b/.endm/; # No need to tell it where to stop assembling
s/\bMEND\b/.endm/; # No need to tell it where to stop assembling
next if /^\s*END\s*$/;
s/[ \t]+$//;
print;
}
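Note on the ads2gas_apple.pl changes: the simplified per-line conversion can be approximated for a single line with sed. This is only an illustrative sketch, not the script itself. The function name `ads_to_gas_line` is hypothetical, and only three of the substitutions from the diff are reproduced (first `;` to `@`, `ENDIF` to `.endif`, trailing whitespace removal), without the `\b` word-boundary anchors the Perl uses.

```shell
#!/bin/sh
# Hypothetical helper (not part of libvpx): applies a small subset of the
# ads2gas_apple.pl substitutions to armasm-style lines read from stdin.
ads_to_gas_line() {
  # 1. Replace the first ';' (armasm comment character) with '@'.
  # 2. Convert ENDIF to .endif (no word-boundary check in this sketch).
  # 3. Strip trailing whitespace, as the script does before printing.
  sed -e 's/;/@/' \
      -e 's/ENDIF/.endif/g' \
      -e 's/[[:space:]]*$//'
}

printf '%s\n' '    ENDIF ; end of block   ' | ads_to_gas_line
```

The substitutions are applied in the same order as in the script, so the comment character is rewritten before any directive is converted.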
+452 -199
@@ -74,6 +74,8 @@ Build options:
--cpu=CPU optimize for a specific cpu rather than a family
--extra-cflags=ECFLAGS add ECFLAGS to CFLAGS [$CFLAGS]
--extra-cxxflags=ECXXFLAGS add ECXXFLAGS to CXXFLAGS [$CXXFLAGS]
--use-profile=PROFILE_FILE
Use PROFILE_FILE for PGO
${toggle_extra_warnings} emit harmless warnings (always non-fatal)
${toggle_werror} treat warnings as errors, if possible
(not available with all compilers)
@@ -81,6 +83,7 @@ Build options:
${toggle_pic} turn on/off Position Independent Code
${toggle_ccache} turn on/off compiler cache
${toggle_debug} enable/disable debug mode
${toggle_profile} enable/disable profiling
${toggle_gprof} enable/disable gprof profiling instrumentation
${toggle_gcov} enable/disable gcov coverage instrumentation
${toggle_thumb} enable/disable building arm assembly in thumb mode
@@ -262,6 +265,9 @@ if [ -z "$source_path" ] || [ "$source_path" = "." ]; then
source_path="`pwd`"
disable_feature source_path_used
fi
# Makefiles greedily process the '#' character as a comment, even if it is
# inside quotes. So, this character must be escaped in all paths in Makefiles.
source_path_mk=$(echo $source_path | sed -e 's;\#;\\\#;g')
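Note on the `source_path_mk` escaping: the sed expression above can be tried in isolation. A minimal sketch, assuming a POSIX shell and GNU-compatible sed. The wrapper name `escape_hash_for_make` and the sample path are hypothetical and do not appear in configure.

```shell
#!/bin/sh
# Hypothetical wrapper around the sed expression from the diff: prefixes
# every '#' in a path with a backslash so make does not read it as the
# start of a comment.
escape_hash_for_make() {
  printf '%s\n' "$1" | sed -e 's;\#;\\\#;g'
}

# Sample path containing a '#'; paths without one pass through unchanged.
escape_hash_for_make '/home/user/src#1/libvpx'
```

This is why the generated config.mk uses `source_path_mk` for `SRC_PATH`, `SRC_PATH_BARE`, and `ASM_CONVERSION` rather than the raw `source_path`.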
if test ! -z "$TMPDIR" ; then
TMPDIRx="${TMPDIR}"
@@ -319,6 +325,12 @@ check_ld() {
&& check_cmd ${LD} ${LDFLAGS} "$@" -o ${TMP_X} ${TMP_O} ${extralibs}
}
check_lib() {
log check_lib "$@"
check_cc $@ \
&& check_cmd ${LD} ${LDFLAGS} -o ${TMP_X} ${TMP_O} "$@" ${extralibs}
}
check_header(){
log check_header "$@"
header=$1
@@ -403,6 +415,90 @@ check_gcc_machine_option() {
fi
}
# tests for -m$2, -m$3, -m$4... toggling the feature given in $1.
check_gcc_machine_options() {
feature="$1"
shift
flags="-m$1"
shift
for opt in $*; do
flags="$flags -m$opt"
done
if enabled gcc && ! disabled "$feature" && ! check_cflags $flags; then
RTCD_OPTIONS="${RTCD_OPTIONS}--disable-$feature "
else
soft_enable "$feature"
fi
}
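Note on `check_gcc_machine_options`: the flag-building loop can be exercised on its own. A minimal sketch, assuming a POSIX shell. The helper name `build_machine_flags` is hypothetical, and the feature argument, the `check_cflags` probe, and the `RTCD_OPTIONS` update from the real function are deliberately left out.

```shell
#!/bin/sh
# Hypothetical helper mirroring only the flag-building loop: every argument
# becomes a -m<option> compiler flag, joined by single spaces.
build_machine_flags() {
  flags="-m$1"
  shift
  for opt in "$@"; do
    flags="$flags -m$opt"
  done
  printf '%s\n' "$flags"
}

# Builds the flag string that would be handed to the compiler check.
build_machine_flags avx512f avx512cd avx512bw
```

Because all options are tested together, a feature is kept only when the compiler accepts the whole set of flags, not just one of them.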
check_neon_sve_bridge_compiles() {
if enabled sve; then
check_cc -march=armv8.2-a+dotprod+i8mm+sve <<EOF
#ifndef __ARM_NEON_SVE_BRIDGE
#error 1
#endif
#include <arm_sve.h>
#include <arm_neon_sve_bridge.h>
EOF
compile_result=$?
if [ ${compile_result} -eq 0 ]; then
# Check whether the compiler can compile SVE functions that require
# backup/restore of SVE registers according to AAPCS. Clang for Windows
# used to fail this, see
# https://github.com/llvm/llvm-project/issues/80009.
check_cc -march=armv8.2-a+dotprod+i8mm+sve <<EOF
#include <arm_sve.h>
void other(void);
svfloat32_t func(svfloat32_t a) {
other();
return a;
}
EOF
compile_result=$?
fi
if [ ${compile_result} -ne 0 ]; then
log_echo " disabling sve: arm_neon_sve_bridge.h not supported by compiler"
log_echo " disabling sve2: arm_neon_sve_bridge.h not supported by compiler"
disable_feature sve
disable_feature sve2
RTCD_OPTIONS="${RTCD_OPTIONS}--disable-sve --disable-sve2 "
fi
fi
}
check_gcc_avx512_compiles() {
if disabled gcc; then
return
fi
check_cc -mavx512f <<EOF
#include <immintrin.h>
void f(void) {
__m512i x = _mm512_set1_epi16(0);
(void)x;
}
EOF
compile_result=$?
if [ ${compile_result} -ne 0 ]; then
log_echo " disabling avx512: not supported by compiler"
disable_feature avx512
RTCD_OPTIONS="${RTCD_OPTIONS}--disable-avx512 "
fi
}
check_inline_asm() {
log check_inline_asm "$@"
name="$1"
code="$2"
shift 2
disable_feature $name
check_cc "$@" <<EOF && enable_feature $name
void foo(void) { __asm__ volatile($code); }
EOF
}
write_common_config_banner() {
print_webm_license config.mk "##" ""
echo '# This file automatically generated by configure. Do not edit!' >> config.mk
@@ -438,11 +534,11 @@ write_common_target_config_mk() {
cat >> $1 << EOF
# This file automatically generated by configure. Do not edit!
SRC_PATH="$source_path"
SRC_PATH_BARE=$source_path
SRC_PATH="$source_path_mk"
SRC_PATH_BARE=$source_path_mk
BUILD_PFX=${BUILD_PFX}
TOOLCHAIN=${toolchain}
ASM_CONVERSION=${asm_conversion_cmd:-${source_path}/build/make/ads2gas.pl}
ASM_CONVERSION=${asm_conversion_cmd:-${source_path_mk}/build/make/ads2gas.pl}
GEN_VCPROJ=${gen_vcproj_cmd}
MSVS_ARCH_DIR=${msvs_arch_dir}
@@ -452,7 +548,6 @@ AR=${AR}
LD=${LD}
AS=${AS}
STRIP=${STRIP}
NM=${NM}
CFLAGS = ${CFLAGS}
CXXFLAGS = ${CXXFLAGS}
@@ -464,6 +559,8 @@ AS_SFX = ${AS_SFX:-.asm}
EXE_SFX = ${EXE_SFX}
VCPROJ_SFX = ${VCPROJ_SFX}
RTCD_OPTIONS = ${RTCD_OPTIONS}
LIBWEBM_CXXFLAGS = ${LIBWEBM_CXXFLAGS}
LIBYUV_CXXFLAGS = ${LIBYUV_CXXFLAGS}
EOF
if enabled rvct; then cat >> $1 << EOF
@@ -474,10 +571,10 @@ fmt_deps = sed -e 's;^\([a-zA-Z0-9_]*\)\.o;\${@:.d=.o} \$@;'
EOF
fi
print_config_mk ARCH "${1}" ${ARCH_LIST}
print_config_mk HAVE "${1}" ${HAVE_LIST}
print_config_mk CONFIG "${1}" ${CONFIG_LIST}
print_config_mk HAVE "${1}" gnu_strip
print_config_mk VPX_ARCH "${1}" ${ARCH_LIST}
print_config_mk HAVE "${1}" ${HAVE_LIST}
print_config_mk CONFIG "${1}" ${CONFIG_LIST}
print_config_mk HAVE "${1}" gnu_strip
enabled msvs && echo "CONFIG_VS_VERSION=${vs_version}" >> "${1}"
@@ -494,15 +591,33 @@ write_common_target_config_h() {
#define RESTRICT ${RESTRICT}
#define INLINE ${INLINE}
EOF
print_config_h ARCH "${TMP_H}" ${ARCH_LIST}
print_config_h HAVE "${TMP_H}" ${HAVE_LIST}
print_config_h CONFIG "${TMP_H}" ${CONFIG_LIST}
print_config_vars_h "${TMP_H}" ${VAR_LIST}
print_config_h VPX_ARCH "${TMP_H}" ${ARCH_LIST}
print_config_h HAVE "${TMP_H}" ${HAVE_LIST}
print_config_h CONFIG "${TMP_H}" ${CONFIG_LIST}
print_config_vars_h "${TMP_H}" ${VAR_LIST}
echo "#endif /* VPX_CONFIG_H */" >> ${TMP_H}
mkdir -p `dirname "$1"`
cmp "$1" ${TMP_H} >/dev/null 2>&1 || mv ${TMP_H} "$1"
}
write_win_arm64_neon_h_workaround() {
print_webm_license ${TMP_H} "/*" " */"
cat >> ${TMP_H} << EOF
/* This file automatically generated by configure. Do not edit! */
#ifndef VPX_WIN_ARM_NEON_H_WORKAROUND
#define VPX_WIN_ARM_NEON_H_WORKAROUND
/* The Windows SDK has arm_neon.h, but unlike on other platforms it is
* ARM32-only. ARM64 NEON support is provided by arm64_neon.h, a proper
* superset of arm_neon.h. Work around this by providing a more local
* arm_neon.h that simply #includes arm64_neon.h.
*/
#include <arm64_neon.h>
#endif /* VPX_WIN_ARM_NEON_H_WORKAROUND */
EOF
mkdir -p `dirname "$1"`
cmp "$1" ${TMP_H} >/dev/null 2>&1 || mv ${TMP_H} "$1"
}
process_common_cmdline() {
for opt in "$@"; do
optval="${opt#*=}"
@@ -534,6 +649,9 @@ process_common_cmdline() {
--extra-cxxflags=*)
extra_cxxflags="${optval}"
;;
--use-profile=*)
pgo_file=${optval}
;;
--enable-?*|--disable-?*)
eval `echo "$opt" | sed 's/--/action=/;s/-/ option=/;s/-/_/g'`
if is_in ${option} ${ARCH_EXT_LIST}; then
@@ -585,11 +703,7 @@ process_common_cmdline() {
--libdir=*)
libdir="${optval}"
;;
--sdk-path=*)
[ -d "${optval}" ] || die "Not a directory: ${optval}"
sdk_path="${optval}"
;;
--libc|--as|--prefix|--libdir|--sdk-path)
--libc|--as|--prefix|--libdir)
die "Option ${opt} requires argument"
;;
--help|-h)
@@ -634,7 +748,6 @@ setup_gnu_toolchain() {
LD=${LD:-${CROSS}${link_with_cc:-ld}}
AS=${AS:-${CROSS}as}
STRIP=${STRIP:-${CROSS}strip}
NM=${NM:-${CROSS}nm}
AS_SFX=.S
EXE_SFX=
}
@@ -674,7 +787,6 @@ check_xcode_minimum_version() {
process_common_toolchain() {
if [ -z "$toolchain" ]; then
gcctarget="${CHOST:-$(gcc -dumpmachine 2> /dev/null)}"
# detect tgt_isa
case "$gcctarget" in
aarch64*)
@@ -697,37 +809,39 @@ process_common_toolchain() {
*sparc*)
tgt_isa=sparc
;;
power*64le*-*)
tgt_isa=ppc64le
;;
*mips64el*)
tgt_isa=mips64
;;
*mips32el*)
tgt_isa=mips32
;;
loongarch32*)
tgt_isa=loongarch32
;;
loongarch64*)
tgt_isa=loongarch64
;;
esac
# detect tgt_os
case "$gcctarget" in
*darwin10*)
*darwin1[0-9]*)
tgt_isa=x86_64
tgt_os=darwin10
tgt_os=`echo $gcctarget | sed 's/.*\(darwin1[0-9]\).*/\1/'`
;;
*darwin11*)
tgt_isa=x86_64
tgt_os=darwin11
;;
*darwin12*)
tgt_isa=x86_64
tgt_os=darwin12
;;
*darwin13*)
tgt_isa=x86_64
tgt_os=darwin13
;;
*darwin14*)
tgt_isa=x86_64
tgt_os=darwin14
;;
*darwin15*)
tgt_isa=x86_64
tgt_os=darwin15
*darwin2[0-4]*)
tgt_isa=`uname -m`
tgt_os=`echo $gcctarget | sed 's/.*\(darwin2[0-9]\).*/\1/'`
;;
x86_64*mingw32*)
tgt_os=win64
;;
x86_64*cygwin*)
tgt_os=win64
;;
*mingw32*|*cygwin*)
[ -z "$tgt_isa" ] && tgt_isa=x86
tgt_os=win32
@@ -769,16 +883,34 @@ process_common_toolchain() {
# Enable the architecture family
case ${tgt_isa} in
arm64 | armv8)
enable_feature arm
enable_feature aarch64
;;
arm*)
enable_feature arm
;;
mips*)
enable_feature mips
;;
ppc*)
enable_feature ppc
;;
loongarch*)
soft_enable lsx
soft_enable lasx
enable_feature loongarch
;;
esac
# PIC is probably what we want when building shared libs
# Position independent code (PIC) is probably what we want when building
# shared libs or position independent executable (PIE) targets.
enabled shared && soft_enable pic
check_cpp << EOF || soft_enable pic
#if !(__pie__ || __PIE__)
#error Neither __pie__ or __PIE__ are set
#endif
EOF
# Minimum iOS version for all target platforms (darwin and iphonesimulator).
# Shared library framework builds are only possible on iOS 8 and later.
@@ -787,13 +919,13 @@ process_common_toolchain() {
IOS_VERSION_MIN="8.0"
else
IOS_VERSION_OPTIONS=""
IOS_VERSION_MIN="6.0"
IOS_VERSION_MIN="7.0"
fi
# Handle darwin variants. Newer SDKs allow targeting older
# platforms, so use the newest one available.
case ${toolchain} in
arm*-darwin*)
arm*-darwin-*)
add_cflags "-miphoneos-version-min=${IOS_VERSION_MIN}"
iphoneos_sdk_dir="$(show_darwin_sdk_path iphoneos)"
if [ -d "${iphoneos_sdk_dir}" ]; then
@@ -801,7 +933,7 @@ process_common_toolchain() {
add_ldflags "-isysroot ${iphoneos_sdk_dir}"
fi
;;
x86*-darwin*)
*-darwin*)
osx_sdk_dir="$(show_darwin_sdk_path macosx)"
if [ -d "${osx_sdk_dir}" ]; then
add_cflags "-isysroot ${osx_sdk_dir}"
@@ -843,6 +975,26 @@ process_common_toolchain() {
add_cflags "-mmacosx-version-min=10.11"
add_ldflags "-mmacosx-version-min=10.11"
;;
*-darwin16-*)
add_cflags "-mmacosx-version-min=10.12"
add_ldflags "-mmacosx-version-min=10.12"
;;
*-darwin17-*)
add_cflags "-mmacosx-version-min=10.13"
add_ldflags "-mmacosx-version-min=10.13"
;;
*-darwin18-*)
add_cflags "-mmacosx-version-min=10.14"
add_ldflags "-mmacosx-version-min=10.14"
;;
*-darwin19-*)
add_cflags "-mmacosx-version-min=10.15"
add_ldflags "-mmacosx-version-min=10.15"
;;
*-darwin2[0-4]-*)
add_cflags "-arch ${toolchain%%-*}"
add_ldflags "-arch ${toolchain%%-*}"
;;
*-iphonesimulator-*)
add_cflags "-miphoneos-version-min=${IOS_VERSION_MIN}"
add_ldflags "-miphoneos-version-min=${IOS_VERSION_MIN}"
@@ -864,34 +1016,36 @@ process_common_toolchain() {
;;
esac
# Process ARM architecture variants
# Process architecture variants
case ${toolchain} in
arm*)
# on arm, isa versions are supersets
case ${tgt_isa} in
arm64|armv8)
soft_enable neon
case ${toolchain} in
armv7*-darwin*)
# Runtime cpu detection is not defined for these targets.
enabled runtime_cpu_detect && disable_feature runtime_cpu_detect
;;
armv7|armv7s)
soft_enable neon
# Only enable neon_asm when neon is also enabled.
enabled neon && soft_enable neon_asm
# If someone tries to force it through, die.
if disabled neon && enabled neon_asm; then
die "Disabling neon while keeping neon-asm is not supported"
fi
*)
soft_enable runtime_cpu_detect
;;
esac
asm_conversion_cmd="cat"
if [ ${tgt_isa} = "armv7" ] || [ ${tgt_isa} = "armv7s" ]; then
soft_enable neon
# Only enable neon_asm when neon is also enabled.
enabled neon && soft_enable neon_asm
# If someone tries to force it through, die.
if disabled neon && enabled neon_asm; then
die "Disabling neon while keeping neon-asm is not supported"
fi
fi
asm_conversion_cmd="cat"
case ${tgt_cc} in
gcc)
link_with_cc=gcc
setup_gnu_toolchain
arch_int=${tgt_isa##armv}
arch_int=${arch_int%%te}
check_add_asflags --defsym ARCHITECTURE=${arch_int}
tune_cflags="-mtune="
if [ ${tgt_isa} = "armv7" ] || [ ${tgt_isa} = "armv7s" ]; then
if [ -z "${float_abi}" ]; then
@@ -917,7 +1071,17 @@ EOF
fi
enabled debug && add_asflags -g
asm_conversion_cmd="${source_path}/build/make/ads2gas.pl"
asm_conversion_cmd="${source_path_mk}/build/make/ads2gas.pl"
case ${tgt_os} in
win*)
asm_conversion_cmd="$asm_conversion_cmd -noelf"
AS="$CC -c"
EXE_SFX=.exe
enable_feature thumb
;;
esac
if enabled thumb; then
asm_conversion_cmd="$asm_conversion_cmd -thumb"
check_add_cflags -mthumb
@@ -925,18 +1089,44 @@ EOF
fi
;;
vs*)
asm_conversion_cmd="${source_path}/build/make/ads2armasm_ms.pl"
AS_SFX=.S
msvs_arch_dir=arm-msvs
disable_feature multithread
disable_feature unit_tests
vs_version=${tgt_cc##vs}
if [ $vs_version -ge 12 ]; then
# MSVC 2013 doesn't allow doing plain .exe projects for ARM,
# only "AppContainerApplication" which requires an AppxManifest.
# Therefore disable the examples, just build the library.
disable_feature examples
disable_feature tools
# A number of ARM-based Windows platforms are constrained by their
# respective SDKs' limitations. Fortunately, these are all 32-bit ABIs
# and so can be selected as 'win32'.
if [ ${tgt_os} = "win32" ]; then
asm_conversion_cmd="${source_path_mk}/build/make/ads2armasm_ms.pl"
AS_SFX=.S
msvs_arch_dir=arm-msvs
disable_feature multithread
disable_feature unit_tests
if [ ${tgt_cc##vs} -ge 12 ]; then
# MSVC 2013 doesn't allow doing plain .exe projects for ARM32,
# only "AppContainerApplication" which requires an AppxManifest.
# Therefore disable the examples, just build the library.
disable_feature examples
disable_feature tools
fi
else
# Windows 10 on ARM, on the other hand, has full Windows SDK support
# for building Win32 ARM64 applications in addition to ARM64
# Windows Store apps. It is the only 64-bit ARM ABI that
# Windows supports, so it is the default definition of 'win64'.
# ARM64 build support officially shipped in Visual Studio 15.9.0.
# Because the ARM64 Windows SDK's arm_neon.h is ARM32-specific
# while LLVM's is not, probe its validity.
if enabled neon; then
if [ -n "${CC}" ]; then
check_header arm_neon.h || check_header arm64_neon.h && \
enable_feature win_arm64_neon_h_workaround
else
# If a probe is not possible, assume this is the pure Windows
# SDK and so the workaround is necessary when using Visual
# Studio < 2019.
if [ ${tgt_cc##vs} -lt 16 ]; then
enable_feature win_arm64_neon_h_workaround
fi
fi
fi
fi
;;
rvct)
@@ -945,7 +1135,6 @@ EOF
AS=armasm
LD="${source_path}/build/make/armlink_adapter.sh"
STRIP=arm-none-linux-gnueabi-strip
NM=arm-none-linux-gnueabi-nm
tune_cflags="--cpu="
tune_asflags="--cpu="
if [ -z "${tune_cpu}" ]; then
@@ -964,7 +1153,6 @@ EOF
fi
arch_int=${tgt_isa##armv}
arch_int=${arch_int%%te}
check_add_asflags --pd "\"ARCHITECTURE SETA ${arch_int}\""
enabled debug && add_asflags -g
add_cflags --gnu
add_cflags --enum_is_int
@@ -979,109 +1167,76 @@ EOF
;;
android*)
if [ -n "${sdk_path}" ]; then
SDK_PATH=${sdk_path}
COMPILER_LOCATION=`find "${SDK_PATH}" \
-name "arm-linux-androideabi-gcc*" -print -quit`
TOOLCHAIN_PATH=${COMPILER_LOCATION%/*}/arm-linux-androideabi-
CC=${TOOLCHAIN_PATH}gcc
CXX=${TOOLCHAIN_PATH}g++
AR=${TOOLCHAIN_PATH}ar
LD=${TOOLCHAIN_PATH}gcc
AS=${TOOLCHAIN_PATH}as
STRIP=${TOOLCHAIN_PATH}strip
NM=${TOOLCHAIN_PATH}nm
if [ -z "${alt_libc}" ]; then
alt_libc=`find "${SDK_PATH}" -name arch-arm -print | \
awk '{n = split($0,a,"/"); \
split(a[n-1],b,"-"); \
print $0 " " b[2]}' | \
sort -g -k 2 | \
awk '{ print $1 }' | tail -1`
fi
if [ -d "${alt_libc}" ]; then
add_cflags "--sysroot=${alt_libc}"
add_ldflags "--sysroot=${alt_libc}"
fi
# linker flag that routes around a CPU bug in some
# Cortex-A8 implementations (NDK Dev Guide)
add_ldflags "-Wl,--fix-cortex-a8"
enable_feature pic
soft_enable realtime_only
if [ ${tgt_isa} = "armv7" ]; then
soft_enable runtime_cpu_detect
fi
if enabled runtime_cpu_detect; then
add_cflags "-I${SDK_PATH}/sources/android/cpufeatures"
fi
else
echo "Assuming standalone build with NDK toolchain."
echo "See build/make/Android.mk for details."
check_add_ldflags -static
soft_enable unit_tests
fi
;;
darwin*)
XCRUN_FIND="xcrun --sdk iphoneos --find"
CXX="$(${XCRUN_FIND} clang++)"
CC="$(${XCRUN_FIND} clang)"
AR="$(${XCRUN_FIND} ar)"
AS="$(${XCRUN_FIND} as)"
STRIP="$(${XCRUN_FIND} strip)"
NM="$(${XCRUN_FIND} nm)"
RANLIB="$(${XCRUN_FIND} ranlib)"
AS_SFX=.S
LD="${CXX:-$(${XCRUN_FIND} ld)}"
# ASFLAGS is written here instead of using check_add_asflags
# because we need to overwrite all of ASFLAGS and purge the
# options that were put in above
ASFLAGS="-arch ${tgt_isa} -g"
add_cflags -arch ${tgt_isa}
add_ldflags -arch ${tgt_isa}
alt_libc="$(show_darwin_sdk_path iphoneos)"
if [ -d "${alt_libc}" ]; then
add_cflags -isysroot ${alt_libc}
fi
if [ "${LD}" = "${CXX}" ]; then
add_ldflags -miphoneos-version-min="${IOS_VERSION_MIN}"
else
add_ldflags -ios_version_min "${IOS_VERSION_MIN}"
fi
for d in lib usr/lib usr/lib/system; do
try_dir="${alt_libc}/${d}"
[ -d "${try_dir}" ] && add_ldflags -L"${try_dir}"
done
case ${tgt_isa} in
armv7|armv7s|armv8|arm64)
if enabled neon && ! check_xcode_minimum_version; then
soft_disable neon
log_echo " neon disabled: upgrade Xcode (need v6.3+)."
if enabled neon_asm; then
soft_disable neon_asm
log_echo " neon_asm disabled: upgrade Xcode (need v6.3+)."
fi
fi
echo "Assuming standalone build with NDK toolchain."
echo "See build/make/Android.mk for details."
check_add_ldflags -static
soft_enable unit_tests
case "$AS" in
*clang)
# The GNU Assembler was removed in the r24 version of the NDK.
# clang's internal assembler works, but `-c` is necessary to
# avoid linking.
add_asflags -c
;;
esac
;;
asm_conversion_cmd="${source_path}/build/make/ads2gas_apple.pl"
darwin)
if ! enabled external_build; then
XCRUN_FIND="xcrun --sdk iphoneos --find"
CXX="$(${XCRUN_FIND} clang++)"
CC="$(${XCRUN_FIND} clang)"
AR="$(${XCRUN_FIND} ar)"
AS="$(${XCRUN_FIND} as)"
STRIP="$(${XCRUN_FIND} strip)"
AS_SFX=.S
LD="${CXX:-$(${XCRUN_FIND} ld)}"
if [ "$(show_darwin_sdk_major_version iphoneos)" -gt 8 ]; then
check_add_cflags -fembed-bitcode
check_add_asflags -fembed-bitcode
check_add_ldflags -fembed-bitcode
# ASFLAGS is written here instead of using check_add_asflags
# because we need to overwrite all of ASFLAGS and purge the
# options that were put in above
ASFLAGS="-arch ${tgt_isa} -g"
add_cflags -arch ${tgt_isa}
add_ldflags -arch ${tgt_isa}
alt_libc="$(show_darwin_sdk_path iphoneos)"
if [ -d "${alt_libc}" ]; then
add_cflags -isysroot ${alt_libc}
fi
if [ "${LD}" = "${CXX}" ]; then
add_ldflags -miphoneos-version-min="${IOS_VERSION_MIN}"
else
add_ldflags -ios_version_min "${IOS_VERSION_MIN}"
fi
for d in lib usr/lib usr/lib/system; do
try_dir="${alt_libc}/${d}"
[ -d "${try_dir}" ] && add_ldflags -L"${try_dir}"
done
case ${tgt_isa} in
armv7|armv7s|armv8|arm64)
if enabled neon && ! check_xcode_minimum_version; then
soft_disable neon
log_echo " neon disabled: upgrade Xcode (need v6.3+)."
if enabled neon_asm; then
soft_disable neon_asm
log_echo " neon_asm disabled: upgrade Xcode (need v6.3+)."
fi
fi
;;
esac
if [ "$(show_darwin_sdk_major_version iphoneos)" -gt 8 ]; then
check_add_cflags -fembed-bitcode
check_add_asflags -fembed-bitcode
check_add_ldflags -fembed-bitcode
fi
fi
asm_conversion_cmd="${source_path_mk}/build/make/ads2gas_apple.pl"
;;
linux*)
@@ -1108,6 +1263,38 @@ EOF
fi
;;
esac
# AArch64 ISA extensions are treated as supersets.
if [ ${tgt_isa} = "arm64" ] || [ ${tgt_isa} = "armv8" ]; then
aarch64_arch_flag_neon="arch=armv8-a"
aarch64_arch_flag_neon_dotprod="arch=armv8.2-a+dotprod"
aarch64_arch_flag_neon_i8mm="arch=armv8.2-a+dotprod+i8mm"
aarch64_arch_flag_sve="arch=armv8.2-a+dotprod+i8mm+sve"
aarch64_arch_flag_sve2="arch=armv9-a+sve2"
for ext in ${ARCH_EXT_LIST_AARCH64}; do
if [ "$disable_exts" = "yes" ]; then
RTCD_OPTIONS="${RTCD_OPTIONS}--disable-${ext} "
soft_disable $ext
else
# Check the compiler supports the -march flag for the extension.
# This needs to happen after toolchain/OS inspection so we handle
# $CROSS etc correctly when checking for flags, else these will
# always fail.
flag="$(eval echo \$"aarch64_arch_flag_${ext}")"
check_gcc_machine_option "${flag}" "${ext}"
if ! enabled $ext; then
# Disable higher order extensions to simplify dependencies.
disable_exts="yes"
RTCD_OPTIONS="${RTCD_OPTIONS}--disable-${ext} "
soft_disable $ext
fi
fi
done
if enabled sve; then
check_neon_sve_bridge_compiles
fi
fi
;;
mips*)
link_with_cc=gcc
@@ -1135,12 +1322,24 @@ EOF
check_add_asflags -mips64r6 -mabi=64 -mhard-float -mfp64
check_add_ldflags -mips64r6 -mabi=64 -mfp64
;;
loongson3*)
check_cflags -march=loongson3a && soft_enable mmi \
|| disable_feature mmi
check_cflags -mmsa && soft_enable msa \
|| disable_feature msa
tgt_isa=loongson3a
;;
esac
if enabled mmi || enabled msa; then
soft_enable runtime_cpu_detect
fi
if enabled msa; then
add_cflags -mmsa
add_asflags -mmsa
add_ldflags -mmsa
# TODO(libyuv:793)
# The new mips functions in libyuv do not build
# with the toolchains we currently use for testing.
soft_disable libyuv
fi
fi
@@ -1148,8 +1347,25 @@ EOF
check_add_asflags -march=${tgt_isa}
check_add_asflags -KPIC
;;
ppc64le*)
link_with_cc=gcc
setup_gnu_toolchain
# Do not enable vsx by default.
# https://bugs.chromium.org/p/webm/issues/detail?id=1522
enabled vsx || RTCD_OPTIONS="${RTCD_OPTIONS}--disable-vsx "
if [ -n "${tune_cpu}" ]; then
case ${tune_cpu} in
power?)
tune_cflags="-mcpu="
;;
esac
fi
;;
x86*)
case ${tgt_os} in
android)
soft_enable realtime_only
;;
win*)
enabled gcc && add_cflags -fno-common
;;
@@ -1196,24 +1412,12 @@ EOF
enabled optimizations && disabled gprof && check_add_cflags -fomit-frame-pointer
;;
vs*)
# When building with Microsoft Visual Studio the assembler is
# invoked directly. Checking at configure time is unnecessary.
# Skip the check by setting AS arbitrarily
AS=msvs
msvs_arch_dir=x86-msvs
vc_version=${tgt_cc##vs}
case $vc_version in
7|8|9|10)
echo "${tgt_cc} does not support avx/avx2, disabling....."
RTCD_OPTIONS="${RTCD_OPTIONS}--disable-avx --disable-avx2 "
soft_disable avx
soft_disable avx2
;;
esac
case $vc_version in
7|8|9)
echo "${tgt_cc} omits stdint.h, disabling webm-io..."
soft_disable webm_io
case ${tgt_cc##vs} in
14)
echo "${tgt_cc} does not support avx512, disabling....."
RTCD_OPTIONS="${RTCD_OPTIONS}--disable-avx512 "
soft_disable avx512
;;
esac
;;
@@ -1246,8 +1450,13 @@ EOF
elif disabled $ext; then
disable_exts="yes"
else
# use the shortened version for the flag: sse4_1 -> sse4
check_gcc_machine_option ${ext%_*} $ext
if [ "$ext" = "avx512" ]; then
check_gcc_machine_options $ext avx512f avx512cd avx512bw avx512dq avx512vl
check_gcc_avx512_compiles
else
# use the shortened version for the flag: sse4_1 -> sse4
check_gcc_machine_option ${ext%_*} $ext
fi
fi
done
@@ -1273,7 +1482,6 @@ EOF
esac
log_echo " using $AS"
fi
[ "${AS##*/}" = nasm ] && add_asflags -Ox
AS_SFX=.asm
case ${tgt_os} in
win32)
@@ -1282,7 +1490,7 @@ EOF
EXE_SFX=.exe
;;
win64)
add_asflags -f x64
add_asflags -f win64
enabled debug && add_asflags -g cv8
EXE_SFX=.exe
;;
@@ -1309,7 +1517,8 @@ EOF
add_cflags ${sim_arch}
add_ldflags ${sim_arch}
if [ "$(show_darwin_sdk_major_version iphonesimulator)" -gt 8 ]; then
if [ "$(disabled external_build)" ] &&
[ "$(show_darwin_sdk_major_version iphonesimulator)" -gt 8 ]; then
# yasm v1.3.0 doesn't know what -fembed-bitcode means, so turning it
# on is pointless (unless building a C-only lib). Warn the user, but
# do nothing here.
@@ -1326,6 +1535,15 @@ EOF
;;
esac
;;
loongarch*)
link_with_cc=gcc
setup_gnu_toolchain
enabled lsx && check_inline_asm lsx '"vadd.b $vr0, $vr1, $vr1"'
enabled lsx && soft_enable runtime_cpu_detect
enabled lasx && check_inline_asm lasx '"xvadd.b $xr0, $xr1, $xr1"'
enabled lasx && soft_enable runtime_cpu_detect
;;
*-gcc|generic-gnu)
link_with_cc=gcc
enable_feature gcc
@@ -1333,6 +1551,14 @@ EOF
;;
esac
# Enable PGO
if [ -n "${pgo_file}" ]; then
check_add_cflags -fprofile-use=${pgo_file} || \
die "-fprofile-use is not supported by compiler"
check_add_ldflags -fprofile-use=${pgo_file} || \
die "-fprofile-use is not supported by linker"
fi
# Try to enable CPU specific tuning
if [ -n "${tune_cpu}" ]; then
if [ -n "${tune_cflags}" ]; then
@@ -1353,6 +1579,9 @@ EOF
else
check_add_cflags -DNDEBUG
fi
enabled profile &&
check_add_cflags -fprofile-generate &&
check_add_ldflags -fprofile-generate
enabled gprof && check_add_cflags -pg && check_add_ldflags -pg
enabled gcov &&
@@ -1387,7 +1616,7 @@ EOF
# Try to find which inline keywords are supported
check_cc <<EOF && INLINE="inline"
static inline function() {}
static inline int function(void) {}
EOF
# Almost every platform uses pthreads.
@@ -1399,7 +1628,11 @@ EOF
# bionic includes basic pthread functionality, obviating -lpthread.
;;
*)
check_header pthread.h && add_extralibs -lpthread
check_header pthread.h && check_lib -lpthread <<EOF && add_extralibs -lpthread || disable_feature pthread_h
#include <pthread.h>
#include <stddef.h>
int main(void) { return pthread_create(NULL, NULL, NULL, NULL); }
EOF
;;
esac
fi
@@ -1416,6 +1649,26 @@ EOF
echo "msa optimizations are available only for little endian platforms"
disable_feature msa
fi
if enabled mmi; then
echo "mmi optimizations are available only for little endian platforms"
disable_feature mmi
fi
fi
;;
esac
# only for LOONGARCH platforms
case ${toolchain} in
loongarch*)
if enabled big_endian; then
if enabled lsx; then
echo "lsx optimizations are available only for little endian platforms"
disable_feature lsx
fi
if enabled lasx; then
echo "lasx optimizations are available only for little endian platforms"
disable_feature lasx
fi
fi
;;
esac
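Review note: the AArch64 hunk above treats the extensions as an ordered chain, so the first extension that fails its compiler probe also disables every later one. The following is a minimal sketch of that cascade only. `is_supported`, `filter_exts` and `SUPPORTED` are hypothetical stand-ins for `check_gcc_machine_option` and the configure state, not part of the script.

```shell
#!/bin/sh
# Hypothetical stand-in for check_gcc_machine_option: an extension counts as
# supported if it appears in the space-separated list $SUPPORTED.
is_supported() {
  case " ${SUPPORTED} " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints the extensions that stay enabled, in order. Once one extension
# fails the probe, every later (higher-order) extension is dropped too.
filter_exts() {
  disable_exts="no"
  enabled_exts=""
  for ext in neon neon_dotprod neon_i8mm sve sve2; do
    if [ "${disable_exts}" = "yes" ]; then
      continue
    fi
    if is_supported "${ext}"; then
      enabled_exts="${enabled_exts}${enabled_exts:+ }${ext}"
    else
      disable_exts="yes"
    fi
  done
  echo "${enabled_exts}"
}

# The assumed compiler lacks i8mm, so sve and sve2 are dropped as well.
SUPPORTED="neon neon_dotprod sve sve2"
filter_exts
```

With the sample `SUPPORTED` list this prints `neon neon_dotprod`, which is the dependency simplification that the "Disable higher order extensions" comment in the hunk describes.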
+1 -1
@@ -42,7 +42,7 @@ done
[ -n "$srcfile" ] || show_help
sfx=${sfx:-asm}
includes=$(LC_ALL=C egrep -i "include +\"?[a-z0-9_/]+\.${sfx}" $srcfile |
includes=$(LC_ALL=C grep -E -i "include +\"?[a-z0-9_/]+\.${sfx}" $srcfile |
perl -p -e "s;.*?([a-z0-9_/]+.${sfx}).*;\1;")
#" restore editor state
for inc in ${includes}; do
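Review note: this hunk swaps `egrep` for `grep -E`, which accepts the same extended regular expression. A small sketch of the matching step on its own, assuming a hypothetical helper name `list_include_lines`; the Perl post-processing from the hunk is left out.

```shell
#!/bin/sh
# list_include_lines SRCFILE [SUFFIX]
# Prints the lines of SRCFILE that include a file ending in .SUFFIX
# (default: asm), using the same pattern as the hunk above.
list_include_lines() {
  sfx="${2:-asm}"
  LC_ALL=C grep -E -i "include +\"?[a-z0-9_/]+\.${sfx}" "$1"
}
```

Calling `list_include_lines some_file.asm` would print only the include lines of that file; the file name here is a placeholder.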
+15 -20
@@ -25,7 +25,7 @@ files.
Options:
--help Print this message
--out=outfile Redirect output to a file
--ver=version Version (7,8,9,10,11,12,14) of visual studio to generate for
--ver=version Version (14-17) of visual studio to generate for
--target=isa-os-cc Target specifier
EOF
exit 1
@@ -213,13 +213,15 @@ for opt in "$@"; do
;;
--dep=*) eval "${optval%%:*}_deps=\"\${${optval%%:*}_deps} ${optval##*:}\""
;;
--ver=*) vs_ver="$optval"
case $optval in
10|11|12|14)
;;
*) die Unrecognized Visual Studio Version in $opt
;;
esac
--ver=*)
vs_ver="$optval"
case $optval in
14) vs_year=2015 ;;
15) vs_year=2017 ;;
16) vs_year=2019 ;;
17) vs_year=2022 ;;
*) die Unrecognized Visual Studio Version in $opt ;;
esac
;;
--target=*) target="${optval}"
;;
@@ -230,18 +232,11 @@ for opt in "$@"; do
done
outfile=${outfile:-/dev/stdout}
mkoutfile=${mkoutfile:-/dev/stdout}
case "${vs_ver:-10}" in
10) sln_vers="11.00"
sln_vers_str="Visual Studio 2010"
;;
11) sln_vers="12.00"
sln_vers_str="Visual Studio 2012"
;;
12) sln_vers="12.00"
sln_vers_str="Visual Studio 2013"
;;
14) sln_vers="14.00"
sln_vers_str="Visual Studio 2015"
case "${vs_ver}" in
1[4-7])
# VS has used Format Version 12.00 continuously since vs11.
sln_vers="12.00"
sln_vers_str="Visual Studio ${vs_year}"
;;
esac
sfx=vcxproj
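Review note: the `--ver` handling above maps each accepted Visual Studio version number to a release year, which then feeds `sln_vers_str`. A sketch of that lookup as a standalone function; the name `vs_year_for` is hypothetical and only the four pairs shown in the hunk are assumed.

```shell
#!/bin/sh
# Maps a Visual Studio version number to its release year, using the
# pairs from the --ver case statement. Fails on anything else.
vs_year_for() {
  case "$1" in
    14) echo 2015 ;;
    15) echo 2017 ;;
    16) echo 2019 ;;
    17) echo 2022 ;;
    *) echo "Unrecognized Visual Studio version: $1" >&2
       return 1 ;;
  esac
}

echo "Visual Studio $(vs_year_for 16)"
```

The last line prints `Visual Studio 2019`, matching the string the script would build for `--ver=16`.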
+53 -35
@@ -34,7 +34,7 @@ Options:
--name=project_name Name of the project (required)
--proj-guid=GUID GUID to use for the project
--module-def=filename File containing export definitions (for DLLs)
--ver=version Version (10,11,12,14) of visual studio to generate for
--ver=version Version (14-16) of visual studio to generate for
--src-path-bare=dir Path to root of source tree
-Ipath/to/include Additional include directories
-DFLAG[=value] Preprocessor macros to define
@@ -82,7 +82,7 @@ generate_filter() {
| sed -e "s,$src_path_bare,," \
-e 's/^[\./]\+//g' -e 's,[:/ ],_,g')
if ([ "$pat" == "asm" ] || [ "$pat" == "s" ] || [ "$pat" == "S" ]) && $asm_use_custom_step; then
if ([ "$pat" == "asm" ] || [ "$pat" == "s" ] || [ "$pat" == "S" ]) && $uses_asm; then
# Avoid object file name collisions, i.e. vpx_config.c and
# vpx_config.asm produce the same object file without
# this additional suffix.
@@ -141,7 +141,17 @@ for opt in "$@"; do
case "$opt" in
--help|-h) show_help
;;
--target=*) target="${optval}"
--target=*)
target="${optval}"
platform_toolset=$(echo ${target} | awk 'BEGIN{FS="-"}{print $4}')
case "$platform_toolset" in
clangcl) platform_toolset="ClangCl"
;;
"")
;;
*) die Unrecognized Visual Studio Platform Toolset in $opt
;;
esac
;;
--out=*) outfile="$optval"
;;
@@ -157,6 +167,8 @@ for opt in "$@"; do
;;
--lib) proj_kind="lib"
;;
--as=*) as="${optval}"
;;
--src-path-bare=*)
src_path_bare=$(fix_path "$optval")
src_path_bare=${src_path_bare%/}
@@ -168,7 +180,7 @@ for opt in "$@"; do
--ver=*)
vs_ver="$optval"
case "$optval" in
10|11|12|14)
1[4-7])
;;
*) die Unrecognized Visual Studio Version in $opt
;;
@@ -215,13 +227,7 @@ fix_file_list file_list
outfile=${outfile:-/dev/stdout}
guid=${guid:-`generate_uuid`}
asm_use_custom_step=false
uses_asm=${uses_asm:-false}
case "${vs_ver:-11}" in
10|11|12|14)
asm_use_custom_step=$uses_asm
;;
esac
[ -n "$name" ] || die "Project name (--name) must be specified!"
[ -n "$target" ] || die "Target (--target) must be specified!"
@@ -253,13 +259,22 @@ libs=${libs// /;}
case "$target" in
x86_64*)
platforms[0]="x64"
asm_Debug_cmdline="yasm -Xvc -g cv8 -f win64 ${yasmincs} &quot;%(FullPath)&quot;"
asm_Release_cmdline="yasm -Xvc -f win64 ${yasmincs} &quot;%(FullPath)&quot;"
asm_Debug_cmdline="${as} -Xvc -gcv8 -f win64 ${yasmincs} &quot;%(FullPath)&quot;"
asm_Release_cmdline="${as} -Xvc -f win64 ${yasmincs} &quot;%(FullPath)&quot;"
;;
x86*)
platforms[0]="Win32"
asm_Debug_cmdline="yasm -Xvc -g cv8 -f win32 ${yasmincs} &quot;%(FullPath)&quot;"
asm_Release_cmdline="yasm -Xvc -f win32 ${yasmincs} &quot;%(FullPath)&quot;"
asm_Debug_cmdline="${as} -Xvc -gcv8 -f win32 ${yasmincs} &quot;%(FullPath)&quot;"
asm_Release_cmdline="${as} -Xvc -f win32 ${yasmincs} &quot;%(FullPath)&quot;"
;;
arm64*)
platforms[0]="ARM64"
# As of Visual Studio 2022 17.5.5, clang-cl does not support ARM64EC.
if [ "$vs_ver" -ge 17 -a "$platform_toolset" != "ClangCl" ]; then
platforms[1]="ARM64EC"
fi
asm_Debug_cmdline="armasm64 -nologo -oldit &quot;%(FullPath)&quot;"
asm_Release_cmdline="armasm64 -nologo -oldit &quot;%(FullPath)&quot;"
;;
arm*)
platforms[0]="ARM"
@@ -307,6 +322,16 @@ generate_vcxproj() {
tag_content ApplicationType "Windows Store"
tag_content ApplicationTypeRevision 8.1
fi
if [ "${platforms[0]}" = "ARM64" ]; then
# Require the first Visual Studio version to have ARM64 support.
tag_content MinimumVisualStudioVersion 15.9
fi
if [ $vs_ver -eq 15 ] && [ "${platforms[0]}" = "ARM64" ]; then
# Since VS 15 does not have a 'use latest SDK version' facility,
# specifically require the contemporaneous SDK with official ARM64
# support.
tag_content WindowsTargetPlatformVersion 10.0.17763.0
fi
close_tag PropertyGroup
tag Import \
@@ -324,28 +349,21 @@ generate_vcxproj() {
else
tag_content ConfigurationType StaticLibrary
fi
if [ "$vs_ver" = "11" ]; then
if [ "$plat" = "ARM" ]; then
# Setting the wp80 toolchain automatically sets the
# WINAPI_FAMILY define, which is required for building
# code for arm with the windows headers. Alternatively,
# one could add AppContainerApplication=true in the Globals
# section and add PrecompiledHeader=NotUsing and
# CompileAsWinRT=false in ClCompile and SubSystem=Console
# in Link.
tag_content PlatformToolset v110_wp80
else
tag_content PlatformToolset v110
if [ -n "$platform_toolset" ]; then
tag_content PlatformToolset "$platform_toolset"
else
if [ "$vs_ver" = "14" ]; then
tag_content PlatformToolset v140
fi
if [ "$vs_ver" = "15" ]; then
tag_content PlatformToolset v141
fi
if [ "$vs_ver" = "16" ]; then
tag_content PlatformToolset v142
fi
if [ "$vs_ver" = "17" ]; then
tag_content PlatformToolset v143
fi
fi
if [ "$vs_ver" = "12" ]; then
# Setting a PlatformToolset indicating windows phone isn't
# enough to build code for arm with MSVC 2013, one strictly
# has to enable AppContainerApplication as well.
tag_content PlatformToolset v120
fi
if [ "$vs_ver" = "14" ]; then
tag_content PlatformToolset v140
fi
tag_content CharacterSet Unicode
if [ "$config" = "Release" ]; then
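Review note: in this file the toolset comes from an optional fourth field of the target triple, and only when that field is absent does the Visual Studio version pick a default. A sketch that combines both steps, assuming a hypothetical function name `platform_toolset_for`; the version-to-toolset pairs are the ones in the hunk.

```shell
#!/bin/sh
# platform_toolset_for TARGET VS_VER
# An explicit "-clangcl" suffix on the target wins; otherwise the
# toolset follows the Visual Studio version.
platform_toolset_for() {
  toolset=$(echo "$1" | awk 'BEGIN{FS="-"}{print $4}')
  case "${toolset}" in
    clangcl) echo "ClangCl"; return 0 ;;
    "") ;;
    *) echo "Unrecognized Visual Studio Platform Toolset: ${toolset}" >&2
       return 1 ;;
  esac
  case "$2" in
    14) echo "v140" ;;
    15) echo "v141" ;;
    16) echo "v142" ;;
    17) echo "v143" ;;
    *) echo "Unrecognized Visual Studio Version: $2" >&2
       return 1 ;;
  esac
}

platform_toolset_for arm64-win64-vs17-clangcl 17
platform_toolset_for x86_64-win64-vs16 16
```

The two calls print `ClangCl` and `v142`. Splitting on `-` is safe for `x86_64` targets because that ISA name contains an underscore, not a hyphen.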
+7 -6
@@ -35,8 +35,8 @@ ARM_TARGETS="arm64-darwin-gcc
armv7s-darwin-gcc"
SIM_TARGETS="x86-iphonesimulator-gcc
x86_64-iphonesimulator-gcc"
OSX_TARGETS="x86-darwin15-gcc
x86_64-darwin15-gcc"
OSX_TARGETS="x86-darwin16-gcc
x86_64-darwin16-gcc"
TARGETS="${ARM_TARGETS} ${SIM_TARGETS}"
# Configures for the target specified by $1, and invokes make with the dist
@@ -132,7 +132,8 @@ create_vpx_framework_config_shim() {
done
# Consume the last line of output from the loop: We don't want it.
sed -i '' -e '$d' "${config_file}"
sed -i.bak -e '$d' "${config_file}"
rm "${config_file}.bak"
printf "#endif\n\n" >> "${config_file}"
printf "#endif // ${include_guard}" >> "${config_file}"
@@ -244,7 +245,7 @@ build_framework() {
# Trap function. Cleans up the subtree used to build all targets contained in
# $TARGETS.
cleanup() {
local readonly res=$?
local res=$?
cd "${ORIG_PWD}"
if [ $res -ne 0 ]; then
@@ -271,7 +272,7 @@ cat << EOF
--help: Display this message and exit.
--enable-shared: Build a dynamic framework for use on iOS 8 or later.
--extra-configure-args <args>: Extra args to pass when configuring libvpx.
--macosx: Uses darwin15 targets instead of iphonesimulator targets for x86
--macosx: Uses darwin16 targets instead of iphonesimulator targets for x86
and x86_64. Allows linking to framework when builds target MacOSX
instead of iOS.
--preserve-build-output: Do not delete the build directory.
@@ -350,7 +351,7 @@ if [ "$ENABLE_SHARED" = "yes" ]; then
IOS_VERSION_MIN="8.0"
else
IOS_VERSION_OPTIONS=""
IOS_VERSION_MIN="6.0"
IOS_VERSION_MIN="7.0"
fi
if [ "${VERBOSE}" = "yes" ]; then
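Review note: the `sed -i '' -e '$d'` form only works with BSD sed, while `sed -i.bak` followed by removing the backup is accepted by both BSD and GNU sed. A sketch of the replacement as a helper; the name `drop_last_line` is hypothetical.

```shell
#!/bin/sh
# drop_last_line FILE
# Deletes the final line of FILE in place. The attached backup suffix
# keeps the command portable; the backup is removed afterwards.
drop_last_line() {
  sed -i.bak -e '$d' "$1" && rm -f "$1.bak"
}
```

In the script this corresponds to trimming the trailing line that the preceding loop leaves in `${config_file}`.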
+11 -1
@@ -9,7 +9,8 @@
## be found in the AUTHORS file in the root of the source tree.
##
if [ "$(uname -o 2>/dev/null)" = "Cygwin" ] \
shell_name="$(uname -o 2>/dev/null)"
if [[ "$shell_name" = "Cygwin" || "$shell_name" = "Msys" ]] \
&& cygpath --help >/dev/null 2>&1; then
FIXPATH='cygpath -m'
else
@@ -41,6 +42,15 @@ fix_path() {
# Corrects the paths in file_list in one pass for efficiency.
# $1 is the name of the array to be modified.
fix_file_list() {
if [ "${FIXPATH}" = "echo_path" ] ; then
# When used with echo_path, fix_file_list is a no-op. Avoid warning about
# unsupported 'declare -n' when it is not important.
return 0
elif [ "${BASH_VERSINFO}" -lt 4 ] ; then
echo "Cygwin path conversion has failed. Please use a version of bash"
echo "which supports nameref (-n), introduced in bash 4.3"
return 1
fi
declare -n array_ref=$1
files=$(fix_path "${array_ref[@]}")
local IFS=$'\n'
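Review note: the guard above compares only the major bash version against 4, although the message names bash 4.3 as the release that introduced `declare -n`. A stricter check might also look at the minor version. This is a sketch under that assumption, not what the script does; `supports_nameref` is a hypothetical helper that takes the two version components as arguments.

```shell
#!/bin/sh
# supports_nameref MAJOR MINOR
# Succeeds when the given bash version is 4.3 or newer, the first
# release with 'declare -n'.
supports_nameref() {
  [ "$1" -gt 4 ] || { [ "$1" -eq 4 ] && [ "$2" -ge 3 ]; }
}

if supports_nameref 4 2; then
  echo "nameref available"
else
  echo "nameref unavailable"
fi
```

Inside bash the arguments would come from the first two elements of `BASH_VERSINFO`; the literal `4 2` in the example prints `nameref unavailable`.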
+133 -9
@@ -1,4 +1,13 @@
#!/usr/bin/env perl
##
## Copyright (c) 2017 The WebM project authors. All Rights Reserved.
##
## Use of this source code is governed by a BSD-style license
## that can be found in the LICENSE file in the root of the source
## tree. An additional intellectual property rights grant can be found
## in the file PATENTS. All contributing project authors may
## be found in the AUTHORS file in the root of the source tree.
##
no strict 'refs';
use warnings;
@@ -64,6 +73,10 @@ sub vpx_config($) {
}
sub specialize {
if (@_ <= 1) {
die "'specialize' must be called with a function name and at least one ",
"architecture ('C' is implied): \n@_\n";
}
my $fn=$_[0];
shift;
foreach my $opt (@_) {
@@ -199,7 +212,20 @@ sub filter {
#
sub common_top() {
my $include_guard = uc($opts{sym})."_H_";
my @time = localtime;
my $year = $time[5] + 1900;
print <<EOF;
/*
* Copyright (c) ${year} The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef ${include_guard}
#define ${include_guard}
@@ -228,13 +254,14 @@ EOF
}
sub common_bottom() {
my $include_guard = uc($opts{sym})."_H_";
print <<EOF;
#ifdef __cplusplus
} // extern "C"
#endif
#endif
#endif // ${include_guard}
EOF
}
@@ -305,14 +332,26 @@ EOF
sub mips() {
determine_indirection("c", @ALL_ARCHS);
# Assign the helper variable for each enabled extension
foreach my $opt (@ALL_ARCHS) {
my $opt_uc = uc $opt;
eval "\$have_${opt}=\"flags & HAS_${opt_uc}\"";
}
common_top;
print <<EOF;
#include "vpx_config.h"
#ifdef RTCD_C
#include "vpx_ports/mips.h"
static void setup_rtcd_internal(void)
{
int flags = mips_cpu_caps();
(void)flags;
EOF
set_function_pointers("c", @ALL_ARCHS);
@@ -335,6 +374,67 @@ EOF
common_bottom;
}
sub ppc() {
determine_indirection("c", @ALL_ARCHS);
# Assign the helper variable for each enabled extension
foreach my $opt (@ALL_ARCHS) {
my $opt_uc = uc $opt;
eval "\$have_${opt}=\"flags & HAS_${opt_uc}\"";
}
common_top;
print <<EOF;
#include "vpx_config.h"
#ifdef RTCD_C
#include "vpx_ports/ppc.h"
static void setup_rtcd_internal(void)
{
int flags = ppc_simd_caps();
(void)flags;
EOF
set_function_pointers("c", @ALL_ARCHS);
print <<EOF;
}
#endif
EOF
common_bottom;
}
sub loongarch() {
determine_indirection("c", @ALL_ARCHS);
# Assign the helper variable for each enabled extension
foreach my $opt (@ALL_ARCHS) {
my $opt_uc = uc $opt;
eval "\$have_${opt}=\"flags & HAS_${opt_uc}\"";
}
common_top;
print <<EOF;
#include "vpx_config.h"
#ifdef RTCD_C
#include "vpx_ports/loongarch.h"
static void setup_rtcd_internal(void)
{
int flags = loongarch_cpu_caps();
(void)flags;
EOF
set_function_pointers("c", @ALL_ARCHS);
print <<EOF;
}
#endif
EOF
common_bottom;
}
sub unoptimized() {
determine_indirection "c";
common_top;
@@ -360,36 +460,60 @@ EOF
#
&require("c");
&require(keys %required);
if ($opts{arch} eq 'x86') {
@ALL_ARCHS = filter(qw/mmx sse sse2 sse3 ssse3 sse4_1 avx avx2/);
@ALL_ARCHS = filter(qw/mmx sse sse2 sse3 ssse3 sse4_1 avx avx2 avx512/);
x86;
} elsif ($opts{arch} eq 'x86_64') {
@ALL_ARCHS = filter(qw/mmx sse sse2 sse3 ssse3 sse4_1 avx avx2/);
@REQUIRES = filter(keys %required ? keys %required : qw/mmx sse sse2/);
@ALL_ARCHS = filter(qw/mmx sse sse2 sse3 ssse3 sse4_1 avx avx2 avx512/);
@REQUIRES = filter(qw/mmx sse sse2/);
&require(@REQUIRES);
x86;
} elsif ($opts{arch} eq 'mips32' || $opts{arch} eq 'mips64') {
my $have_dspr2 = 0;
my $have_msa = 0;
my $have_mmi = 0;
@ALL_ARCHS = filter("$opts{arch}");
open CONFIG_FILE, $opts{config} or
die "Error opening config file '$opts{config}': $!\n";
while (<CONFIG_FILE>) {
if (/HAVE_DSPR2=yes/) {
@ALL_ARCHS = filter("$opts{arch}", qw/dspr2/);
last;
$have_dspr2 = 1;
}
if (/HAVE_MSA=yes/) {
@ALL_ARCHS = filter("$opts{arch}", qw/msa/);
last;
$have_msa = 1;
}
if (/HAVE_MMI=yes/) {
$have_mmi = 1;
}
}
close CONFIG_FILE;
if ($have_dspr2 == 1) {
@ALL_ARCHS = filter("$opts{arch}", qw/dspr2/);
} elsif ($have_msa == 1 && $have_mmi == 1) {
@ALL_ARCHS = filter("$opts{arch}", qw/mmi msa/);
} elsif ($have_msa == 1) {
@ALL_ARCHS = filter("$opts{arch}", qw/msa/);
} elsif ($have_mmi == 1) {
@ALL_ARCHS = filter("$opts{arch}", qw/mmi/);
} else {
unoptimized;
}
mips;
} elsif ($opts{arch} =~ /armv7\w?/) {
@ALL_ARCHS = filter(qw/neon_asm neon/);
arm;
} elsif ($opts{arch} eq 'armv8' || $opts{arch} eq 'arm64' ) {
@ALL_ARCHS = filter(qw/neon/);
@ALL_ARCHS = filter(qw/neon neon_dotprod neon_i8mm sve sve2/);
@REQUIRES = filter(qw/neon/);
&require(@REQUIRES);
arm;
} elsif ($opts{arch} =~ /^ppc/ ) {
@ALL_ARCHS = filter(qw/vsx/);
ppc;
} elsif ($opts{arch} =~ /loongarch/ ) {
@ALL_ARCHS = filter(qw/lsx lasx/);
loongarch;
} else {
unoptimized;
}
+1 -11
@@ -11,11 +11,8 @@
package thumb;
sub FixThumbInstructions($$)
sub FixThumbInstructions($)
{
my $short_branches = $_[1];
my $branch_shift_offset = $short_branches ? 1 : 0;
# Write additions with shifts, such as "add r10, r11, lsl #8",
# in three operand form, "add r10, r10, r11, lsl #8".
s/(add\s+)(r\d+),\s*(r\d+),\s*(lsl #\d+)/$1$2, $2, $3, $4/g;
@@ -54,13 +51,6 @@ sub FixThumbInstructions($$)
# "addne r0, r0, r2".
s/^(\s*)((ldr|str)(ne)?[bhd]?)(\s+)(\w+),(\s*\w+,)?\s*\[(\w+)\],\s*(\w+)/$1$2$5$6,$7 [$8]\n$1add$4$5$8, $8, $9/g;
# Convert a conditional addition to the pc register into a series of
# instructions. This converts "addlt pc, pc, r3, lsl #2" into
# "itttt lt", "movlt.n r12, pc", "addlt.w r12, #12",
# "addlt.w r12, r12, r3, lsl #2", "movlt.n pc, r12".
# This assumes that r12 is free at this point.
s/^(\s*)addlt(\s+)pc,\s*pc,\s*(\w+),\s*lsl\s*#(\d+)/$1itttt$2lt\n$1movlt.n$2r12, pc\n$1addlt.w$2r12, #12\n$1addlt.w$2r12, r12, $3, lsl #($4-$branch_shift_offset)\n$1movlt.n$2pc, r12/g;
# Convert "mov pc, lr" into "bx lr", since the former only works
# for switching from arm to thumb (and only in armv7), but not
# from thumb to arm.
+4
@@ -60,6 +60,9 @@ if [ ${bare} ]; then
echo "${changelog_version}${git_version_id}" > $$.tmp
else
cat<<EOF>$$.tmp
// This file is generated. Do not edit.
#ifndef VPX_VERSION_H_
#define VPX_VERSION_H_
#define VERSION_MAJOR $major_version
#define VERSION_MINOR $minor_version
#define VERSION_PATCH $patch_version
@@ -67,6 +70,7 @@ else
#define VERSION_PACKED ((VERSION_MAJOR<<16)|(VERSION_MINOR<<8)|(VERSION_PATCH))
#define ${id}_NOSP "${version_str}"
#define ${id} " ${version_str}"
#endif // VPX_VERSION_H_
EOF
fi
if [ -n "$out_file" ]; then
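Review note: `VERSION_PACKED` in the generated header places the major, minor and patch numbers in separate bytes of one integer. The same arithmetic can be checked in the shell; `pack_version` is a hypothetical helper and the version 1.15.1 is only an example value.

```shell
#!/bin/sh
# pack_version MAJOR MINOR PATCH
# Mirrors ((VERSION_MAJOR<<16)|(VERSION_MINOR<<8)|(VERSION_PATCH)).
pack_version() {
  echo $(( ($1 << 16) | ($2 << 8) | $3 ))
}

pack_version 1 15 1
```

For 1.15.1 this prints `69377` (65536 + 3840 + 1), so packed values compare in the same order as the versions as long as each component stays below 256.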
File diff suppressed because one or more lines are too long
+2 -3
@@ -1,5 +1,4 @@
# This file is used by gcl to get repository specific information.
GERRIT_HOST: chromium-review.googlesource.com
GERRIT_PORT: 29418
# This file is used by git cl to get repository specific information.
GERRIT_HOST: True
CODE_REVIEW_SERVER: chromium-review.googlesource.com
GERRIT_SQUASH_UPLOADS: False
+151 -40
@@ -31,7 +31,6 @@ Advanced options:
--libc=PATH path to alternate libc
--size-limit=WxH max size to allow in the decoder
--as={yasm|nasm|auto} use specified assembler [auto, yasm preferred]
--sdk-path=PATH path to root of sdk (android builds only)
${toggle_codec_srcs} in/exclude codec library source code
${toggle_debug_libs} in/exclude debug version of libraries
${toggle_static_msvcrt} use static MSVCRT (VS builds only)
@@ -100,19 +99,35 @@ EOF
# alphabetically by architecture, generic-gnu last.
all_platforms="${all_platforms} arm64-android-gcc"
all_platforms="${all_platforms} arm64-darwin-gcc"
all_platforms="${all_platforms} arm64-darwin20-gcc"
all_platforms="${all_platforms} arm64-darwin21-gcc"
all_platforms="${all_platforms} arm64-darwin22-gcc"
all_platforms="${all_platforms} arm64-darwin23-gcc"
all_platforms="${all_platforms} arm64-darwin24-gcc"
all_platforms="${all_platforms} arm64-linux-gcc"
all_platforms="${all_platforms} arm64-win64-gcc"
all_platforms="${all_platforms} arm64-win64-vs15"
all_platforms="${all_platforms} arm64-win64-vs16"
all_platforms="${all_platforms} arm64-win64-vs16-clangcl"
all_platforms="${all_platforms} arm64-win64-vs17"
all_platforms="${all_platforms} arm64-win64-vs17-clangcl"
all_platforms="${all_platforms} armv7-android-gcc" #neon Cortex-A8
all_platforms="${all_platforms} armv7-darwin-gcc" #neon Cortex-A8
all_platforms="${all_platforms} armv7-linux-rvct" #neon Cortex-A8
all_platforms="${all_platforms} armv7-linux-gcc" #neon Cortex-A8
all_platforms="${all_platforms} armv7-none-rvct" #neon Cortex-A8
all_platforms="${all_platforms} armv7-win32-vs11"
all_platforms="${all_platforms} armv7-win32-vs12"
all_platforms="${all_platforms} armv7-win32-gcc"
all_platforms="${all_platforms} armv7-win32-vs14"
all_platforms="${all_platforms} armv7-win32-vs15"
all_platforms="${all_platforms} armv7-win32-vs16"
all_platforms="${all_platforms} armv7-win32-vs17"
all_platforms="${all_platforms} armv7s-darwin-gcc"
all_platforms="${all_platforms} armv8-linux-gcc"
all_platforms="${all_platforms} loongarch32-linux-gcc"
all_platforms="${all_platforms} loongarch64-linux-gcc"
all_platforms="${all_platforms} mips32-linux-gcc"
all_platforms="${all_platforms} mips64-linux-gcc"
all_platforms="${all_platforms} ppc64le-linux-gcc"
all_platforms="${all_platforms} sparc-solaris-gcc"
all_platforms="${all_platforms} x86-android-gcc"
all_platforms="${all_platforms} x86-darwin8-gcc"
@@ -125,16 +140,18 @@ all_platforms="${all_platforms} x86-darwin12-gcc"
all_platforms="${all_platforms} x86-darwin13-gcc"
all_platforms="${all_platforms} x86-darwin14-gcc"
all_platforms="${all_platforms} x86-darwin15-gcc"
all_platforms="${all_platforms} x86-darwin16-gcc"
all_platforms="${all_platforms} x86-darwin17-gcc"
all_platforms="${all_platforms} x86-iphonesimulator-gcc"
all_platforms="${all_platforms} x86-linux-gcc"
all_platforms="${all_platforms} x86-linux-icc"
all_platforms="${all_platforms} x86-os2-gcc"
all_platforms="${all_platforms} x86-solaris-gcc"
all_platforms="${all_platforms} x86-win32-gcc"
all_platforms="${all_platforms} x86-win32-vs10"
all_platforms="${all_platforms} x86-win32-vs11"
all_platforms="${all_platforms} x86-win32-vs12"
all_platforms="${all_platforms} x86-win32-vs14"
all_platforms="${all_platforms} x86-win32-vs15"
all_platforms="${all_platforms} x86-win32-vs16"
all_platforms="${all_platforms} x86-win32-vs17"
all_platforms="${all_platforms} x86_64-android-gcc"
all_platforms="${all_platforms} x86_64-darwin9-gcc"
all_platforms="${all_platforms} x86_64-darwin10-gcc"
@@ -143,15 +160,24 @@ all_platforms="${all_platforms} x86_64-darwin12-gcc"
all_platforms="${all_platforms} x86_64-darwin13-gcc"
all_platforms="${all_platforms} x86_64-darwin14-gcc"
all_platforms="${all_platforms} x86_64-darwin15-gcc"
all_platforms="${all_platforms} x86_64-darwin16-gcc"
all_platforms="${all_platforms} x86_64-darwin17-gcc"
all_platforms="${all_platforms} x86_64-darwin18-gcc"
all_platforms="${all_platforms} x86_64-darwin19-gcc"
all_platforms="${all_platforms} x86_64-darwin20-gcc"
all_platforms="${all_platforms} x86_64-darwin21-gcc"
all_platforms="${all_platforms} x86_64-darwin22-gcc"
all_platforms="${all_platforms} x86_64-darwin23-gcc"
all_platforms="${all_platforms} x86_64-darwin24-gcc"
all_platforms="${all_platforms} x86_64-iphonesimulator-gcc"
all_platforms="${all_platforms} x86_64-linux-gcc"
all_platforms="${all_platforms} x86_64-linux-icc"
all_platforms="${all_platforms} x86_64-solaris-gcc"
all_platforms="${all_platforms} x86_64-win64-gcc"
all_platforms="${all_platforms} x86_64-win64-vs10"
all_platforms="${all_platforms} x86_64-win64-vs11"
all_platforms="${all_platforms} x86_64-win64-vs12"
all_platforms="${all_platforms} x86_64-win64-vs14"
all_platforms="${all_platforms} x86_64-win64-vs15"
all_platforms="${all_platforms} x86_64-win64-vs16"
all_platforms="${all_platforms} x86_64-win64-vs17"
all_platforms="${all_platforms} generic-gnu"
# all_targets is a list of all targets that can be configured
@@ -163,11 +189,14 @@ for t in ${all_targets}; do
[ -f "${source_path}/${t}.mk" ] && enable_feature ${t}
done
if ! diff --version >/dev/null; then
die "diff missing: Try installing diffutils via your package manager."
fi
if ! perl --version >/dev/null; then
die "Perl is required to build"
fi
if [ "`cd \"${source_path}\" && pwd`" != "`pwd`" ]; then
# test to see if source_path already configured
if [ -f "${source_path}/vpx_config.h" ]; then
@@ -220,10 +249,22 @@ CODEC_FAMILIES="
ARCH_LIST="
arm
aarch64
mips
x86
x86_64
ppc
loongarch
"
ARCH_EXT_LIST_AARCH64="
neon
neon_dotprod
neon_i8mm
sve
sve2
"
ARCH_EXT_LIST_X86="
mmx
sse
@@ -233,10 +274,18 @@ ARCH_EXT_LIST_X86="
sse4_1
avx
avx2
avx512
"
ARCH_EXT_LIST_LOONGSON="
mmi
lsx
lasx
"
ARCH_EXT_LIST="
neon
neon_asm
${ARCH_EXT_LIST_AARCH64}
mips32
dspr2
@@ -244,6 +293,10 @@ ARCH_EXT_LIST="
mips64
${ARCH_EXT_LIST_X86}
vsx
${ARCH_EXT_LIST_LOONGSON}
"
HAVE_LIST="
${ARCH_EXT_LIST}
@@ -252,10 +305,11 @@ HAVE_LIST="
unistd_h
"
EXPERIMENT_LIST="
spatial_svc
fp_mb_stats
emulate_hardware
misc_fixes
non_greedy_mv
rate_ctrl
collect_component_timing
"
CONFIG_LIST="
dependency_tracking
@@ -310,6 +364,9 @@ CONFIG_LIST="
better_hw_compatibility
experimental
size_limit
always_adjust_bpm
bitstream_debug
mismatch_debug
${EXPERIMENT_LIST}
"
CMDLINE_SELECT="
@@ -322,6 +379,7 @@ CMDLINE_SELECT="
install_libs
install_srcs
debug
profile
gprof
gcov
pic
@@ -369,6 +427,9 @@ CMDLINE_SELECT="
better_hw_compatibility
vp9_highbitdepth
experimental
always_adjust_bpm
bitstream_debug
mismatch_debug
"
process_cmdline() {
@@ -399,6 +460,12 @@ process_cmdline() {
}
post_process_cmdline() {
if enabled coefficient_range_checking; then
echo "coefficient-range-checking is for decoders only, disabling encoders:"
soft_disable vp8_encoder
soft_disable vp9_encoder
fi
c=""
# Enable all detected codecs, if they haven't been disabled
@@ -420,6 +487,7 @@ process_targets() {
enabled child || write_common_config_banner
write_common_target_config_h ${BUILD_PFX}vpx_config.h
write_common_config_targets
enabled win_arm64_neon_h_workaround && write_win_arm64_neon_h_workaround ${BUILD_PFX}arm_neon.h
# Calculate the default distribution name, based on the enabled features
cf=""
@@ -496,7 +564,7 @@ process_detect() {
# here rather than at option parse time because the target auto-detect
# magic happens after the command line has been parsed.
case "${tgt_os}" in
linux|os2|darwin*|iphonesimulator*)
linux|os2|solaris|darwin*|iphonesimulator*)
# Supported platforms
;;
*)
@@ -548,16 +616,30 @@ process_detect() {
check_ld() {
true
}
check_lib() {
true
}
fi
check_header stdio.h || die "Unable to invoke compiler: ${CC} ${CFLAGS}"
check_ld <<EOF || die "Toolchain is unable to link executables"
int main(void) {return 0;}
EOF
# check system headers
check_header pthread.h
# Use both check_header and check_lib here, since check_lib
# could be a stub that always returns true.
check_header pthread.h && check_lib -lpthread <<EOF || disable_feature pthread_h
#include <pthread.h>
#include <stddef.h>
int main(void) { return pthread_create(NULL, NULL, NULL, NULL); }
EOF
check_header unistd.h # for sysconf(3) and friends.
check_header vpx/vpx_integer.h -I${source_path} && enable_feature vpx_ports
if enabled neon && ! enabled external_build; then
check_header arm_neon.h || die "Unable to find arm_neon.h"
fi
}
process_toolchain() {
@@ -567,32 +649,67 @@ process_toolchain() {
if enabled gcc; then
enabled werror && check_add_cflags -Werror
check_add_cflags -Wall
check_add_cflags -Wdeclaration-after-statement
check_add_cflags -Wdisabled-optimization
check_add_cflags -Wextra-semi
check_add_cflags -Wextra-semi-stmt
check_add_cflags -Wfloat-conversion
check_add_cflags -Wformat=2
check_add_cflags -Wparentheses-equality
check_add_cflags -Wpointer-arith
check_add_cflags -Wtype-limits
check_add_cflags -Wcast-qual
check_add_cflags -Wvla
check_add_cflags -Wimplicit-function-declaration
check_add_cflags -Wmissing-declarations
check_add_cflags -Wmissing-prototypes
check_add_cflags -Wshadow
check_add_cflags -Wstrict-prototypes
check_add_cflags -Wuninitialized
check_add_cflags -Wunreachable-code-aggressive
check_add_cflags -Wunused
# -Wextra has some tricky cases. Rather than fix them all now, get the
# flag for as many files as possible and fix the remaining issues
# piecemeal.
# https://bugs.chromium.org/p/webm/issues/detail?id=1069
check_add_cflags -Wextra
# check_add_cflags also adds to cxxflags. gtest does not do well with
# -Wundef so add it explicitly to CFLAGS only.
# these flags so add them explicitly to CFLAGS only.
check_cflags -Wundef && add_cflags_only -Wundef
check_cflags -Wframe-larger-than=52000 && \
add_cflags_only -Wframe-larger-than=52000
if enabled mips || [ -z "${INLINE}" ]; then
enabled extra_warnings || check_add_cflags -Wno-unused-function
fi
if ! enabled vp9_highbitdepth; then
# Avoid this warning for third_party C++ sources. Some reorganization
# would be needed to apply this only to test/*.cc.
check_cflags -Wshorten-64-to-32 && add_cflags_only -Wshorten-64-to-32
# Enforce C99 for C files. Allow GNU extensions.
check_cflags -std=gnu99 && add_cflags_only -std=gnu99
# Avoid this warning for third_party C++ sources. Some reorganization
# would be needed to apply this only to test/*.cc.
check_cflags -Wshorten-64-to-32 && add_cflags_only -Wshorten-64-to-32
# Do not allow implicit vector type conversions on Clang builds (this
# is already the default on GCC builds).
check_add_cflags -flax-vector-conversions=none
# Quiet gcc 6 vs 7 abi warnings:
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77728
if enabled arm; then
check_add_cxxflags -Wno-psabi
fi
# Enforce C++11 compatibility.
check_add_cxxflags -Wc++14-extensions
check_add_cxxflags -Wc++17-extensions
check_add_cxxflags -Wc++20-extensions
check_add_cxxflags -Wnon-virtual-dtor
# disable some warnings specific to libyuv / libwebm.
check_cxxflags -Wno-missing-declarations \
&& LIBYUV_CXXFLAGS="${LIBYUV_CXXFLAGS} -Wno-missing-declarations"
check_cxxflags -Wno-missing-prototypes \
&& LIBYUV_CXXFLAGS="${LIBYUV_CXXFLAGS} -Wno-missing-prototypes"
check_cxxflags -Wno-pass-failed \
&& LIBYUV_CXXFLAGS="${LIBYUV_CXXFLAGS} -Wno-pass-failed"
check_cxxflags -Wno-shadow \
&& LIBWEBM_CXXFLAGS="${LIBWEBM_CXXFLAGS} -Wno-shadow" \
&& LIBYUV_CXXFLAGS="${LIBYUV_CXXFLAGS} -Wno-shadow"
check_cxxflags -Wno-unused-parameter \
&& LIBYUV_CXXFLAGS="${LIBYUV_CXXFLAGS} -Wno-unused-parameter"
fi
if enabled icc; then
@@ -644,7 +761,7 @@ process_toolchain() {
gen_vcproj_cmd=${source_path}/build/make/gen_msvs_vcxproj.sh
enabled werror && gen_vcproj_cmd="${gen_vcproj_cmd} --enable-werror"
all_targets="${all_targets} solution"
INLINE="__forceinline"
INLINE="__inline"
;;
esac
@@ -663,39 +780,33 @@ process_toolchain() {
soft_enable libyuv
;;
*-android-*)
soft_enable webm_io
check_add_cxxflags -std=gnu++11 && soft_enable webm_io
soft_enable libyuv
# GTestLog must be modified to use Android logging utilities.
;;
*-darwin-*)
check_add_cxxflags -std=gnu++11
# iOS/ARM builds do not work with gtest. This does not match
# x86 targets.
;;
*-iphonesimulator-*)
soft_enable webm_io
check_add_cxxflags -std=gnu++11 && soft_enable webm_io
soft_enable libyuv
;;
*-win*)
# Some mingw toolchains don't have pthread available by default.
# Treat these more like visual studio where threading in gtest
# would be disabled for the same reason.
check_cxx "$@" <<EOF && soft_enable unit_tests
int z;
EOF
check_cxx "$@" <<EOF && soft_enable webm_io
int z;
EOF
check_add_cxxflags -std=gnu++11 && soft_enable unit_tests \
&& soft_enable webm_io
check_cxx "$@" <<EOF && soft_enable libyuv
int z;
EOF
;;
*)
enabled pthread_h && check_cxx "$@" <<EOF && soft_enable unit_tests
int z;
EOF
check_cxx "$@" <<EOF && soft_enable webm_io
int z;
EOF
enabled pthread_h && check_add_cxxflags -std=gnu++11 \
&& soft_enable unit_tests
check_add_cxxflags -std=gnu++11 && soft_enable webm_io
check_cxx "$@" <<EOF && soft_enable libyuv
int z;
EOF
+33 -33
@@ -23,7 +23,7 @@ LIBYUV_SRCS += third_party/libyuv/include/libyuv/basic_types.h \
third_party/libyuv/source/row_any.cc \
third_party/libyuv/source/row_common.cc \
third_party/libyuv/source/row_gcc.cc \
third_party/libyuv/source/row_mips.cc \
third_party/libyuv/source/row_msa.cc \
third_party/libyuv/source/row_neon.cc \
third_party/libyuv/source/row_neon64.cc \
third_party/libyuv/source/row_win.cc \
@@ -31,7 +31,7 @@ LIBYUV_SRCS += third_party/libyuv/include/libyuv/basic_types.h \
third_party/libyuv/source/scale_any.cc \
third_party/libyuv/source/scale_common.cc \
third_party/libyuv/source/scale_gcc.cc \
third_party/libyuv/source/scale_mips.cc \
third_party/libyuv/source/scale_msa.cc \
third_party/libyuv/source/scale_neon.cc \
third_party/libyuv/source/scale_neon64.cc \
third_party/libyuv/source/scale_win.cc \
@@ -57,6 +57,7 @@ LIBWEBM_PARSER_SRCS = third_party/libwebm/mkvparser/mkvparser.cc \
# Add compile flags and include path for libwebm sources.
ifeq ($(CONFIG_WEBM_IO),yes)
CXXFLAGS += -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS
$(BUILD_PFX)third_party/libwebm/%.cc.o: CXXFLAGS += $(LIBWEBM_CXXFLAGS)
INC_PATH-yes += $(SRC_PATH_BARE)/third_party/libwebm
endif
@@ -65,22 +66,21 @@ endif
# while EXAMPLES demonstrate specific portions of the API.
UTILS-$(CONFIG_DECODERS) += vpxdec.c
vpxdec.SRCS += md5_utils.c md5_utils.h
vpxdec.SRCS += vpx_ports/compiler_attributes.h
vpxdec.SRCS += vpx_ports/mem_ops.h
vpxdec.SRCS += vpx_ports/mem_ops_aligned.h
vpxdec.SRCS += vpx_ports/msvc.h
vpxdec.SRCS += vpx_ports/vpx_timer.h
vpxdec.SRCS += vpx/vpx_integer.h
vpxdec.SRCS += args.c args.h
vpxdec.SRCS += ivfdec.c ivfdec.h
vpxdec.SRCS += y4minput.c y4minput.h
vpxdec.SRCS += tools_common.c tools_common.h
vpxdec.SRCS += y4menc.c y4menc.h
ifeq ($(CONFIG_LIBYUV),yes)
vpxdec.SRCS += $(LIBYUV_SRCS)
$(BUILD_PFX)third_party/libyuv/%.cc.o: CXXFLAGS += -Wno-unused-parameter
$(BUILD_PFX)third_party/libyuv/%.cc.o: CXXFLAGS += ${LIBYUV_CXXFLAGS}
endif
ifeq ($(CONFIG_WEBM_IO),yes)
vpxdec.SRCS += $(LIBWEBM_COMMON_SRCS)
vpxdec.SRCS += $(LIBWEBM_MUXER_SRCS)
vpxdec.SRCS += $(LIBWEBM_PARSER_SRCS)
vpxdec.SRCS += webmdec.cc webmdec.h
endif
@@ -95,7 +95,6 @@ vpxenc.SRCS += tools_common.c tools_common.h
vpxenc.SRCS += warnings.c warnings.h
vpxenc.SRCS += vpx_ports/mem_ops.h
vpxenc.SRCS += vpx_ports/mem_ops_aligned.h
vpxenc.SRCS += vpx_ports/msvc.h
vpxenc.SRCS += vpx_ports/vpx_timer.h
vpxenc.SRCS += vpxstats.c vpxstats.h
ifeq ($(CONFIG_LIBYUV),yes)
@@ -109,110 +108,108 @@ ifeq ($(CONFIG_WEBM_IO),yes)
endif
vpxenc.GUID = 548DEC74-7A15-4B2B-AFC3-AA102E7C25C1
vpxenc.DESCRIPTION = Full featured encoder
ifeq ($(CONFIG_SPATIAL_SVC),yes)
EXAMPLES-$(CONFIG_VP9_ENCODER) += vp9_spatial_svc_encoder.c
vp9_spatial_svc_encoder.SRCS += args.c args.h
vp9_spatial_svc_encoder.SRCS += ivfenc.c ivfenc.h
vp9_spatial_svc_encoder.SRCS += tools_common.c tools_common.h
vp9_spatial_svc_encoder.SRCS += video_common.h
vp9_spatial_svc_encoder.SRCS += video_writer.h video_writer.c
vp9_spatial_svc_encoder.SRCS += vpx_ports/msvc.h
vp9_spatial_svc_encoder.SRCS += vpxstats.c vpxstats.h
vp9_spatial_svc_encoder.GUID = 4A38598D-627D-4505-9C7B-D4020C84100D
vp9_spatial_svc_encoder.DESCRIPTION = VP9 Spatial SVC Encoder
endif
ifneq ($(CONFIG_SHARED),yes)
EXAMPLES-$(CONFIG_VP9_ENCODER) += resize_util.c
endif
EXAMPLES-$(CONFIG_VP9_ENCODER) += vp9_spatial_svc_encoder.c
vp9_spatial_svc_encoder.SRCS += args.c args.h
vp9_spatial_svc_encoder.SRCS += ivfenc.c ivfenc.h
vp9_spatial_svc_encoder.SRCS += y4minput.c y4minput.h
vp9_spatial_svc_encoder.SRCS += tools_common.c tools_common.h
vp9_spatial_svc_encoder.SRCS += video_common.h
vp9_spatial_svc_encoder.SRCS += video_writer.h video_writer.c
vp9_spatial_svc_encoder.SRCS += vpxstats.c vpxstats.h
vp9_spatial_svc_encoder.SRCS += examples/svc_encodeframe.c
vp9_spatial_svc_encoder.SRCS += examples/svc_context.h
vp9_spatial_svc_encoder.GUID = 4A38598D-627D-4505-9C7B-D4020C84100D
vp9_spatial_svc_encoder.DESCRIPTION = VP9 Spatial SVC Encoder
EXAMPLES-$(CONFIG_ENCODERS) += vpx_temporal_svc_encoder.c
vpx_temporal_svc_encoder.SRCS += ivfenc.c ivfenc.h
vpx_temporal_svc_encoder.SRCS += y4minput.c y4minput.h
vpx_temporal_svc_encoder.SRCS += tools_common.c tools_common.h
vpx_temporal_svc_encoder.SRCS += video_common.h
vpx_temporal_svc_encoder.SRCS += video_writer.h video_writer.c
vpx_temporal_svc_encoder.SRCS += vpx_ports/msvc.h
vpx_temporal_svc_encoder.GUID = B18C08F2-A439-4502-A78E-849BE3D60947
vpx_temporal_svc_encoder.DESCRIPTION = Temporal SVC Encoder
EXAMPLES-$(CONFIG_DECODERS) += simple_decoder.c
simple_decoder.GUID = D3BBF1E9-2427-450D-BBFF-B2843C1D44CC
simple_decoder.SRCS += ivfdec.h ivfdec.c
simple_decoder.SRCS += y4minput.c y4minput.h
simple_decoder.SRCS += tools_common.h tools_common.c
simple_decoder.SRCS += video_common.h
simple_decoder.SRCS += video_reader.h video_reader.c
simple_decoder.SRCS += vpx_ports/mem_ops.h
simple_decoder.SRCS += vpx_ports/mem_ops_aligned.h
simple_decoder.SRCS += vpx_ports/msvc.h
simple_decoder.DESCRIPTION = Simplified decoder loop
EXAMPLES-$(CONFIG_DECODERS) += postproc.c
postproc.SRCS += ivfdec.h ivfdec.c
postproc.SRCS += y4minput.c y4minput.h
postproc.SRCS += tools_common.h tools_common.c
postproc.SRCS += video_common.h
postproc.SRCS += video_reader.h video_reader.c
postproc.SRCS += vpx_ports/mem_ops.h
postproc.SRCS += vpx_ports/mem_ops_aligned.h
postproc.SRCS += vpx_ports/msvc.h
postproc.GUID = 65E33355-F35E-4088-884D-3FD4905881D7
postproc.DESCRIPTION = Decoder postprocessor control
EXAMPLES-$(CONFIG_DECODERS) += decode_to_md5.c
decode_to_md5.SRCS += md5_utils.h md5_utils.c
decode_to_md5.SRCS += ivfdec.h ivfdec.c
decode_to_md5.SRCS += y4minput.c y4minput.h
decode_to_md5.SRCS += tools_common.h tools_common.c
decode_to_md5.SRCS += video_common.h
decode_to_md5.SRCS += video_reader.h video_reader.c
decode_to_md5.SRCS += vpx_ports/compiler_attributes.h
decode_to_md5.SRCS += vpx_ports/mem_ops.h
decode_to_md5.SRCS += vpx_ports/mem_ops_aligned.h
decode_to_md5.SRCS += vpx_ports/msvc.h
decode_to_md5.GUID = 59120B9B-2735-4BFE-B022-146CA340FE42
decode_to_md5.DESCRIPTION = Frame by frame MD5 checksum
EXAMPLES-$(CONFIG_ENCODERS) += simple_encoder.c
simple_encoder.SRCS += ivfenc.h ivfenc.c
simple_encoder.SRCS += y4minput.c y4minput.h
simple_encoder.SRCS += tools_common.h tools_common.c
simple_encoder.SRCS += video_common.h
simple_encoder.SRCS += video_writer.h video_writer.c
simple_encoder.SRCS += vpx_ports/msvc.h
simple_encoder.GUID = 4607D299-8A71-4D2C-9B1D-071899B6FBFD
simple_encoder.DESCRIPTION = Simplified encoder loop
EXAMPLES-$(CONFIG_VP9_ENCODER) += vp9_lossless_encoder.c
vp9_lossless_encoder.SRCS += ivfenc.h ivfenc.c
vp9_lossless_encoder.SRCS += y4minput.c y4minput.h
vp9_lossless_encoder.SRCS += tools_common.h tools_common.c
vp9_lossless_encoder.SRCS += video_common.h
vp9_lossless_encoder.SRCS += video_writer.h video_writer.c
vp9_lossless_encoder.SRCS += vpx_ports/msvc.h
vp9_lossless_encoder.GUID = B63C7C88-5348-46DC-A5A6-CC151EF93366
vp9_lossless_encoder.DESCRIPTION = Simplified lossless VP9 encoder
EXAMPLES-$(CONFIG_ENCODERS) += twopass_encoder.c
twopass_encoder.SRCS += ivfenc.h ivfenc.c
twopass_encoder.SRCS += y4minput.c y4minput.h
twopass_encoder.SRCS += tools_common.h tools_common.c
twopass_encoder.SRCS += video_common.h
twopass_encoder.SRCS += video_writer.h video_writer.c
twopass_encoder.SRCS += vpx_ports/msvc.h
twopass_encoder.GUID = 73494FA6-4AF9-4763-8FBB-265C92402FD8
twopass_encoder.DESCRIPTION = Two-pass encoder loop
EXAMPLES-$(CONFIG_DECODERS) += decode_with_drops.c
decode_with_drops.SRCS += ivfdec.h ivfdec.c
decode_with_drops.SRCS += y4minput.c y4minput.h
decode_with_drops.SRCS += tools_common.h tools_common.c
decode_with_drops.SRCS += video_common.h
decode_with_drops.SRCS += video_reader.h video_reader.c
decode_with_drops.SRCS += vpx_ports/mem_ops.h
decode_with_drops.SRCS += vpx_ports/mem_ops_aligned.h
decode_with_drops.SRCS += vpx_ports/msvc.h
decode_with_drops.GUID = CE5C53C4-8DDA-438A-86ED-0DDD3CDB8D26
decode_with_drops.DESCRIPTION = Drops frames while decoding
EXAMPLES-$(CONFIG_ENCODERS) += set_maps.c
set_maps.SRCS += ivfenc.h ivfenc.c
set_maps.SRCS += y4minput.c y4minput.h
set_maps.SRCS += tools_common.h tools_common.c
set_maps.SRCS += video_common.h
set_maps.SRCS += video_writer.h video_writer.c
set_maps.SRCS += vpx_ports/msvc.h
set_maps.GUID = ECB2D24D-98B8-4015-A465-A4AF3DCC145F
set_maps.DESCRIPTION = Set active and ROI maps
EXAMPLES-$(CONFIG_VP8_ENCODER) += vp8cx_set_ref.c
vp8cx_set_ref.SRCS += ivfenc.h ivfenc.c
vp8cx_set_ref.SRCS += y4minput.c y4minput.h
vp8cx_set_ref.SRCS += tools_common.h tools_common.c
vp8cx_set_ref.SRCS += video_common.h
vp8cx_set_ref.SRCS += video_writer.h video_writer.c
vp8cx_set_ref.SRCS += vpx_ports/msvc.h
vp8cx_set_ref.GUID = C5E31F7F-96F6-48BD-BD3E-10EBF6E8057A
vp8cx_set_ref.DESCRIPTION = VP8 set encoder reference frame
@@ -220,6 +217,7 @@ ifeq ($(CONFIG_VP9_ENCODER),yes)
ifeq ($(CONFIG_DECODERS),yes)
EXAMPLES-yes += vp9cx_set_ref.c
vp9cx_set_ref.SRCS += ivfenc.h ivfenc.c
vp9cx_set_ref.SRCS += y4minput.c y4minput.h
vp9cx_set_ref.SRCS += tools_common.h tools_common.c
vp9cx_set_ref.SRCS += video_common.h
vp9cx_set_ref.SRCS += video_writer.h video_writer.c
@@ -232,9 +230,9 @@ ifeq ($(CONFIG_MULTI_RES_ENCODING),yes)
ifeq ($(CONFIG_LIBYUV),yes)
EXAMPLES-$(CONFIG_VP8_ENCODER) += vp8_multi_resolution_encoder.c
vp8_multi_resolution_encoder.SRCS += ivfenc.h ivfenc.c
vp8_multi_resolution_encoder.SRCS += y4minput.c y4minput.h
vp8_multi_resolution_encoder.SRCS += tools_common.h tools_common.c
vp8_multi_resolution_encoder.SRCS += video_writer.h video_writer.c
vp8_multi_resolution_encoder.SRCS += vpx_ports/msvc.h
vp8_multi_resolution_encoder.SRCS += $(LIBYUV_SRCS)
vp8_multi_resolution_encoder.GUID = 04f8738e-63c8-423b-90fa-7c2703a374de
vp8_multi_resolution_encoder.DESCRIPTION = VP8 Multiple-resolution Encoding
@@ -359,6 +357,7 @@ $(1): $($(1:.$(VCPROJ_SFX)=).SRCS) vpx.$(VCPROJ_SFX)
--ver=$$(CONFIG_VS_VERSION)\
--proj-guid=$$($$(@:.$(VCPROJ_SFX)=).GUID)\
--src-path-bare="$(SRC_PATH_BARE)" \
--as=$$(AS) \
$$(if $$(CONFIG_STATIC_MSVCRT),--static-crt) \
--out=$$@ $$(INTERNAL_CFLAGS) $$(CFLAGS) \
$$(INTERNAL_LDFLAGS) $$(LDFLAGS) -l$$(CODEC_LIB) $$^
@@ -403,3 +402,4 @@ CLEAN-OBJS += examples.doxy samples.dox $(ALL_EXAMPLES:.c=.dox)
DOCS-yes += examples.doxy samples.dox
examples.doxy: samples.dox $(ALL_EXAMPLES:.c=.dox)
@echo "INPUT += $^" > $@
@echo "ENABLED_SECTIONS += samples" >> $@
+1 -1
@@ -106,7 +106,7 @@ int main(int argc, char **argv) {
printf("Using %s\n", vpx_codec_iface_name(decoder->codec_interface()));
if (vpx_codec_dec_init(&codec, decoder->codec_interface(), NULL, 0))
die_codec(&codec, "Failed to initialize decoder.");
die("Failed to initialize decoder.");
while (vpx_video_reader_read_frame(reader)) {
vpx_codec_iter_t iter = NULL;
+3 -3
@@ -86,9 +86,9 @@ int main(int argc, char **argv) {
res = vpx_codec_dec_init(&codec, decoder->codec_interface(), NULL,
VPX_CODEC_USE_POSTPROC);
if (res == VPX_CODEC_INCAPABLE)
die_codec(&codec, "Postproc not supported by this decoder.");
die("Postproc not supported by this decoder.");
if (res) die_codec(&codec, "Failed to initialize decoder.");
if (res) die("Failed to initialize decoder.");
while (vpx_video_reader_read_frame(reader)) {
vpx_codec_iter_t iter = NULL;
@@ -109,7 +109,7 @@ int main(int argc, char **argv) {
0 };
if (vpx_codec_control(&codec, VP8_SET_POSTPROC, &pp))
die_codec(&codec, "Failed to turn on postproc.");
};
}
// Decode the frame with 15ms deadline
if (vpx_codec_decode(&codec, frame, (unsigned int)frame_size, NULL, 15000))
-123
@@ -1,123 +0,0 @@
/*
* Copyright (c) 2014 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "../tools_common.h"
#include "../vp9/encoder/vp9_resize.h"
static const char *exec_name = NULL;
static void usage() {
printf("Usage:\n");
printf("%s <input_yuv> <width>x<height> <target_width>x<target_height> ",
exec_name);
printf("<output_yuv> [<frames>]\n");
}
void usage_exit(void) {
usage();
exit(EXIT_FAILURE);
}
static int parse_dim(char *v, int *width, int *height) {
char *x = strchr(v, 'x');
if (x == NULL) x = strchr(v, 'X');
if (x == NULL) return 0;
*width = atoi(v);
*height = atoi(&x[1]);
if (*width <= 0 || *height <= 0)
return 0;
else
return 1;
}
int main(int argc, char *argv[]) {
char *fin, *fout;
FILE *fpin, *fpout;
uint8_t *inbuf, *outbuf;
uint8_t *inbuf_u, *outbuf_u;
uint8_t *inbuf_v, *outbuf_v;
int f, frames;
int width, height, target_width, target_height;
exec_name = argv[0];
if (argc < 5) {
printf("Incorrect parameters:\n");
usage();
return 1;
}
fin = argv[1];
fout = argv[4];
if (!parse_dim(argv[2], &width, &height)) {
printf("Incorrect parameters: %s\n", argv[2]);
usage();
return 1;
}
if (!parse_dim(argv[3], &target_width, &target_height)) {
printf("Incorrect parameters: %s\n", argv[3]);
usage();
return 1;
}
fpin = fopen(fin, "rb");
if (fpin == NULL) {
printf("Can't open file %s to read\n", fin);
usage();
return 1;
}
fpout = fopen(fout, "wb");
if (fpout == NULL) {
printf("Can't open file %s to write\n", fout);
usage();
return 1;
}
if (argc >= 6)
frames = atoi(argv[5]);
else
frames = INT_MAX;
printf("Input size: %dx%d\n", width, height);
printf("Target size: %dx%d, Frames: ", target_width, target_height);
if (frames == INT_MAX)
printf("All\n");
else
printf("%d\n", frames);
inbuf = (uint8_t *)malloc(width * height * 3 / 2);
outbuf = (uint8_t *)malloc(target_width * target_height * 3 / 2);
inbuf_u = inbuf + width * height;
inbuf_v = inbuf_u + width * height / 4;
outbuf_u = outbuf + target_width * target_height;
outbuf_v = outbuf_u + target_width * target_height / 4;
f = 0;
while (f < frames) {
if (fread(inbuf, width * height * 3 / 2, 1, fpin) != 1) break;
vp9_resize_frame420(inbuf, width, inbuf_u, inbuf_v, width / 2, height,
width, outbuf, target_width, outbuf_u, outbuf_v,
target_width / 2, target_height, target_width);
fwrite(outbuf, target_width * target_height * 3 / 2, 1, fpout);
f++;
}
printf("%d frames processed\n", f);
fclose(fpin);
fclose(fpout);
free(inbuf);
free(outbuf);
return 0;
}
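
Review note on the removed resize_util.c above: its buffer arithmetic (`width * height * 3 / 2`, with U at `width * height` and V a further `width * height / 4` bytes in) assumes tightly packed 8-bit I420 frames. The sketch below restates that layout under the assumption of even dimensions; the `i420_layout` type and `i420_layout_for` function are hypothetical names for illustration and are not part of libvpx.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of libvpx: byte offsets of the planes in a
 * tightly packed 8-bit I420 frame. */
typedef struct {
  size_t u_offset;   /* start of the U plane, right after the Y plane */
  size_t v_offset;   /* start of the V plane, right after the U plane */
  size_t frame_size; /* total bytes per frame */
} i420_layout;

static i420_layout i420_layout_for(size_t width, size_t height) {
  i420_layout l;
  const size_t y_size = width * height;
  /* Assumption: even dimensions, so each chroma plane is exactly 1/4 of Y. */
  assert(width % 2 == 0 && height % 2 == 0);
  l.u_offset = y_size;
  l.v_offset = y_size + y_size / 4;
  l.frame_size = y_size * 3 / 2;
  return l;
}
```

For a 352x288 frame this gives a Y plane of 101376 bytes, so U starts at byte 101376, V at byte 126720, and each frame occupies 152064 bytes.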
+1 -1
@@ -209,7 +209,7 @@ int main(int argc, char **argv) {
die("Failed to open %s for reading.", argv[4]);
if (vpx_codec_enc_init(&codec, encoder->codec_interface(), &cfg, 0))
die_codec(&codec, "Failed to initialize encoder");
die("Failed to initialize encoder");
// Encode frames.
while (vpx_img_read(&raw, infile)) {
+1 -1
@@ -118,7 +118,7 @@ int main(int argc, char **argv) {
printf("Using %s\n", vpx_codec_iface_name(decoder->codec_interface()));
if (vpx_codec_dec_init(&codec, decoder->codec_interface(), NULL, 0))
die_codec(&codec, "Failed to initialize decoder.");
die("Failed to initialize decoder.");
while (vpx_video_reader_read_frame(reader)) {
vpx_codec_iter_t iter = NULL;
+1 -1
@@ -218,7 +218,7 @@ int main(int argc, char **argv) {
die("Failed to open %s for reading.", infile_arg);
if (vpx_codec_enc_init(&codec, encoder->codec_interface(), &cfg, 0))
die_codec(&codec, "Failed to initialize encoder");
die("Failed to initialize encoder");
// Encode frames.
while (vpx_img_read(&raw, infile)) {
+8 -16
@@ -13,11 +13,11 @@
* spatial SVC frame
*/
#ifndef VPX_SVC_CONTEXT_H_
#define VPX_SVC_CONTEXT_H_
#ifndef VPX_EXAMPLES_SVC_CONTEXT_H_
#define VPX_EXAMPLES_SVC_CONTEXT_H_
#include "./vp8cx.h"
#include "./vpx_encoder.h"
#include "vpx/vp8cx.h"
#include "vpx/vpx_encoder.h"
#ifdef __cplusplus
extern "C" {
@@ -35,10 +35,8 @@ typedef struct {
int temporal_layers; // number of temporal layers
int temporal_layering_mode;
SVC_LOG_LEVEL log_level; // amount of information to display
int log_print; // when set, printf log messages instead of returning the
// message with svc_get_message
int output_rc_stat; // for outputting rc stats
int speed; // speed setting for codec
int output_rc_stat; // for outputting rc stats
int speed; // speed setting for codec
int threads;
int aqmode; // turns on aq-mode=3 (cyclic_refresh): 0=off, 1=on.
// private storage for vpx_svc_encode
@@ -71,7 +69,6 @@ typedef struct SvcInternal {
int layer;
int use_multiple_frame_contexts;
char message_buffer[2048];
vpx_codec_ctx_t *codec_ctx;
} SvcInternal_t;
@@ -106,15 +103,10 @@ void vpx_svc_release(SvcContext *svc_ctx);
/**
* dump accumulated statistics and reset accumulated values
*/
const char *vpx_svc_dump_statistics(SvcContext *svc_ctx);
/**
* get status message from previous encode
*/
const char *vpx_svc_get_message(const SvcContext *svc_ctx);
void vpx_svc_dump_statistics(SvcContext *svc_ctx);
#ifdef __cplusplus
} // extern "C"
#endif
#endif // VPX_SVC_CONTEXT_H_
#endif // VPX_EXAMPLES_SVC_CONTEXT_H_
@@ -21,8 +21,9 @@
#include <stdlib.h>
#include <string.h>
#define VPX_DISABLE_CTRL_TYPECHECKS 1
#include "../tools_common.h"
#include "./vpx_config.h"
#include "vpx/svc_context.h"
#include "./svc_context.h"
#include "vpx/vp8cx.h"
#include "vpx/vpx_encoder.h"
#include "vpx_mem/vpx_mem.h"
@@ -95,17 +96,12 @@ static const SvcInternal_t *get_const_svc_internal(const SvcContext *svc_ctx) {
return (const SvcInternal_t *)svc_ctx->internal;
}
static void svc_log_reset(SvcContext *svc_ctx) {
SvcInternal_t *const si = (SvcInternal_t *)svc_ctx->internal;
si->message_buffer[0] = '\0';
}
static int svc_log(SvcContext *svc_ctx, SVC_LOG_LEVEL level, const char *fmt,
...) {
static VPX_TOOLS_FORMAT_PRINTF(3, 4) int svc_log(SvcContext *svc_ctx,
SVC_LOG_LEVEL level,
const char *fmt, ...) {
char buf[512];
int retval = 0;
va_list ap;
SvcInternal_t *const si = get_svc_internal(svc_ctx);
if (level > svc_ctx->log_level) {
return retval;
@@ -115,25 +111,17 @@ static int svc_log(SvcContext *svc_ctx, SVC_LOG_LEVEL level, const char *fmt,
retval = vsnprintf(buf, sizeof(buf), fmt, ap);
va_end(ap);
if (svc_ctx->log_print) {
printf("%s", buf);
} else {
strncat(si->message_buffer, buf,
sizeof(si->message_buffer) - strlen(si->message_buffer) - 1);
}
printf("%s", buf);
if (level == SVC_LOG_ERROR) {
si->codec_ctx->err_detail = si->message_buffer;
}
return retval;
}
static vpx_codec_err_t extract_option(LAYER_OPTION_TYPE type, char *input,
int *value0, int *value1) {
if (type == SCALE_FACTOR) {
*value0 = strtol(input, &input, 10);
*value0 = (int)strtol(input, &input, 10);
if (*input++ != '/') return VPX_CODEC_INVALID_PARAM;
*value1 = strtol(input, &input, 10);
*value1 = (int)strtol(input, &input, 10);
if (*value0 < option_min_values[SCALE_FACTOR] ||
*value1 < option_min_values[SCALE_FACTOR] ||
@@ -169,6 +157,7 @@ static vpx_codec_err_t parse_layer_options_from_string(SvcContext *svc_ctx,
return VPX_CODEC_INVALID_PARAM;
input_string = strdup(input);
if (input_string == NULL) return VPX_CODEC_MEM_ERROR;
token = strtok_r(input_string, delim, &save_ptr);
for (i = 0; i < num_layers; ++i) {
if (token != NULL) {
@@ -208,6 +197,7 @@ static vpx_codec_err_t parse_options(SvcContext *svc_ctx, const char *options) {
if (options == NULL) return VPX_CODEC_OK;
input_string = strdup(options);
if (input_string == NULL) return VPX_CODEC_MEM_ERROR;
// parse option name
option_name = strtok_r(input_string, "=", &input_ptr);
@@ -276,7 +266,7 @@ static vpx_codec_err_t parse_options(SvcContext *svc_ctx, const char *options) {
if (alt_ref_enabled > REF_FRAMES - svc_ctx->spatial_layers) {
svc_log(svc_ctx, SVC_LOG_ERROR,
"svc: auto alt ref: Maxinum %d(REF_FRAMES - layers) layers could"
"enabled auto alt reference frame, but % layers are enabled\n",
"enabled auto alt reference frame, but %d layers are enabled\n",
REF_FRAMES - svc_ctx->spatial_layers, alt_ref_enabled);
res = VPX_CODEC_INVALID_PARAM;
}
@@ -289,13 +279,13 @@ vpx_codec_err_t vpx_svc_set_options(SvcContext *svc_ctx, const char *options) {
if (svc_ctx == NULL || options == NULL || si == NULL) {
return VPX_CODEC_INVALID_PARAM;
}
strncpy(si->options, options, sizeof(si->options));
strncpy(si->options, options, sizeof(si->options) - 1);
si->options[sizeof(si->options) - 1] = '\0';
return VPX_CODEC_OK;
}
vpx_codec_err_t assign_layer_bitrates(const SvcContext *svc_ctx,
vpx_codec_enc_cfg_t *const enc_cfg) {
static vpx_codec_err_t assign_layer_bitrates(
const SvcContext *svc_ctx, vpx_codec_enc_cfg_t *const enc_cfg) {
int i;
const SvcInternal_t *const si = get_const_svc_internal(svc_ctx);
int sl, tl, spatial_layer_target;
@@ -391,7 +381,7 @@ vpx_codec_err_t vpx_svc_init(SvcContext *svc_ctx, vpx_codec_ctx_t *codec_ctx,
vpx_codec_iface_t *iface,
vpx_codec_enc_cfg_t *enc_cfg) {
vpx_codec_err_t res;
int i, sl, tl;
int sl, tl;
SvcInternal_t *const si = get_svc_internal(svc_ctx);
if (svc_ctx == NULL || codec_ctx == NULL || iface == NULL ||
enc_cfg == NULL) {
@@ -436,10 +426,14 @@ vpx_codec_err_t vpx_svc_init(SvcContext *svc_ctx, vpx_codec_ctx_t *codec_ctx,
si->svc_params.scaling_factor_num[sl] = DEFAULT_SCALE_FACTORS_NUM_2x[sl2];
si->svc_params.scaling_factor_den[sl] = DEFAULT_SCALE_FACTORS_DEN_2x[sl2];
}
if (svc_ctx->spatial_layers == 1) {
si->svc_params.scaling_factor_num[0] = 1;
si->svc_params.scaling_factor_den[0] = 1;
}
}
for (tl = 0; tl < svc_ctx->temporal_layers; ++tl) {
for (sl = 0; sl < svc_ctx->spatial_layers; ++sl) {
i = sl * svc_ctx->temporal_layers + tl;
const int i = sl * svc_ctx->temporal_layers + tl;
si->svc_params.max_quantizers[i] = MAX_QUANTIZER;
si->svc_params.min_quantizers[i] = 0;
if (enc_cfg->rc_end_usage == VPX_CBR &&
@@ -464,11 +458,11 @@ vpx_codec_err_t vpx_svc_init(SvcContext *svc_ctx, vpx_codec_ctx_t *codec_ctx,
svc_ctx->temporal_layers = VPX_TS_MAX_LAYERS;
if (svc_ctx->temporal_layers * svc_ctx->spatial_layers > VPX_MAX_LAYERS) {
svc_log(svc_ctx, SVC_LOG_ERROR,
"spatial layers * temporal layers exceeds the maximum number of "
"allowed layers of %d\n",
svc_ctx->spatial_layers * svc_ctx->temporal_layers,
(int)VPX_MAX_LAYERS);
svc_log(
svc_ctx, SVC_LOG_ERROR,
"spatial layers * temporal layers (%d) exceeds the maximum number of "
"allowed layers of %d\n",
svc_ctx->spatial_layers * svc_ctx->temporal_layers, VPX_MAX_LAYERS);
return VPX_CODEC_INVALID_PARAM;
}
res = assign_layer_bitrates(svc_ctx, enc_cfg);
@@ -481,11 +475,6 @@ vpx_codec_err_t vpx_svc_init(SvcContext *svc_ctx, vpx_codec_ctx_t *codec_ctx,
return VPX_CODEC_INVALID_PARAM;
}
#if CONFIG_SPATIAL_SVC
for (i = 0; i < svc_ctx->spatial_layers; ++i)
enc_cfg->ss_enable_auto_alt_ref[i] = si->enable_auto_alt_ref[i];
#endif
if (svc_ctx->temporal_layers > 1) {
int i;
for (i = 0; i < svc_ctx->temporal_layers; ++i) {
@@ -510,7 +499,17 @@ vpx_codec_err_t vpx_svc_init(SvcContext *svc_ctx, vpx_codec_ctx_t *codec_ctx,
enc_cfg->rc_buf_initial_sz = 500;
enc_cfg->rc_buf_optimal_sz = 600;
enc_cfg->rc_buf_sz = 1000;
enc_cfg->rc_dropframe_thresh = 0;
}
for (tl = 0; tl < svc_ctx->temporal_layers; ++tl) {
for (sl = 0; sl < svc_ctx->spatial_layers; ++sl) {
const int i = sl * svc_ctx->temporal_layers + tl;
if (enc_cfg->rc_end_usage == VPX_CBR &&
enc_cfg->g_pass == VPX_RC_ONE_PASS) {
si->svc_params.max_quantizers[i] = enc_cfg->rc_max_quantizer;
si->svc_params.min_quantizers[i] = enc_cfg->rc_min_quantizer;
}
}
}
if (enc_cfg->g_error_resilient == 0 && si->use_multiple_frame_contexts == 0)
@@ -544,8 +543,6 @@ vpx_codec_err_t vpx_svc_encode(SvcContext *svc_ctx, vpx_codec_ctx_t *codec_ctx,
return VPX_CODEC_INVALID_PARAM;
}
svc_log_reset(svc_ctx);
res =
vpx_codec_encode(codec_ctx, rawimg, pts, (uint32_t)duration, 0, deadline);
if (res != VPX_CODEC_OK) {
@@ -555,81 +552,21 @@ vpx_codec_err_t vpx_svc_encode(SvcContext *svc_ctx, vpx_codec_ctx_t *codec_ctx,
iter = NULL;
while ((cx_pkt = vpx_codec_get_cx_data(codec_ctx, &iter))) {
switch (cx_pkt->kind) {
#if VPX_ENCODER_ABI_VERSION > (5 + VPX_CODEC_ABI_VERSION)
#if CONFIG_SPATIAL_SVC
case VPX_CODEC_SPATIAL_SVC_LAYER_PSNR: {
int i;
for (i = 0; i < svc_ctx->spatial_layers; ++i) {
int j;
svc_log(svc_ctx, SVC_LOG_DEBUG,
"SVC frame: %d, layer: %d, PSNR(Total/Y/U/V): "
"%2.3f %2.3f %2.3f %2.3f \n",
si->psnr_pkt_received, i, cx_pkt->data.layer_psnr[i].psnr[0],
cx_pkt->data.layer_psnr[i].psnr[1],
cx_pkt->data.layer_psnr[i].psnr[2],
cx_pkt->data.layer_psnr[i].psnr[3]);
svc_log(svc_ctx, SVC_LOG_DEBUG,
"SVC frame: %d, layer: %d, SSE(Total/Y/U/V): "
"%2.3f %2.3f %2.3f %2.3f \n",
si->psnr_pkt_received, i, cx_pkt->data.layer_psnr[i].sse[0],
cx_pkt->data.layer_psnr[i].sse[1],
cx_pkt->data.layer_psnr[i].sse[2],
cx_pkt->data.layer_psnr[i].sse[3]);
for (j = 0; j < COMPONENTS; ++j) {
si->psnr_sum[i][j] += cx_pkt->data.layer_psnr[i].psnr[j];
si->sse_sum[i][j] += cx_pkt->data.layer_psnr[i].sse[j];
}
}
++si->psnr_pkt_received;
break;
}
case VPX_CODEC_SPATIAL_SVC_LAYER_SIZES: {
int i;
for (i = 0; i < svc_ctx->spatial_layers; ++i)
si->bytes_sum[i] += cx_pkt->data.layer_sizes[i];
break;
}
#endif
#endif
case VPX_CODEC_PSNR_PKT: {
#if VPX_ENCODER_ABI_VERSION > (5 + VPX_CODEC_ABI_VERSION)
int j;
svc_log(svc_ctx, SVC_LOG_DEBUG,
"frame: %d, layer: %d, PSNR(Total/Y/U/V): "
"%2.3f %2.3f %2.3f %2.3f \n",
si->psnr_pkt_received, 0, cx_pkt->data.layer_psnr[0].psnr[0],
cx_pkt->data.layer_psnr[0].psnr[1],
cx_pkt->data.layer_psnr[0].psnr[2],
cx_pkt->data.layer_psnr[0].psnr[3]);
for (j = 0; j < COMPONENTS; ++j) {
si->psnr_sum[0][j] += cx_pkt->data.layer_psnr[0].psnr[j];
si->sse_sum[0][j] += cx_pkt->data.layer_psnr[0].sse[j];
}
#endif
}
++si->psnr_pkt_received;
break;
default: { break; }
case VPX_CODEC_PSNR_PKT: ++si->psnr_pkt_received; break;
default: break;
}
}
return VPX_CODEC_OK;
}
const char *vpx_svc_get_message(const SvcContext *svc_ctx) {
const SvcInternal_t *const si = get_const_svc_internal(svc_ctx);
if (svc_ctx == NULL || si == NULL) return NULL;
return si->message_buffer;
}
static double calc_psnr(double d) {
if (d == 0) return 100;
return -10.0 * log(d) / log(10.0);
}
// dump accumulated statistics and reset accumulated values
const char *vpx_svc_dump_statistics(SvcContext *svc_ctx) {
void vpx_svc_dump_statistics(SvcContext *svc_ctx) {
int number_of_frames;
int i, j;
uint32_t bytes_total = 0;
@@ -639,21 +576,19 @@ const char *vpx_svc_dump_statistics(SvcContext *svc_ctx) {
double y_scale;
SvcInternal_t *const si = get_svc_internal(svc_ctx);
if (svc_ctx == NULL || si == NULL) return NULL;
svc_log_reset(svc_ctx);
if (svc_ctx == NULL || si == NULL) return;
number_of_frames = si->psnr_pkt_received;
if (number_of_frames <= 0) return vpx_svc_get_message(svc_ctx);
if (number_of_frames <= 0) return;
svc_log(svc_ctx, SVC_LOG_INFO, "\n");
for (i = 0; i < svc_ctx->spatial_layers; ++i) {
svc_log(svc_ctx, SVC_LOG_INFO,
"Layer %d Average PSNR=[%2.3f, %2.3f, %2.3f, %2.3f], Bytes=[%u]\n",
i, (double)si->psnr_sum[i][0] / number_of_frames,
(double)si->psnr_sum[i][1] / number_of_frames,
(double)si->psnr_sum[i][2] / number_of_frames,
(double)si->psnr_sum[i][3] / number_of_frames, si->bytes_sum[i]);
i, si->psnr_sum[i][0] / number_of_frames,
si->psnr_sum[i][1] / number_of_frames,
si->psnr_sum[i][2] / number_of_frames,
si->psnr_sum[i][3] / number_of_frames, si->bytes_sum[i]);
// the following psnr calculation is deduced from ffmpeg.c#print_report
y_scale = si->width * si->height * 255.0 * 255.0 * number_of_frames;
scale[1] = y_scale;
@@ -684,7 +619,6 @@ const char *vpx_svc_dump_statistics(SvcContext *svc_ctx) {
si->psnr_pkt_received = 0;
svc_log(svc_ctx, SVC_LOG_INFO, "Total Bytes=[%u]\n", bytes_total);
return vpx_svc_get_message(svc_ctx);
}
void vpx_svc_release(SvcContext *svc_ctx) {
+4 -3
@@ -84,6 +84,7 @@ static int get_frame_stats(vpx_codec_ctx_t *ctx, const vpx_image_t *img,
const uint8_t *const pkt_buf = pkt->data.twopass_stats.buf;
const size_t pkt_size = pkt->data.twopass_stats.sz;
stats->buf = realloc(stats->buf, stats->sz + pkt_size);
if (!stats->buf) die("Failed to reallocate stats buffer.");
memcpy((uint8_t *)stats->buf + stats->sz, pkt_buf, pkt_size);
stats->sz += pkt_size;
}
@@ -128,7 +129,7 @@ static vpx_fixed_buf_t pass0(vpx_image_t *raw, FILE *infile,
vpx_fixed_buf_t stats = { NULL, 0 };
if (vpx_codec_enc_init(&codec, encoder->codec_interface(), cfg, 0))
die_codec(&codec, "Failed to initialize encoder");
die("Failed to initialize encoder");
// Calculate frame statistics.
while (vpx_img_read(raw, infile)) {
@@ -164,7 +165,7 @@ static void pass1(vpx_image_t *raw, FILE *infile, const char *outfile_name,
if (!writer) die("Failed to open %s for writing", outfile_name);
if (vpx_codec_enc_init(&codec, encoder->codec_interface(), cfg, 0))
die_codec(&codec, "Failed to initialize encoder");
die("Failed to initialize encoder");
// Encode frames.
while (vpx_img_read(raw, infile)) {
@@ -221,7 +222,7 @@ int main(int argc, char **argv) {
die("Invalid frame size: %dx%d", w, h);
if (!vpx_img_alloc(&raw, VPX_IMG_FMT_I420, w, h, 1))
die("Failed to allocate image", w, h);
die("Failed to allocate image (%dx%d)", w, h);
printf("Using %s\n", vpx_codec_iface_name(encoder->codec_interface()));
+24 -27
@@ -61,7 +61,7 @@ void usage_exit(void) { exit(EXIT_FAILURE); }
int (*read_frame_p)(FILE *f, vpx_image_t *img);
static int read_frame(FILE *f, vpx_image_t *img) {
static int mulres_read_frame(FILE *f, vpx_image_t *img) {
size_t nbytes, to_read;
int res = 1;
@@ -75,7 +75,7 @@ static int read_frame(FILE *f, vpx_image_t *img) {
return res;
}
static int read_frame_by_row(FILE *f, vpx_image_t *img) {
static int mulres_read_frame_by_row(FILE *f, vpx_image_t *img) {
size_t nbytes, to_read;
int res = 1;
int plane;
@@ -151,7 +151,7 @@ static void write_ivf_frame_header(FILE *outfile,
if (pkt->kind != VPX_CODEC_CX_FRAME_PKT) return;
pts = pkt->data.frame.pts;
mem_put_le32(header, pkt->data.frame.sz);
mem_put_le32(header, (int)pkt->data.frame.sz);
mem_put_le32(header + 4, pts & 0xFFFFFFFF);
mem_put_le32(header + 8, pts >> 32);
@@ -190,7 +190,7 @@ static void set_temporal_layer_pattern(int num_temporal_layers,
cfg->ts_layer_id[0] = 0;
cfg->ts_layer_id[1] = 1;
// Use 60/40 bit allocation as example.
cfg->ts_target_bitrate[0] = 0.6f * bitrate;
cfg->ts_target_bitrate[0] = (int)(0.6f * bitrate);
cfg->ts_target_bitrate[1] = bitrate;
/* 0=L, 1=GF */
@@ -241,8 +241,8 @@ static void set_temporal_layer_pattern(int num_temporal_layers,
cfg->ts_layer_id[2] = 1;
cfg->ts_layer_id[3] = 2;
// Use 45/20/35 bit allocation as example.
cfg->ts_target_bitrate[0] = 0.45f * bitrate;
cfg->ts_target_bitrate[1] = 0.65f * bitrate;
cfg->ts_target_bitrate[0] = (int)(0.45f * bitrate);
cfg->ts_target_bitrate[1] = (int)(0.65f * bitrate);
cfg->ts_target_bitrate[2] = bitrate;
/* 0=L, 1=GF, 2=ARF */
@@ -294,8 +294,8 @@ int main(int argc, char **argv) {
vpx_codec_err_t res[NUM_ENCODERS];
int i;
long width;
long height;
int width;
int height;
int length_frame;
int frame_avail;
int got_data;
@@ -347,12 +347,12 @@ int main(int argc, char **argv) {
printf("Using %s\n", vpx_codec_iface_name(interface));
width = strtol(argv[1], NULL, 0);
height = strtol(argv[2], NULL, 0);
framerate = strtol(argv[3], NULL, 0);
width = (int)strtol(argv[1], NULL, 0);
height = (int)strtol(argv[2], NULL, 0);
framerate = (int)strtol(argv[3], NULL, 0);
if (width < 16 || width % 2 || height < 16 || height % 2)
die("Invalid resolution: %ldx%ld", width, height);
die("Invalid resolution: %dx%d", width, height);
/* Open input video file for encoding */
if (!(infile = fopen(argv[4], "rb")))
@@ -371,15 +371,16 @@ int main(int argc, char **argv) {
// Bitrates per spatial layer: overwrite default rates above.
for (i = 0; i < NUM_ENCODERS; i++) {
target_bitrate[i] = strtol(argv[NUM_ENCODERS + 5 + i], NULL, 0);
target_bitrate[i] = (int)strtol(argv[NUM_ENCODERS + 5 + i], NULL, 0);
}
// Temporal layers per spatial layers: overwrite default settings above.
for (i = 0; i < NUM_ENCODERS; i++) {
num_temporal_layers[i] = strtol(argv[2 * NUM_ENCODERS + 5 + i], NULL, 0);
num_temporal_layers[i] =
(int)strtol(argv[2 * NUM_ENCODERS + 5 + i], NULL, 0);
if (num_temporal_layers[i] < 1 || num_temporal_layers[i] > 3)
die("Invalid temporal layers: %d, Must be 1, 2, or 3. \n",
num_temporal_layers);
num_temporal_layers[i]);
}
/* Open file to write out each spatially downsampled input stream. */
@@ -391,9 +392,9 @@ int main(int argc, char **argv) {
downsampled_input[i] = fopen(filename, "wb");
}
key_frame_insert = strtol(argv[3 * NUM_ENCODERS + 5], NULL, 0);
key_frame_insert = (int)strtol(argv[3 * NUM_ENCODERS + 5], NULL, 0);
show_psnr = strtol(argv[3 * NUM_ENCODERS + 6], NULL, 0);
show_psnr = (int)strtol(argv[3 * NUM_ENCODERS + 6], NULL, 0);
/* Populate default encoder configuration */
for (i = 0; i < NUM_ENCODERS; i++) {
@@ -467,12 +468,12 @@ int main(int argc, char **argv) {
/* Allocate image for each encoder */
for (i = 0; i < NUM_ENCODERS; i++)
if (!vpx_img_alloc(&raw[i], VPX_IMG_FMT_I420, cfg[i].g_w, cfg[i].g_h, 32))
die("Failed to allocate image", cfg[i].g_w, cfg[i].g_h);
die("Failed to allocate image (%dx%d)", cfg[i].g_w, cfg[i].g_h);
if (raw[0].stride[VPX_PLANE_Y] == raw[0].d_w)
read_frame_p = read_frame;
if (raw[0].stride[VPX_PLANE_Y] == (int)raw[0].d_w)
read_frame_p = mulres_read_frame;
else
read_frame_p = read_frame_by_row;
read_frame_p = mulres_read_frame_by_row;
for (i = 0; i < NUM_ENCODERS; i++)
if (outfile[i]) write_ivf_file_header(outfile[i], &cfg[i], 0);
@@ -558,7 +559,8 @@ int main(int argc, char **argv) {
/* Write out down-sampled input. */
length_frame = cfg[i].g_w * cfg[i].g_h * 3 / 2;
if (fwrite(raw[i].planes[0], 1, length_frame,
downsampled_input[NUM_ENCODERS - i - 1]) != length_frame) {
downsampled_input[NUM_ENCODERS - i - 1]) !=
(unsigned int)length_frame) {
return EXIT_FAILURE;
}
}
@@ -619,10 +621,6 @@ int main(int argc, char **argv) {
break;
default: break;
}
printf(pkt[i]->kind == VPX_CODEC_CX_FRAME_PKT &&
(pkt[i]->data.frame.flags & VPX_FRAME_IS_KEY)
? "K"
: "");
fflush(stdout);
}
}
@@ -663,7 +661,6 @@ int main(int argc, char **argv) {
write_ivf_file_header(outfile[i], &cfg[i], frame_cnt - 1);
fclose(outfile[i]);
}
printf("\n");
return EXIT_SUCCESS;
}
+1 -1
@@ -155,7 +155,7 @@ int main(int argc, char **argv) {
die("Failed to open %s for reading.", argv[3]);
if (vpx_codec_enc_init(&codec, encoder->codec_interface(), &cfg, 0))
die_codec(&codec, "Failed to initialize encoder");
die("Failed to initialize encoder");
// Encode frames.
while (vpx_img_read(&raw, infile)) {
+1 -1
@@ -110,7 +110,7 @@ int main(int argc, char **argv) {
die("Failed to open %s for reading.", argv[3]);
if (vpx_codec_enc_init(&codec, encoder->codec_interface(), &cfg, 0))
die_codec(&codec, "Failed to initialize encoder");
die("Failed to initialize encoder");
if (vpx_codec_control_(&codec, VP9E_SET_LOSSLESS, 1))
die_codec(&codec, "Failed to use lossless mode");
File diff suppressed because it is too large
+2 -124
@@ -60,7 +60,7 @@
static const char *exec_name;
void usage_exit() {
void usage_exit(void) {
fprintf(stderr,
"Usage: %s <width> <height> <infile> <outfile> "
"<frame> <limit(optional)>\n",
@@ -68,128 +68,6 @@ void usage_exit() {
exit(EXIT_FAILURE);
}
static int compare_img(const vpx_image_t *const img1,
const vpx_image_t *const img2) {
uint32_t l_w = img1->d_w;
uint32_t c_w = (img1->d_w + img1->x_chroma_shift) >> img1->x_chroma_shift;
const uint32_t c_h =
(img1->d_h + img1->y_chroma_shift) >> img1->y_chroma_shift;
uint32_t i;
int match = 1;
match &= (img1->fmt == img2->fmt);
match &= (img1->d_w == img2->d_w);
match &= (img1->d_h == img2->d_h);
for (i = 0; i < img1->d_h; ++i)
match &= (memcmp(img1->planes[VPX_PLANE_Y] + i * img1->stride[VPX_PLANE_Y],
img2->planes[VPX_PLANE_Y] + i * img2->stride[VPX_PLANE_Y],
l_w) == 0);
for (i = 0; i < c_h; ++i)
match &= (memcmp(img1->planes[VPX_PLANE_U] + i * img1->stride[VPX_PLANE_U],
img2->planes[VPX_PLANE_U] + i * img2->stride[VPX_PLANE_U],
c_w) == 0);
for (i = 0; i < c_h; ++i)
match &= (memcmp(img1->planes[VPX_PLANE_V] + i * img1->stride[VPX_PLANE_V],
img2->planes[VPX_PLANE_V] + i * img2->stride[VPX_PLANE_V],
c_w) == 0);
return match;
}
#define mmin(a, b) ((a) < (b) ? (a) : (b))
static void find_mismatch(const vpx_image_t *const img1,
const vpx_image_t *const img2, int yloc[4],
int uloc[4], int vloc[4]) {
const uint32_t bsize = 64;
const uint32_t bsizey = bsize >> img1->y_chroma_shift;
const uint32_t bsizex = bsize >> img1->x_chroma_shift;
const uint32_t c_w =
(img1->d_w + img1->x_chroma_shift) >> img1->x_chroma_shift;
const uint32_t c_h =
(img1->d_h + img1->y_chroma_shift) >> img1->y_chroma_shift;
int match = 1;
uint32_t i, j;
yloc[0] = yloc[1] = yloc[2] = yloc[3] = -1;
for (i = 0, match = 1; match && i < img1->d_h; i += bsize) {
for (j = 0; match && j < img1->d_w; j += bsize) {
int k, l;
const int si = mmin(i + bsize, img1->d_h) - i;
const int sj = mmin(j + bsize, img1->d_w) - j;
for (k = 0; match && k < si; ++k) {
for (l = 0; match && l < sj; ++l) {
if (*(img1->planes[VPX_PLANE_Y] +
(i + k) * img1->stride[VPX_PLANE_Y] + j + l) !=
*(img2->planes[VPX_PLANE_Y] +
(i + k) * img2->stride[VPX_PLANE_Y] + j + l)) {
yloc[0] = i + k;
yloc[1] = j + l;
yloc[2] = *(img1->planes[VPX_PLANE_Y] +
(i + k) * img1->stride[VPX_PLANE_Y] + j + l);
yloc[3] = *(img2->planes[VPX_PLANE_Y] +
(i + k) * img2->stride[VPX_PLANE_Y] + j + l);
match = 0;
break;
}
}
}
}
}
uloc[0] = uloc[1] = uloc[2] = uloc[3] = -1;
for (i = 0, match = 1; match && i < c_h; i += bsizey) {
for (j = 0; match && j < c_w; j += bsizex) {
int k, l;
const int si = mmin(i + bsizey, c_h - i);
const int sj = mmin(j + bsizex, c_w - j);
for (k = 0; match && k < si; ++k) {
for (l = 0; match && l < sj; ++l) {
if (*(img1->planes[VPX_PLANE_U] +
(i + k) * img1->stride[VPX_PLANE_U] + j + l) !=
*(img2->planes[VPX_PLANE_U] +
(i + k) * img2->stride[VPX_PLANE_U] + j + l)) {
uloc[0] = i + k;
uloc[1] = j + l;
uloc[2] = *(img1->planes[VPX_PLANE_U] +
(i + k) * img1->stride[VPX_PLANE_U] + j + l);
uloc[3] = *(img2->planes[VPX_PLANE_U] +
(i + k) * img2->stride[VPX_PLANE_U] + j + l);
match = 0;
break;
}
}
}
}
}
vloc[0] = vloc[1] = vloc[2] = vloc[3] = -1;
for (i = 0, match = 1; match && i < c_h; i += bsizey) {
for (j = 0; match && j < c_w; j += bsizex) {
int k, l;
const int si = mmin(i + bsizey, c_h - i);
const int sj = mmin(j + bsizex, c_w - j);
for (k = 0; match && k < si; ++k) {
for (l = 0; match && l < sj; ++l) {
if (*(img1->planes[VPX_PLANE_V] +
(i + k) * img1->stride[VPX_PLANE_V] + j + l) !=
*(img2->planes[VPX_PLANE_V] +
(i + k) * img2->stride[VPX_PLANE_V] + j + l)) {
vloc[0] = i + k;
vloc[1] = j + l;
vloc[2] = *(img1->planes[VPX_PLANE_V] +
(i + k) * img1->stride[VPX_PLANE_V] + j + l);
vloc[3] = *(img2->planes[VPX_PLANE_V] +
(i + k) * img2->stride[VPX_PLANE_V] + j + l);
match = 0;
break;
}
}
}
}
}
}
static void testing_decode(vpx_codec_ctx_t *encoder, vpx_codec_ctx_t *decoder,
unsigned int frame_out, int *mismatch_seen) {
vpx_image_t enc_img, dec_img;
@@ -373,7 +251,7 @@ int main(int argc, char **argv) {
die("Failed to open %s for reading.", infile_arg);
if (vpx_codec_enc_init(&ecodec, encoder->codec_interface(), &cfg, 0))
die_codec(&ecodec, "Failed to initialize encoder");
die("Failed to initialize encoder");
// Disable alt_ref.
if (vpx_codec_control(&ecodec, VP8E_SET_ENABLEAUTOALTREF, 0))
+130
@@ -0,0 +1,130 @@
/*
* Copyright (c) 2018 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
/*
* Fuzzer for libvpx decoders
* ==========================
* Requirements
* --------------
* Requires Clang 6.0 or above as -fsanitize=fuzzer is used as a linker
* option.
* Steps to build
* --------------
* Clone libvpx repository
$git clone https://chromium.googlesource.com/webm/libvpx
* Create a directory in parallel to libvpx and change directory
$mkdir vpx_dec_fuzzer
$cd vpx_dec_fuzzer/
* Enable sanitizers (Supported: address integer memory thread undefined)
$source ../libvpx/tools/set_analyzer_env.sh address
* Configure libvpx.
* Note --size-limit and VPX_MAX_ALLOCABLE_MEMORY are defined to avoid
* Out of memory errors when running generated fuzzer binary
$../libvpx/configure --disable-unit-tests --size-limit=12288x12288 \
--extra-cflags="-fsanitize=fuzzer-no-link \
-DVPX_MAX_ALLOCABLE_MEMORY=1073741824" \
--disable-webm-io --enable-debug --disable-vp8-encoder \
--disable-vp9-encoder --disable-examples
* Build libvpx
$make -j32
* Build vp9 fuzzer
$ $CXX $CXXFLAGS -std=gnu++11 -DDECODER=vp9 \
-fsanitize=fuzzer -I../libvpx -I. -Wl,--start-group \
../libvpx/examples/vpx_dec_fuzzer.cc -o ./vpx_dec_fuzzer_vp9 \
./libvpx.a -Wl,--end-group
* DECODER should be defined as vp9 or vp8 to enable vp9/vp8
*
* create a corpus directory and copy some ivf files there.
* Based on which codec (vp8/vp9) is being tested, it is recommended to
* have corresponding ivf files in corpus directory
* Empty corpus directory also is acceptable, though not recommended
$mkdir CORPUS && cp some-files CORPUS
* Run fuzzing:
$./vpx_dec_fuzzer_vp9 CORPUS
* References:
* http://llvm.org/docs/LibFuzzer.html
* https://github.com/google/oss-fuzz
*/
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <algorithm>
#include <memory>
#include "vpx/vp8dx.h"
#include "vpx/vpx_decoder.h"
#include "vpx_ports/mem_ops.h"
#define IVF_FRAME_HDR_SZ (4 + 8) /* 4 byte size + 8 byte timestamp */
#define IVF_FILE_HDR_SZ 32
#define VPXD_INTERFACE(name) VPXD_INTERFACE_(name)
#define VPXD_INTERFACE_(name) vpx_codec_##name##_dx()
extern "C" void usage_exit(void) { exit(EXIT_FAILURE); }
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
if (size <= IVF_FILE_HDR_SZ) {
return 0;
}
vpx_codec_ctx_t codec;
// Set thread count in the range [1, 64].
const unsigned int threads = (data[IVF_FILE_HDR_SZ] & 0x3f) + 1;
vpx_codec_dec_cfg_t cfg = { threads, 0, 0 };
if (vpx_codec_dec_init(&codec, VPXD_INTERFACE(DECODER), &cfg, 0)) {
return 0;
}
if (threads > 1) {
const int enable = (data[IVF_FILE_HDR_SZ] & 0xa0) != 0;
const vpx_codec_err_t err =
vpx_codec_control(&codec, VP9D_SET_LOOP_FILTER_OPT, enable);
static_cast<void>(err);
}
data += IVF_FILE_HDR_SZ;
size -= IVF_FILE_HDR_SZ;
while (size > IVF_FRAME_HDR_SZ) {
size_t frame_size = mem_get_le32(data);
size -= IVF_FRAME_HDR_SZ;
data += IVF_FRAME_HDR_SZ;
frame_size = std::min(size, frame_size);
vpx_codec_stream_info_t stream_info;
stream_info.sz = sizeof(stream_info);
vpx_codec_err_t err = vpx_codec_peek_stream_info(VPXD_INTERFACE(DECODER),
data, size, &stream_info);
static_cast<void>(err);
err = vpx_codec_decode(&codec, data, frame_size, nullptr, 0);
static_cast<void>(err);
vpx_codec_iter_t iter = nullptr;
vpx_image_t *img = nullptr;
while ((img = vpx_codec_get_frame(&codec, &iter)) != nullptr) {
}
data += frame_size;
size -= frame_size;
}
vpx_codec_destroy(&codec);
return 0;
}
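The fuzzer above treats its input as an IVF stream: it skips the 32-byte file header, then reads a 12-byte frame header (4-byte little-endian size plus 8-byte timestamp) before each frame and clamps the declared frame size to the bytes that remain. The following is a minimal standalone sketch of that walk, not part of the commit. `read_le32` and `count_ivf_frames` are hypothetical names introduced here for illustration; the fuzzer itself uses `mem_get_le32` from `vpx_ports/mem_ops.h` and hands each frame to `vpx_codec_decode`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IVF_FILE_HDR_SZ 32
#define IVF_FRAME_HDR_SZ (4 + 8) /* 4 byte size + 8 byte timestamp */

/* Hypothetical stand-in for mem_get_le32(): read a 32-bit little-endian value. */
static uint32_t read_le32(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) |
         ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: counts the frames the fuzzer loop would visit.
 * Each declared frame size is clamped to the remaining byte count, so a
 * corrupt size field can never move the cursor past the end of the buffer. */
static int count_ivf_frames(const uint8_t *data, size_t size) {
  int frames = 0;
  assert(data != NULL);
  if (size <= IVF_FILE_HDR_SZ) return 0;
  data += IVF_FILE_HDR_SZ;
  size -= IVF_FILE_HDR_SZ;
  while (size > IVF_FRAME_HDR_SZ) {
    size_t frame_size = read_le32(data);
    data += IVF_FRAME_HDR_SZ;
    size -= IVF_FRAME_HDR_SZ;
    if (frame_size > size) frame_size = size; /* same role as std::min */
    ++frames;
    data += frame_size;
    size -= frame_size;
  }
  return frames;
}
```

The clamp is what lets the loop terminate safely on arbitrary fuzzer input: the final frame may be truncated, but the walk never reads out of bounds.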
+278 -51
@@ -19,24 +19,38 @@
#include <string.h>
#include "./vpx_config.h"
#include "./y4minput.h"
#include "../vpx_ports/vpx_timer.h"
#include "vpx/vp8cx.h"
#include "vpx/vpx_encoder.h"
#include "vpx_ports/bitops.h"
#include "../tools_common.h"
#include "../video_writer.h"
#define ROI_MAP 0
#define zero(Dest) memset(&(Dest), 0, sizeof(Dest))
static const char *exec_name;
void usage_exit(void) { exit(EXIT_FAILURE); }
// Denoiser states, for temporal denoising.
enum denoiserState {
kDenoiserOff,
kDenoiserOnYOnly,
kDenoiserOnYUV,
kDenoiserOnYUVAggressive,
kDenoiserOnAdaptive
// Denoiser states for vp8, for temporal denoising.
enum denoiserStateVp8 {
kVp8DenoiserOff,
kVp8DenoiserOnYOnly,
kVp8DenoiserOnYUV,
kVp8DenoiserOnYUVAggressive,
kVp8DenoiserOnAdaptive
};
// Denoiser states for vp9, for temporal denoising.
enum denoiserStateVp9 {
kVp9DenoiserOff,
kVp9DenoiserOnYOnly,
// For SVC: denoise the top two spatial layers.
kVp9DenoiserOnYTwoSpatialLayers
};
static int mode_to_num_layers[13] = { 1, 2, 2, 3, 3, 3, 3, 5, 2, 3, 3, 3, 3 };
@@ -79,19 +93,21 @@ struct RateControlMetrics {
// in the stream.
static void set_rate_control_metrics(struct RateControlMetrics *rc,
vpx_codec_enc_cfg_t *cfg) {
unsigned int i = 0;
int i = 0;
// Set the layer (cumulative) framerate and the target layer (non-cumulative)
// per-frame-bandwidth, for the rate control encoding stats below.
const double framerate = cfg->g_timebase.den / cfg->g_timebase.num;
const int ts_number_layers = cfg->ts_number_layers;
rc->layer_framerate[0] = framerate / cfg->ts_rate_decimator[0];
rc->layer_pfb[0] =
1000.0 * rc->layer_target_bitrate[0] / rc->layer_framerate[0];
for (i = 0; i < cfg->ts_number_layers; ++i) {
for (i = 0; i < ts_number_layers; ++i) {
if (i > 0) {
rc->layer_framerate[i] = framerate / cfg->ts_rate_decimator[i];
rc->layer_pfb[i] = 1000.0 * (rc->layer_target_bitrate[i] -
rc->layer_target_bitrate[i - 1]) /
(rc->layer_framerate[i] - rc->layer_framerate[i - 1]);
rc->layer_pfb[i] =
1000.0 *
(rc->layer_target_bitrate[i] - rc->layer_target_bitrate[i - 1]) /
(rc->layer_framerate[i] - rc->layer_framerate[i - 1]);
}
rc->layer_input_frames[i] = 0;
rc->layer_enc_frames[i] = 0;
@@ -104,6 +120,9 @@ static void set_rate_control_metrics(struct RateControlMetrics *rc,
rc->window_size = 15;
rc->avg_st_encoding_bitrate = 0.0;
rc->variance_st_encoding_bitrate = 0.0;
// Target bandwidth for the whole stream.
// Set to layer_target_bitrate for highest layer (total bitrate).
cfg->rc_target_bitrate = rc->layer_target_bitrate[ts_number_layers - 1];
}
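The arithmetic in `set_rate_control_metrics` can be checked in isolation: the layer framerate is cumulative (base framerate divided by that layer's rate decimator), while the per-frame bandwidth is non-cumulative (the bitrate added by a layer divided by the framerate added by that layer). A minimal sketch under those assumptions follows. `layer_per_frame_bandwidth` is a hypothetical helper, not libvpx API, and it assumes bitrates are given in kilobits per second, which is what the `1000.0` factor in the original suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the layer_pfb computation above.
 * target_kbps[i]    : cumulative target bitrate up to layer i (assumed kbps)
 * rate_decimator[i] : framerate divisor for layer i
 * pfb_bits[i]       : output, bits available per frame added by layer i */
static void layer_per_frame_bandwidth(const int *target_kbps,
                                      const int *rate_decimator,
                                      int num_layers, double framerate,
                                      double *pfb_bits) {
  double prev_fps = 0.0;
  int prev_kbps = 0;
  int i;
  assert(target_kbps != NULL && rate_decimator != NULL && pfb_bits != NULL);
  for (i = 0; i < num_layers; ++i) {
    const double fps = framerate / rate_decimator[i];
    /* Each layer must add frames, otherwise the division is undefined. */
    assert(fps > prev_fps);
    /* For i == 0 this reduces to 1000 * bitrate / framerate. */
    pfb_bits[i] = 1000.0 * (target_kbps[i] - prev_kbps) / (fps - prev_fps);
    prev_fps = fps;
    prev_kbps = target_kbps[i];
  }
}
```

For example, with a 30 fps source, decimators `{4, 2, 1}` and cumulative bitrates `{150, 300, 600}`, every layer ends up with the same 20000-bit budget per frame, because each layer doubles both the framerate and the bitrate.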
static void printout_rate_control_summary(struct RateControlMetrics *rc,
@@ -154,6 +173,107 @@ static void printout_rate_control_summary(struct RateControlMetrics *rc,
die("Error: Number of input frames not equal to output! \n");
}
#if ROI_MAP
static void set_roi_map(const char *enc_name, vpx_codec_enc_cfg_t *cfg,
vpx_roi_map_t *roi) {
unsigned int i, j;
int block_size = 0;
uint8_t is_vp8 = strncmp(enc_name, "vp8", 3) == 0 ? 1 : 0;
uint8_t is_vp9 = strncmp(enc_name, "vp9", 3) == 0 ? 1 : 0;
if (!is_vp8 && !is_vp9) {
die("unsupported codec.");
}
zero(*roi);
block_size = is_vp9 && !is_vp8 ? 8 : 16;
// ROI is based on the segments (4 for vp8, 8 for vp9), smallest unit for
// segment is 16x16 for vp8, 8x8 for vp9.
roi->rows = (cfg->g_h + block_size - 1) / block_size;
roi->cols = (cfg->g_w + block_size - 1) / block_size;
// Applies delta QP on the segment blocks, varies from -63 to 63.
// Setting to negative means lower QP (better quality).
// Below we set delta_q to the extreme (-63) to show strong effect.
// VP8 uses the first 4 segments. VP9 uses all 8 segments.
zero(roi->delta_q);
roi->delta_q[1] = -63;
// Applies delta loopfilter strength on the segment blocks, varies from -63 to
// 63. Setting to positive means stronger loopfilter. VP8 uses the first 4
// segments. VP9 uses all 8 segments.
zero(roi->delta_lf);
if (is_vp8) {
// Applies skip encoding threshold on the segment blocks, varies from 0 to
// UINT_MAX. Larger value means more skipping of encoding is possible.
// This skip threshold only applies on delta frames.
zero(roi->static_threshold);
}
if (is_vp9) {
// Apply skip segment. Setting to 1 means this block will be copied from
// previous frame.
zero(roi->skip);
}
if (is_vp9) {
// Apply ref frame segment.
// -1 : Do not apply this segment.
// 0 : Force using intra.
// 1 : Force using last.
// 2 : Force using golden.
// 3 : Force using altref but not used in non-rd pickmode for 0 lag.
memset(roi->ref_frame, -1, sizeof(roi->ref_frame));
roi->ref_frame[1] = 1;
}
// Use 2 states: 1 is center square, 0 is the rest.
roi->roi_map =
(uint8_t *)calloc(roi->rows * roi->cols, sizeof(*roi->roi_map));
for (i = 0; i < roi->rows; ++i) {
for (j = 0; j < roi->cols; ++j) {
if (i > (roi->rows >> 2) && i < ((roi->rows * 3) >> 2) &&
j > (roi->cols >> 2) && j < ((roi->cols * 3) >> 2)) {
roi->roi_map[i * roi->cols + j] = 1;
}
}
}
}
static void set_roi_skip_map(vpx_codec_enc_cfg_t *cfg, vpx_roi_map_t *roi,
int *skip_map, int *prev_mask_map, int frame_num) {
const int block_size = 8;
unsigned int i, j;
roi->rows = (cfg->g_h + block_size - 1) / block_size;
roi->cols = (cfg->g_w + block_size - 1) / block_size;
zero(roi->skip);
zero(roi->delta_q);
zero(roi->delta_lf);
memset(roi->ref_frame, -1, sizeof(roi->ref_frame));
roi->ref_frame[1] = 1;
// Use segment 3 for skip.
roi->skip[3] = 1;
roi->roi_map =
(uint8_t *)calloc(roi->rows * roi->cols, sizeof(*roi->roi_map));
for (i = 0; i < roi->rows; ++i) {
for (j = 0; j < roi->cols; ++j) {
const int idx = i * roi->cols + j;
// Use segment 3 for skip.
// prev_mask_map keeps track of blocks that have been stably on segment 3
// for the past 10 frames. Only skip when the block is on segment 3 in
// both current map and prev_mask_map.
if (skip_map[idx] == 1 && prev_mask_map[idx] == 1) roi->roi_map[idx] = 3;
// Reset it every 10 frames so it doesn't propagate for too many frames.
if (frame_num % 10 == 0)
prev_mask_map[idx] = skip_map[idx];
else if (prev_mask_map[idx] == 1 && skip_map[idx] == 0)
prev_mask_map[idx] = 0;
}
}
}
#endif
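The ROI map built in `set_roi_map` has one entry per block, with dimensions rounded up so that partial blocks at the right and bottom edges are covered, and it marks a centered square as segment 1. A minimal standalone sketch of just that layout logic follows, without any libvpx types. `make_center_roi_map` is a hypothetical name used only for illustration; unlike the original, it also checks the `calloc` result.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates a rows x cols segment map where blocks
 * strictly inside the middle half of the frame are segment 1 and all
 * others are segment 0. The caller owns the returned buffer. */
static uint8_t *make_center_roi_map(unsigned int width, unsigned int height,
                                    unsigned int block_size,
                                    unsigned int *rows, unsigned int *cols) {
  unsigned int i, j;
  uint8_t *map;
  assert(block_size > 0 && rows != NULL && cols != NULL);
  /* Round up so a partial block at the edge still gets a map entry. */
  *rows = (height + block_size - 1) / block_size;
  *cols = (width + block_size - 1) / block_size;
  map = (uint8_t *)calloc((size_t)*rows * *cols, sizeof(*map));
  if (map == NULL) return NULL;
  for (i = 0; i < *rows; ++i) {
    for (j = 0; j < *cols; ++j) {
      if (i > (*rows >> 2) && i < ((*rows * 3) >> 2) &&
          j > (*cols >> 2) && j < ((*cols * 3) >> 2)) {
        map[i * *cols + j] = 1;
      }
    }
  }
  return map;
}
```

With a 64x64 frame and 8x8 blocks (the VP9 block size in the original), the map is 8x8 and the strict inequalities select rows and columns 3 through 5, so nine blocks land in segment 1.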
// Temporal scaling parameters:
// NOTE: The 3 prediction frames cannot be used interchangeably due to
// differences in the way they are handled throughout the code. The
@@ -486,6 +606,23 @@ static void set_temporal_layer_pattern(int layering_mode,
}
}
#if ROI_MAP
static void read_mask(FILE *mask_file, int *seg_map) {
int mask_rows, mask_cols, i, j;
int *map_start = seg_map;
fscanf(mask_file, "%d %d\n", &mask_cols, &mask_rows);
for (i = 0; i < mask_rows; i++) {
for (j = 0; j < mask_cols; j++) {
fscanf(mask_file, "%d ", &seg_map[j]);
// reverse the bit
seg_map[j] = 1 - seg_map[j];
}
seg_map += mask_cols;
}
seg_map = map_start;
}
#endif
int main(int argc, char **argv) {
VpxVideoWriter *outfile[VPX_TS_MAX_LAYERS] = { NULL };
vpx_codec_ctx_t codec;
@@ -495,6 +632,7 @@ int main(int argc, char **argv) {
vpx_codec_err_t res;
unsigned int width;
unsigned int height;
uint32_t error_resilient = 0;
int speed;
int frame_avail;
int got_data;
@@ -505,16 +643,15 @@ int main(int argc, char **argv) {
int layering_mode = 0;
int layer_flags[VPX_TS_MAX_PERIODICITY] = { 0 };
int flag_periodicity = 1;
#if VPX_ENCODER_ABI_VERSION > (4 + VPX_CODEC_ABI_VERSION)
vpx_svc_layer_id_t layer_id = { 0, 0 };
#else
vpx_svc_layer_id_t layer_id = { 0 };
#if ROI_MAP
vpx_roi_map_t roi;
#endif
vpx_svc_layer_id_t layer_id;
const VpxInterface *encoder = NULL;
FILE *infile = NULL;
struct VpxInputContext input_ctx;
struct RateControlMetrics rc;
int64_t cx_time = 0;
const int min_args_base = 12;
const int min_args_base = 13;
#if CONFIG_VP9_HIGHBITDEPTH
vpx_bit_depth_t bit_depth = VPX_BITS_8;
int input_bit_depth = 8;
@@ -525,18 +662,36 @@ int main(int argc, char **argv) {
double sum_bitrate = 0.0;
double sum_bitrate2 = 0.0;
double framerate = 30.0;
#if ROI_MAP
FILE *mask_file = NULL;
int block_size = 8;
int mask_rows = 0;
int mask_cols = 0;
int *mask_map;
int *prev_mask_map;
#endif
zero(rc.layer_target_bitrate);
memset(&layer_id, 0, sizeof(vpx_svc_layer_id_t));
memset(&input_ctx, 0, sizeof(input_ctx));
/* Setup default input stream settings */
input_ctx.framerate.numerator = 30;
input_ctx.framerate.denominator = 1;
input_ctx.only_i420 = 1;
input_ctx.bit_depth = 0;
exec_name = argv[0];
// Check usage and arguments.
if (argc < min_args) {
#if CONFIG_VP9_HIGHBITDEPTH
die("Usage: %s <infile> <outfile> <codec_type(vp8/vp9)> <width> <height> "
"<rate_num> <rate_den> <speed> <frame_drop_threshold> <threads> <mode> "
"<rate_num> <rate_den> <speed> <frame_drop_threshold> "
"<error_resilient> <threads> <mode> "
"<Rate_0> ... <Rate_nlayers-1> <bit-depth> \n",
argv[0]);
#else
die("Usage: %s <infile> <outfile> <codec_type(vp8/vp9)> <width> <height> "
"<rate_num> <rate_den> <speed> <frame_drop_threshold> <threads> <mode> "
"<rate_num> <rate_den> <speed> <frame_drop_threshold> "
"<error_resilient> <threads> <mode> "
"<Rate_0> ... <Rate_nlayers-1> \n",
argv[0]);
#endif // CONFIG_VP9_HIGHBITDEPTH
@@ -553,14 +708,23 @@ int main(int argc, char **argv) {
die("Invalid resolution: %d x %d", width, height);
}
layering_mode = (int)strtol(argv[11], NULL, 0);
layering_mode = (int)strtol(argv[12], NULL, 0);
if (layering_mode < 0 || layering_mode > 13) {
die("Invalid layering mode (0..12) %s", argv[11]);
die("Invalid layering mode (0..12) %s", argv[12]);
}
#if ROI_MAP
if (argc != min_args + mode_to_num_layers[layering_mode] + 1) {
die("Invalid number of arguments");
}
#else
if (argc != min_args + mode_to_num_layers[layering_mode]) {
die("Invalid number of arguments");
}
#endif
input_ctx.filename = argv[1];
open_input_file(&input_ctx);
#if CONFIG_VP9_HIGHBITDEPTH
switch (strtol(argv[argc - 1], NULL, 0)) {
@@ -578,14 +742,22 @@ int main(int argc, char **argv) {
break;
default: die("Invalid bit depth (8, 10, 12) %s", argv[argc - 1]);
}
if (!vpx_img_alloc(
&raw, bit_depth == VPX_BITS_8 ? VPX_IMG_FMT_I420 : VPX_IMG_FMT_I42016,
width, height, 32)) {
die("Failed to allocate image", width, height);
// Y4M reader has its own allocation.
if (input_ctx.file_type != FILE_TYPE_Y4M) {
if (!vpx_img_alloc(
&raw,
bit_depth == VPX_BITS_8 ? VPX_IMG_FMT_I420 : VPX_IMG_FMT_I42016,
width, height, 32)) {
die("Failed to allocate image (%dx%d)", width, height);
}
}
#else
if (!vpx_img_alloc(&raw, VPX_IMG_FMT_I420, width, height, 32)) {
die("Failed to allocate image", width, height);
// Y4M reader has its own allocation.
if (input_ctx.file_type != FILE_TYPE_Y4M) {
if (!vpx_img_alloc(&raw, VPX_IMG_FMT_I420, width, height, 32)) {
die("Failed to allocate image (%dx%d)", width, height);
}
}
#endif // CONFIG_VP9_HIGHBITDEPTH
@@ -616,14 +788,17 @@ int main(int argc, char **argv) {
if (speed < 0) {
die("Invalid speed setting: must be positive");
}
if (strncmp(encoder->name, "vp9", 3) == 0 && speed > 9) {
warn("Mapping speed %d to speed 9.\n", speed);
}
for (i = min_args_base;
(int)i < min_args_base + mode_to_num_layers[layering_mode]; ++i) {
rc.layer_target_bitrate[i - 12] = (int)strtol(argv[i], NULL, 0);
rc.layer_target_bitrate[i - 13] = (int)strtol(argv[i], NULL, 0);
if (strncmp(encoder->name, "vp8", 3) == 0)
cfg.ts_target_bitrate[i - 12] = rc.layer_target_bitrate[i - 12];
cfg.ts_target_bitrate[i - 13] = rc.layer_target_bitrate[i - 13];
else if (strncmp(encoder->name, "vp9", 3) == 0)
cfg.layer_target_bitrate[i - 12] = rc.layer_target_bitrate[i - 12];
cfg.layer_target_bitrate[i - 13] = rc.layer_target_bitrate[i - 13];
}
// Real time parameters.
@@ -634,7 +809,7 @@ int main(int argc, char **argv) {
if (strncmp(encoder->name, "vp9", 3) == 0) cfg.rc_max_quantizer = 52;
cfg.rc_undershoot_pct = 50;
cfg.rc_overshoot_pct = 50;
cfg.rc_buf_initial_sz = 500;
cfg.rc_buf_initial_sz = 600;
cfg.rc_buf_optimal_sz = 600;
cfg.rc_buf_sz = 1000;
@@ -642,10 +817,14 @@ int main(int argc, char **argv) {
cfg.rc_resize_allowed = 0;
// Use 1 thread as default.
cfg.g_threads = (unsigned int)strtoul(argv[10], NULL, 0);
cfg.g_threads = (unsigned int)strtoul(argv[11], NULL, 0);
error_resilient = (uint32_t)strtoul(argv[10], NULL, 0);
if (error_resilient != 0 && error_resilient != 1) {
die("Invalid value for error resilient (0, 1): %d.", error_resilient);
}
// Enable error resilient mode.
cfg.g_error_resilient = 1;
cfg.g_error_resilient = error_resilient;
cfg.g_lag_in_frames = 0;
cfg.kf_mode = VPX_KF_AUTO;
@@ -659,13 +838,15 @@ int main(int argc, char **argv) {
set_rate_control_metrics(&rc, &cfg);
// Target bandwidth for the whole stream.
// Set to layer_target_bitrate for highest layer (total bitrate).
cfg.rc_target_bitrate = rc.layer_target_bitrate[cfg.ts_number_layers - 1];
// Open input file.
if (!(infile = fopen(argv[1], "rb"))) {
die("Failed to open %s for reading", argv[1]);
if (input_ctx.file_type == FILE_TYPE_Y4M) {
if (input_ctx.width != cfg.g_w || input_ctx.height != cfg.g_h) {
die("Incorrect width or height: %d x %d", cfg.g_w, cfg.g_h);
}
if (input_ctx.framerate.numerator != cfg.g_timebase.den ||
input_ctx.framerate.denominator != cfg.g_timebase.num) {
die("Incorrect framerate: numerator %d denominator %d",
cfg.g_timebase.num, cfg.g_timebase.den);
}
}
framerate = cfg.g_timebase.den / cfg.g_timebase.num;
@@ -696,25 +877,45 @@ int main(int argc, char **argv) {
#else
if (vpx_codec_enc_init(&codec, encoder->codec_interface(), &cfg, 0))
#endif // CONFIG_VP9_HIGHBITDEPTH
die_codec(&codec, "Failed to initialize encoder");
die("Failed to initialize encoder");
#if ROI_MAP
mask_rows = (cfg.g_h + block_size - 1) / block_size;
mask_cols = (cfg.g_w + block_size - 1) / block_size;
mask_map = (int *)calloc(mask_rows * mask_cols, sizeof(*mask_map));
prev_mask_map = (int *)calloc(mask_rows * mask_cols, sizeof(*mask_map));
#endif
if (strncmp(encoder->name, "vp8", 3) == 0) {
vpx_codec_control(&codec, VP8E_SET_CPUUSED, -speed);
vpx_codec_control(&codec, VP8E_SET_NOISE_SENSITIVITY, kDenoiserOff);
vpx_codec_control(&codec, VP8E_SET_NOISE_SENSITIVITY, kVp8DenoiserOff);
vpx_codec_control(&codec, VP8E_SET_STATIC_THRESHOLD, 1);
vpx_codec_control(&codec, VP8E_SET_GF_CBR_BOOST_PCT, 0);
#if ROI_MAP
set_roi_map(encoder->name, &cfg, &roi);
if (vpx_codec_control(&codec, VP8E_SET_ROI_MAP, &roi))
die_codec(&codec, "Failed to set ROI map");
#endif
} else if (strncmp(encoder->name, "vp9", 3) == 0) {
vpx_svc_extra_cfg_t svc_params;
memset(&svc_params, 0, sizeof(svc_params));
vpx_codec_control(&codec, VP9E_SET_POSTENCODE_DROP, 0);
vpx_codec_control(&codec, VP9E_SET_DISABLE_OVERSHOOT_MAXQ_CBR, 0);
vpx_codec_control(&codec, VP8E_SET_CPUUSED, speed);
vpx_codec_control(&codec, VP9E_SET_AQ_MODE, 3);
vpx_codec_control(&codec, VP9E_SET_GF_CBR_BOOST_PCT, 0);
vpx_codec_control(&codec, VP9E_SET_FRAME_PARALLEL_DECODING, 0);
vpx_codec_control(&codec, VP9E_SET_FRAME_PERIODIC_BOOST, 0);
vpx_codec_control(&codec, VP9E_SET_NOISE_SENSITIVITY, kDenoiserOff);
vpx_codec_control(&codec, VP9E_SET_NOISE_SENSITIVITY, kVp9DenoiserOff);
vpx_codec_control(&codec, VP8E_SET_STATIC_THRESHOLD, 1);
vpx_codec_control(&codec, VP9E_SET_TUNE_CONTENT, 0);
vpx_codec_control(&codec, VP9E_SET_TILE_COLUMNS, (cfg.g_threads >> 1));
vpx_codec_control(&codec, VP9E_SET_TILE_COLUMNS, get_msb(cfg.g_threads));
vpx_codec_control(&codec, VP9E_SET_DISABLE_LOOPFILTER, 0);
if (cfg.g_threads > 1)
vpx_codec_control(&codec, VP9E_SET_ROW_MT, 1);
else
vpx_codec_control(&codec, VP9E_SET_ROW_MT, 0);
if (vpx_codec_control(&codec, VP9E_SET_SVC, layering_mode > 0 ? 1 : 0))
die_codec(&codec, "Failed to set SVC");
for (i = 0; i < cfg.ts_number_layers; ++i) {
@@ -733,7 +934,7 @@ int main(int argc, char **argv) {
// For generating smaller key frames, use a smaller max_intra_size_pct
// value, like 100 or 200.
{
const int max_intra_size_pct = 900;
const int max_intra_size_pct = 1000;
vpx_codec_control(&codec, VP8E_SET_MAX_INTRA_BITRATE_PCT,
max_intra_size_pct);
}
@@ -743,12 +944,14 @@ int main(int argc, char **argv) {
struct vpx_usec_timer timer;
vpx_codec_iter_t iter = NULL;
const vpx_codec_cx_pkt_t *pkt;
#if VPX_ENCODER_ABI_VERSION > (4 + VPX_CODEC_ABI_VERSION)
#if ROI_MAP
char mask_file_name[255];
#endif
// Update the temporal layer_id. No spatial layers in this test.
layer_id.spatial_layer_id = 0;
#endif
layer_id.temporal_layer_id =
cfg.ts_layer_id[frame_cnt % cfg.ts_periodicity];
layer_id.temporal_layer_id_per_spatial[0] = layer_id.temporal_layer_id;
if (strncmp(encoder->name, "vp9", 3) == 0) {
vpx_codec_control(&codec, VP9E_SET_SVC_LAYER_ID, &layer_id);
} else if (strncmp(encoder->name, "vp8", 3) == 0) {
@@ -757,7 +960,20 @@ int main(int argc, char **argv) {
}
flags = layer_flags[frame_cnt % flag_periodicity];
if (layering_mode == 0) flags = 0;
frame_avail = vpx_img_read(&raw, infile);
#if ROI_MAP
snprintf(mask_file_name, sizeof(mask_file_name), "%s%05d.txt",
argv[argc - 1], frame_cnt);
mask_file = fopen(mask_file_name, "r");
if (mask_file != NULL) {
read_mask(mask_file, mask_map);
fclose(mask_file);
// set_roi_map(encoder->name, &cfg, &roi);
set_roi_skip_map(&cfg, &roi, mask_map, prev_mask_map, frame_cnt);
if (vpx_codec_control(&codec, VP9E_SET_ROI_MAP, &roi))
die_codec(&codec, "Failed to set ROI map");
}
#endif
frame_avail = read_frame(&input_ctx, &raw);
if (frame_avail) ++rc.layer_input_frames[layer_id.temporal_layer_id];
vpx_usec_timer_start(&timer);
if (vpx_codec_encode(&codec, frame_avail ? &raw : NULL, pts, 1, flags,
@@ -794,6 +1010,7 @@ int main(int argc, char **argv) {
// Update for short-time encoding bitrate states, for moving window
// of size rc->window, shifted by rc->window / 2.
// Ignore first window segment, due to key frame.
if (rc.window_size == 0) rc.window_size = 15;
if (frame_cnt > rc.window_size) {
sum_bitrate += 0.001 * 8.0 * pkt->data.frame.sz * framerate;
if (frame_cnt % rc.window_size == 0) {
@@ -825,7 +1042,11 @@ int main(int argc, char **argv) {
++frame_cnt;
pts += frame_duration;
}
fclose(infile);
#if ROI_MAP
free(mask_map);
free(prev_mask_map);
#endif
close_input_file(&input_ctx);
printout_rate_control_summary(&rc, &cfg, frame_cnt);
printf("\n");
printf("Frame cnt and encoding time/FPS stats for encoding: %d %f %f \n",
@@ -837,6 +1058,12 @@ int main(int argc, char **argv) {
// Try to rewrite the output file headers with the actual frame count.
for (i = 0; i < cfg.ts_number_layers; ++i) vpx_video_writer_close(outfile[i]);
vpx_img_free(&raw);
if (input_ctx.file_type != FILE_TYPE_Y4M) {
vpx_img_free(&raw);
}
#if ROI_MAP
free(roi.roi_map);
#endif
return EXIT_SUCCESS;
}
+4 -4
@@ -76,12 +76,12 @@ int ivf_read_frame(FILE *infile, uint8_t **buffer, size_t *bytes_read,
size_t frame_size = 0;
if (fread(raw_header, IVF_FRAME_HDR_SZ, 1, infile) != 1) {
if (!feof(infile)) warn("Failed to read frame size\n");
if (!feof(infile)) warn("Failed to read frame size");
} else {
frame_size = mem_get_le32(raw_header);
if (frame_size > 256 * 1024 * 1024) {
warn("Read invalid frame size (%u)\n", (unsigned int)frame_size);
warn("Read invalid frame size (%u)", (unsigned int)frame_size);
frame_size = 0;
}
@@ -92,7 +92,7 @@ int ivf_read_frame(FILE *infile, uint8_t **buffer, size_t *bytes_read,
*buffer = new_buffer;
*buffer_size = 2 * frame_size;
} else {
warn("Failed to allocate compressed data buffer\n");
warn("Failed to allocate compressed data buffer");
frame_size = 0;
}
}
@@ -100,7 +100,7 @@ int ivf_read_frame(FILE *infile, uint8_t **buffer, size_t *bytes_read,
if (!feof(infile)) {
if (fread(*buffer, 1, frame_size, infile) != frame_size) {
warn("Failed to read full frame\n");
warn("Failed to read full frame");
return 1;
}
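For readers unfamiliar with the frame-size check in `ivf_read_frame`, here is a minimal sketch of the same validation on an in-memory header. It assumes a 12-byte IVF frame header whose first four bytes hold the little-endian frame size; `FRAME_HDR_SZ`, `MAX_FRAME_SZ`, `load_le32`, and `checked_frame_size` are hypothetical names, not the library's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_HDR_SZ 12                     /* assumed: 4-byte size + 8-byte pts */
#define MAX_FRAME_SZ (256u * 1024u * 1024u) /* same ceiling as the hunk above */

/* Reads a 32-bit little-endian value from four bytes. */
static uint32_t load_le32(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) |
         ((uint32_t)p[3] << 24);
}

/* Returns the frame size from the header, or 0 when it exceeds the ceiling. */
static size_t checked_frame_size(const uint8_t hdr[FRAME_HDR_SZ]) {
  uint32_t sz;
  assert(hdr != NULL);
  sz = load_le32(hdr);
  return sz > MAX_FRAME_SZ ? 0 : (size_t)sz;
}
```

Rejecting oversized values before allocating keeps a corrupt header from driving a huge buffer request.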
+3 -3
@@ -7,8 +7,8 @@
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef IVFDEC_H_
#define IVFDEC_H_
#ifndef VPX_IVFDEC_H_
#define VPX_IVFDEC_H_
#include "./tools_common.h"
@@ -25,4 +25,4 @@ int ivf_read_frame(FILE *infile, uint8_t **buffer, size_t *bytes_read,
} /* extern "C" */
#endif
#endif // IVFDEC_H_
#endif // VPX_IVFDEC_H_
+19 -11
@@ -13,27 +13,35 @@
#include "vpx/vpx_encoder.h"
#include "vpx_ports/mem_ops.h"
void ivf_write_file_header(FILE *outfile, const struct vpx_codec_enc_cfg *cfg,
unsigned int fourcc, int frame_cnt) {
void ivf_write_file_header_with_video_info(FILE *outfile, unsigned int fourcc,
int frame_cnt, int frame_width,
int frame_height,
vpx_rational_t timebase) {
char header[32];
header[0] = 'D';
header[1] = 'K';
header[2] = 'I';
header[3] = 'F';
mem_put_le16(header + 4, 0); // version
mem_put_le16(header + 6, 32); // header size
mem_put_le32(header + 8, fourcc); // fourcc
mem_put_le16(header + 12, cfg->g_w); // width
mem_put_le16(header + 14, cfg->g_h); // height
mem_put_le32(header + 16, cfg->g_timebase.den); // rate
mem_put_le32(header + 20, cfg->g_timebase.num); // scale
mem_put_le32(header + 24, frame_cnt); // length
mem_put_le32(header + 28, 0); // unused
mem_put_le16(header + 4, 0); // version
mem_put_le16(header + 6, 32); // header size
mem_put_le32(header + 8, fourcc); // fourcc
mem_put_le16(header + 12, frame_width); // width
mem_put_le16(header + 14, frame_height); // height
mem_put_le32(header + 16, timebase.den); // rate
mem_put_le32(header + 20, timebase.num); // scale
mem_put_le32(header + 24, frame_cnt); // length
mem_put_le32(header + 28, 0); // unused
fwrite(header, 1, 32, outfile);
}
void ivf_write_file_header(FILE *outfile, const struct vpx_codec_enc_cfg *cfg,
unsigned int fourcc, int frame_cnt) {
ivf_write_file_header_with_video_info(outfile, fourcc, frame_cnt, cfg->g_w,
cfg->g_h, cfg->g_timebase);
}
void ivf_write_frame_header(FILE *outfile, int64_t pts, size_t frame_size) {
char header[12];
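The 32-byte layout written by `ivf_write_file_header_with_video_info` can be sketched against a plain byte buffer, which makes the field offsets easy to inspect. This is an illustration under assumptions: `store_le16`, `store_le32`, and `build_ivf_header` are hypothetical helpers standing in for `mem_put_le16`/`mem_put_le32` and the function in the diff, and the timebase is passed as two plain integers instead of a `vpx_rational_t`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stores the low 16 bits of v in little-endian order. */
static void store_le16(uint8_t *p, unsigned int v) {
  p[0] = (uint8_t)(v & 0xff);
  p[1] = (uint8_t)((v >> 8) & 0xff);
}

/* Stores a 32-bit value in little-endian order. */
static void store_le32(uint8_t *p, uint32_t v) {
  store_le16(p, v & 0xffff);
  store_le16(p + 2, v >> 16);
}

/* Fills out[0..31] with an IVF file header using the offsets from the diff. */
static void build_ivf_header(uint8_t out[32], uint32_t fourcc, int frame_cnt,
                             int width, int height, int tb_num, int tb_den) {
  assert(out != NULL);
  memcpy(out, "DKIF", 4);                    /* signature */
  store_le16(out + 4, 0);                    /* version */
  store_le16(out + 6, 32);                   /* header size */
  store_le32(out + 8, fourcc);               /* fourcc */
  store_le16(out + 12, (unsigned int)width); /* width */
  store_le16(out + 14, (unsigned int)height); /* height */
  store_le32(out + 16, (uint32_t)tb_den);    /* rate */
  store_le32(out + 20, (uint32_t)tb_num);    /* scale */
  store_le32(out + 24, (uint32_t)frame_cnt); /* length */
  store_le32(out + 28, 0);                   /* unused */
}
```

Taking width, height, and timebase as separate arguments mirrors the refactor in the diff: callers without a `vpx_codec_enc_cfg` can still produce a header.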
+10 -3
@@ -7,11 +7,13 @@
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef IVFENC_H_
#define IVFENC_H_
#ifndef VPX_IVFENC_H_
#define VPX_IVFENC_H_
#include "./tools_common.h"
#include "vpx/vpx_encoder.h"
struct vpx_codec_enc_cfg;
struct vpx_codec_cx_pkt;
@@ -19,6 +21,11 @@ struct vpx_codec_cx_pkt;
extern "C" {
#endif
void ivf_write_file_header_with_video_info(FILE *outfile, unsigned int fourcc,
int frame_cnt, int frame_width,
int frame_height,
vpx_rational_t timebase);
void ivf_write_file_header(FILE *outfile, const struct vpx_codec_enc_cfg *cfg,
uint32_t fourcc, int frame_cnt);
@@ -30,4 +37,4 @@ void ivf_write_frame_size(FILE *outfile, size_t frame_size);
} /* extern "C" */
#endif
#endif // IVFENC_H_
#endif // VPX_IVFENC_H_
+8 -52
@@ -654,12 +654,6 @@ VERBATIM_HEADERS = YES
ALPHABETICAL_INDEX = NO
# If the alphabetical index is enabled (see ALPHABETICAL_INDEX) then
# the COLS_IN_ALPHA_INDEX tag can be used to specify the number of columns
# in which this list will be split (can be a number in the range [1..20])
COLS_IN_ALPHA_INDEX = 5
# In case all classes in a project start with a common prefix, all
# classes will be put under the same header in the alphabetical index.
# The IGNORE_PREFIX tag can be used to specify one or more prefixes that
@@ -943,18 +937,6 @@ GENERATE_XML = NO
XML_OUTPUT = xml
# The XML_SCHEMA tag can be used to specify an XML schema,
# which can be used by a validating XML parser to check the
# syntax of the XML files.
XML_SCHEMA =
# The XML_DTD tag can be used to specify an XML DTD,
# which can be used by a validating XML parser to check the
# syntax of the XML files.
XML_DTD =
# If the XML_PROGRAMLISTING tag is set to YES Doxygen will
# dump the program listings (including syntax highlighting
# and cross-referencing information) to the XML output. Note that
@@ -1111,32 +1093,10 @@ ALLEXTERNALS = NO
EXTERNAL_GROUPS = YES
# The PERL_PATH should be the absolute path and name of the perl script
# interpreter (i.e. the result of `which perl').
PERL_PATH = /usr/bin/perl
#---------------------------------------------------------------------------
# Configuration options related to the dot tool
#---------------------------------------------------------------------------
# If the CLASS_DIAGRAMS tag is set to YES (the default) Doxygen will
# generate a inheritance diagram (in HTML, RTF and LaTeX) for classes with base
# or super classes. Setting the tag to NO turns the diagrams off. Note that
# this option is superseded by the HAVE_DOT option below. This is only a
# fallback. It is recommended to install and use dot, since it yields more
# powerful graphs.
CLASS_DIAGRAMS = YES
# You can define message sequence charts within doxygen comments using the \msc
# command. Doxygen will then run the mscgen tool (see http://www.mcternan.me.uk/mscgen/) to
# produce the chart and insert it in the documentation. The MSCGEN_PATH tag allows you to
# specify the directory where the mscgen tool resides. If left empty the tool is assumed to
# be found in the default search path.
MSCGEN_PATH =
# If set to YES, the inheritance and collaboration graphs will hide
# inheritance and usage relations if the target is undocumented
# or is not a class.
@@ -1150,10 +1110,14 @@ HIDE_UNDOC_RELATIONS = YES
HAVE_DOT = NO
# If the CLASS_GRAPH and HAVE_DOT tags are set to YES then doxygen
# will generate a graph for each documented class showing the direct and
# indirect inheritance relations. Setting this tag to YES will force the
# the CLASS_DIAGRAMS tag to NO.
# If the CLASS_GRAPH tag is set to YES (or GRAPH) then doxygen will generate a
# graph for each documented class showing the direct and indirect inheritance
# relations. In case HAVE_DOT is set as well dot will be used to draw the graph,
# otherwise the built-in generator will be used. If the CLASS_GRAPH tag is set
# to TEXT the direct and indirect inheritance relations will be shown as texts /
# links.
# Possible values are: NO, YES, TEXT and GRAPH.
# The default value is: YES.
CLASS_GRAPH = YES
@@ -1259,14 +1223,6 @@ DOT_GRAPH_MAX_NODES = 50
MAX_DOT_GRAPH_DEPTH = 0
# Set the DOT_TRANSPARENT tag to YES to generate images with a transparent
# background. This is disabled by default, which results in a white background.
# Warning: Depending on the platform used, enabling this option may lead to
# badly anti-aliased labels on the edges of a graph (i.e. they become hard to
# read).
DOT_TRANSPARENT = YES
# Set the DOT_MULTI_TARGETS tag to YES allow dot to generate multiple output
# files in one run (i.e. multiple -o and -T options on the command line). This
# makes dot run faster, but since only newer versions of dot (>1.8.10)
+239 -56
@@ -11,7 +11,7 @@
# ARM assembly files are written in RVCT-style. We use some make magic to
# filter those files to allow GCC compilation
ifeq ($(ARCH_ARM),yes)
ifeq ($(VPX_ARCH_ARM),yes)
ASM:=$(if $(filter yes,$(CONFIG_GCC)$(CONFIG_MSVS)),.asm.S,.asm)
else
ASM:=.asm
@@ -63,6 +63,7 @@ ifeq ($(CONFIG_VP8_ENCODER),yes)
CODEC_SRCS-yes += $(addprefix $(VP8_PREFIX),$(call enabled,VP8_CX_SRCS))
CODEC_EXPORTS-yes += $(addprefix $(VP8_PREFIX),$(VP8_CX_EXPORTS))
INSTALL-LIBS-yes += include/vpx/vp8.h include/vpx/vp8cx.h
INSTALL-LIBS-yes += include/vpx/vpx_ext_ratectrl.h
INSTALL_MAPS += include/vpx/% $(SRC_PATH_BARE)/$(VP8_PREFIX)/%
CODEC_DOC_SECTIONS += vp8 vp8_encoder
endif
@@ -87,13 +88,35 @@ ifeq ($(CONFIG_VP9_ENCODER),yes)
CODEC_SRCS-yes += $(addprefix $(VP9_PREFIX),$(call enabled,VP9_CX_SRCS))
CODEC_EXPORTS-yes += $(addprefix $(VP9_PREFIX),$(VP9_CX_EXPORTS))
CODEC_SRCS-yes += $(VP9_PREFIX)vp9cx.mk vpx/vp8.h vpx/vp8cx.h
CODEC_SRCS-yes += vpx/vpx_ext_ratectrl.h
INSTALL-LIBS-yes += include/vpx/vp8.h include/vpx/vp8cx.h
INSTALL-LIBS-$(CONFIG_SPATIAL_SVC) += include/vpx/svc_context.h
INSTALL-LIBS-yes += include/vpx/vpx_ext_ratectrl.h
INSTALL_MAPS += include/vpx/% $(SRC_PATH_BARE)/$(VP9_PREFIX)/%
CODEC_DOC_SRCS += vpx/vp8.h vpx/vp8cx.h
CODEC_DOC_SRCS += vpx/vp8.h vpx/vp8cx.h vpx/vpx_ext_ratectrl.h
CODEC_DOC_SECTIONS += vp9 vp9_encoder
endif
RC_RTC_SRCS := vpx/vp8.h vpx/vp8cx.h
RC_RTC_SRCS += vpx/vpx_ext_ratectrl.h
RC_RTC_SRCS += vpx/internal/vpx_ratectrl_rtc.h
ifeq ($(CONFIG_VP9_ENCODER),yes)
VP9_PREFIX=vp9/
RC_RTC_SRCS += $(addprefix $(VP9_PREFIX),$(call enabled,VP9_CX_SRCS))
RC_RTC_SRCS += $(VP9_PREFIX)vp9cx.mk
RC_RTC_SRCS += $(VP9_PREFIX)ratectrl_rtc.cc
RC_RTC_SRCS += $(VP9_PREFIX)ratectrl_rtc.h
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(VP9_PREFIX)ratectrl_rtc.cc
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(VP9_PREFIX)ratectrl_rtc.h
endif
ifeq ($(CONFIG_VP8_ENCODER),yes)
VP8_PREFIX=vp8/
RC_RTC_SRCS += $(addprefix $(VP8_PREFIX),$(call enabled,VP8_CX_SRCS))
RC_RTC_SRCS += $(VP8_PREFIX)vp8_ratectrl_rtc.cc
RC_RTC_SRCS += $(VP8_PREFIX)vp8_ratectrl_rtc.h
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(VP8_PREFIX)vp8_ratectrl_rtc.cc
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(VP8_PREFIX)vp8_ratectrl_rtc.h
endif
ifeq ($(CONFIG_VP9_DECODER),yes)
VP9_PREFIX=vp9/
include $(SRC_PATH_BARE)/$(VP9_PREFIX)vp9dx.mk
@@ -113,16 +136,10 @@ ifeq ($(CONFIG_DECODERS),yes)
CODEC_DOC_SECTIONS += decoder
endif
# Suppress -Wextra warnings in third party code.
$(BUILD_PFX)third_party/googletest/%.cc.o: CXXFLAGS += -Wno-missing-field-initializers
# Suppress -Wextra warnings in first party code pending investigation.
# https://bugs.chromium.org/p/webm/issues/detail?id=1069
$(BUILD_PFX)vp8/encoder/onyx_if.c.o: CFLAGS += -Wno-unknown-warning-option -Wno-clobbered
$(BUILD_PFX)vp8/decoder/onyxd_if.c.o: CFLAGS += -Wno-unknown-warning-option -Wno-clobbered
ifeq ($(CONFIG_MSVS),yes)
CODEC_LIB=$(if $(CONFIG_STATIC_MSVCRT),vpxmt,vpxmd)
GTEST_LIB=$(if $(CONFIG_STATIC_MSVCRT),gtestmt,gtestmd)
RC_RTC_LIB=$(if $(CONFIG_STATIC_MSVCRT),vpxrcmt,vpxrcmd)
# This variable uses deferred expansion intentionally, since the results of
# $(wildcard) may change during the course of the Make.
VS_PLATFORMS = $(foreach d,$(wildcard */Release/$(CODEC_LIB).lib),$(word 1,$(subst /, ,$(d))))
@@ -147,14 +164,12 @@ CODEC_SRCS-yes += vpx_ports/mem_ops_aligned.h
CODEC_SRCS-yes += vpx_ports/vpx_once.h
CODEC_SRCS-yes += $(BUILD_PFX)vpx_config.c
INSTALL-SRCS-no += $(BUILD_PFX)vpx_config.c
ifeq ($(ARCH_X86)$(ARCH_X86_64),yes)
ifeq ($(VPX_ARCH_X86)$(VPX_ARCH_X86_64),yes)
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += third_party/x86inc/x86inc.asm
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += vpx_dsp/x86/bitdepth_conversion_sse2.asm
endif
CODEC_EXPORTS-yes += vpx/exports_com
CODEC_EXPORTS-$(CONFIG_ENCODERS) += vpx/exports_enc
ifeq ($(CONFIG_SPATIAL_SVC),yes)
CODEC_EXPORTS-$(CONFIG_ENCODERS) += vpx/exports_spatial_svc
endif
CODEC_EXPORTS-$(CONFIG_DECODERS) += vpx/exports_dec
INSTALL-LIBS-yes += include/vpx/vpx_codec.h
@@ -163,6 +178,7 @@ INSTALL-LIBS-yes += include/vpx/vpx_image.h
INSTALL-LIBS-yes += include/vpx/vpx_integer.h
INSTALL-LIBS-$(CONFIG_DECODERS) += include/vpx/vpx_decoder.h
INSTALL-LIBS-$(CONFIG_ENCODERS) += include/vpx/vpx_encoder.h
INSTALL-LIBS-$(CONFIG_ENCODERS) += include/vpx/vpx_tpl.h
ifeq ($(CONFIG_EXTERNAL_BUILD),yes)
ifeq ($(CONFIG_MSVS),yes)
INSTALL-LIBS-yes += $(foreach p,$(VS_PLATFORMS),$(LIBSUBDIR)/$(p)/$(CODEC_LIB).lib)
@@ -175,7 +191,18 @@ INSTALL-LIBS-$(CONFIG_STATIC) += $(LIBSUBDIR)/libvpx.a
INSTALL-LIBS-$(CONFIG_DEBUG_LIBS) += $(LIBSUBDIR)/libvpx_g.a
endif
ifeq ($(CONFIG_VP9_ENCODER)$(CONFIG_RATE_CTRL),yesyes)
SIMPLE_ENCODE_SRCS := $(call enabled,CODEC_SRCS)
SIMPLE_ENCODE_SRCS += $(VP9_PREFIX)simple_encode.cc
SIMPLE_ENCODE_SRCS += $(VP9_PREFIX)simple_encode.h
SIMPLE_ENCODE_SRCS += ivfenc.h
SIMPLE_ENCODE_SRCS += ivfenc.c
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(VP9_PREFIX)simple_encode.cc
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(VP9_PREFIX)simple_encode.h
endif
CODEC_SRCS=$(call enabled,CODEC_SRCS)
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(CODEC_SRCS)
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(call enabled,CODEC_EXPORTS)
@@ -187,6 +214,13 @@ libvpx_srcs.txt:
@echo $(CODEC_SRCS) | xargs -n1 echo | LC_ALL=C sort -u > $@
CLEAN-OBJS += libvpx_srcs.txt
# Assembly files that are included, but don't define symbols themselves.
# Filtered out to avoid Windows build warnings.
ASM_INCLUDES := \
third_party/x86inc/x86inc.asm \
vpx_config.asm \
vpx_ports/x86_abi_support.asm \
vpx_dsp/x86/bitdepth_conversion_sse2.asm \
ifeq ($(CONFIG_EXTERNAL_BUILD),yes)
ifeq ($(CONFIG_MSVS),yes)
@@ -198,12 +232,7 @@ vpx.def: $(call enabled,CODEC_EXPORTS)
--out=$@ $^
CLEAN-OBJS += vpx.def
# Assembly files that are included, but don't define symbols themselves.
# Filtered out to avoid Visual Studio build warnings.
ASM_INCLUDES := \
third_party/x86inc/x86inc.asm \
vpx_config.asm \
vpx_ports/x86_abi_support.asm \
vpx.$(VCPROJ_SFX): VCPROJ_SRCS=$(filter-out $(addprefix %, $(ASM_INCLUDES)), $^)
vpx.$(VCPROJ_SFX): $(CODEC_SRCS) vpx.def
@echo " [CREATE] $@"
@@ -217,7 +246,16 @@ vpx.$(VCPROJ_SFX): $(CODEC_SRCS) vpx.def
--ver=$(CONFIG_VS_VERSION) \
--src-path-bare="$(SRC_PATH_BARE)" \
--out=$@ $(CFLAGS) \
$(filter-out $(addprefix %, $(ASM_INCLUDES)), $^) \
--as=$(AS) \
$(filter $(SRC_PATH_BARE)/vp8/%.c, $(VCPROJ_SRCS)) \
$(filter $(SRC_PATH_BARE)/vp8/%.h, $(VCPROJ_SRCS)) \
$(filter $(SRC_PATH_BARE)/vp9/%.c, $(VCPROJ_SRCS)) \
$(filter $(SRC_PATH_BARE)/vp9/%.h, $(VCPROJ_SRCS)) \
$(filter $(SRC_PATH_BARE)/vpx/%, $(VCPROJ_SRCS)) \
$(filter $(SRC_PATH_BARE)/vpx_dsp/%, $(VCPROJ_SRCS)) \
$(filter-out $(addprefix $(SRC_PATH_BARE)/, \
vp8/%.c vp8/%.h vp9/%.c vp9/%.h vpx/% vpx_dsp/%), \
$(VCPROJ_SRCS)) \
--src-path-bare="$(SRC_PATH_BARE)" \
PROJECTS-yes += vpx.$(VCPROJ_SFX)
@@ -225,15 +263,58 @@ PROJECTS-yes += vpx.$(VCPROJ_SFX)
vpx.$(VCPROJ_SFX): vpx_config.asm
vpx.$(VCPROJ_SFX): $(RTCD)
endif
else
LIBVPX_OBJS=$(call objs,$(CODEC_SRCS))
vpxrc.$(VCPROJ_SFX): \
VCPROJ_SRCS=$(filter-out $(addprefix %, $(ASM_INCLUDES)), $^)
vpxrc.$(VCPROJ_SFX): $(RC_RTC_SRCS)
@echo " [CREATE] $@"
$(qexec)$(GEN_VCPROJ) \
$(if $(CONFIG_SHARED),--dll,--lib) \
--target=$(TOOLCHAIN) \
$(if $(CONFIG_STATIC_MSVCRT),--static-crt) \
--name=vpxrc \
--proj-guid=C26FF952-9494-4838-9A3F-7F3D4F613385 \
--ver=$(CONFIG_VS_VERSION) \
--src-path-bare="$(SRC_PATH_BARE)" \
--out=$@ $(CFLAGS) \
--as=$(AS) \
$(filter $(SRC_PATH_BARE)/vp9/%.c, $(VCPROJ_SRCS)) \
$(filter $(SRC_PATH_BARE)/vp9/%.cc, $(VCPROJ_SRCS)) \
$(filter $(SRC_PATH_BARE)/vp9/%.h, $(VCPROJ_SRCS)) \
$(filter $(SRC_PATH_BARE)/vpx/%, $(VCPROJ_SRCS)) \
$(filter $(SRC_PATH_BARE)/vpx_dsp/%, $(VCPROJ_SRCS)) \
$(filter-out $(addprefix $(SRC_PATH_BARE)/, \
vp8/%.c vp8/%.h vp9/%.c vp9/%.cc vp9/%.h vpx/% \
vpx_dsp/%), \
$(VCPROJ_SRCS)) \
--src-path-bare="$(SRC_PATH_BARE)" \
PROJECTS-yes += vpxrc.$(VCPROJ_SFX)
vpxrc.$(VCPROJ_SFX): vpx_config.asm
vpxrc.$(VCPROJ_SFX): $(RTCD)
endif # ifeq ($(CONFIG_MSVS),yes)
else # ifeq ($(CONFIG_EXTERNAL_BUILD),yes)
LIBVPX_OBJS=$(call objs, $(filter-out $(ASM_INCLUDES), $(CODEC_SRCS)))
OBJS-yes += $(LIBVPX_OBJS)
LIBS-$(if yes,$(CONFIG_STATIC)) += $(BUILD_PFX)libvpx.a $(BUILD_PFX)libvpx_g.a
$(BUILD_PFX)libvpx_g.a: $(LIBVPX_OBJS)
SO_VERSION_MAJOR := 4
SO_VERSION_MINOR := 1
# Updating version info.
# https://www.gnu.org/software/libtool/manual/libtool.html#Updating-version-info
# For libtool: c=<current>, a=<age>, r=<revision>
# libtool generates .so file as .so.[c-a].a.r, while -version-info c:r:a is
# passed to libtool.
#
# libvpx library file is generated as libvpx.so.<MAJOR>.<MINOR>.<PATCH>
# MAJOR = c-a, MINOR = a, PATCH = r
#
# To determine SO_VERSION_{MAJOR,MINOR,PATCH}, calculate c,a,r with current
# SO_VERSION_* then follow the rules in the link to determine the new version
# (c1, a1, r1) and set MAJOR to [c1-a1], MINOR to a1 and PATCH to r1
SO_VERSION_MAJOR := 11
SO_VERSION_MINOR := 0
SO_VERSION_PATCH := 0
ifeq ($(filter darwin%,$(TGT_OS)),$(TGT_OS))
LIBVPX_SO := libvpx.$(SO_VERSION_MAJOR).dylib
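The mapping described in the version comment above (MAJOR = c-a, MINOR = a, PATCH = r) is small enough to check by hand. A minimal sketch, with `so_version` and `so_version_from_libtool` as hypothetical names introduced only for illustration:

```c
#include <assert.h>

/* Shared-object version triple as it appears in libvpx.so.<MAJOR>.<MINOR>.<PATCH>. */
typedef struct {
  int so_major;
  int so_minor;
  int so_patch;
} so_version;

/* Converts libtool's current/age/revision into the .so version triple. */
static so_version so_version_from_libtool(int current, int age, int revision) {
  so_version v;
  assert(age >= 0 && revision >= 0 && current >= age);
  v.so_major = current - age; /* MAJOR = c - a */
  v.so_minor = age;           /* MINOR = a */
  v.so_patch = revision;      /* PATCH = r */
  return v;
}
```

For c=11, a=0, r=0 this gives 11.0.0, matching the SO_VERSION_* values set in the hunk; c=10, a=1, r=0 would give 9.1.0.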
@@ -273,18 +354,6 @@ $(BUILD_PFX)$(LIBVPX_SO): extralibs += -lm
$(BUILD_PFX)$(LIBVPX_SO): SONAME = libvpx.so.$(SO_VERSION_MAJOR)
$(BUILD_PFX)$(LIBVPX_SO): EXPORTS_FILE = $(EXPORT_FILE)
libvpx.ver: $(call enabled,CODEC_EXPORTS)
@echo " [CREATE] $@"
$(qexec)echo "{ global:" > $@
$(qexec)for f in $?; do awk '{print $$2";"}' < $$f >>$@; done
$(qexec)echo "local: *; };" >> $@
CLEAN-OBJS += libvpx.ver
libvpx.syms: $(call enabled,CODEC_EXPORTS)
@echo " [CREATE] $@"
$(qexec)awk '{print "_"$$2}' $^ >$@
CLEAN-OBJS += libvpx.syms
libvpx.def: $(call enabled,CODEC_EXPORTS)
@echo " [CREATE] $@"
$(qexec)echo LIBRARY $(LIBVPX_SO:.dll=) INITINSTANCE TERMINSTANCE > $@
@@ -342,22 +411,49 @@ endif
INSTALL-LIBS-yes += $(LIBSUBDIR)/pkgconfig/vpx.pc
INSTALL_MAPS += $(LIBSUBDIR)/pkgconfig/%.pc %.pc
CLEAN-OBJS += vpx.pc
ifeq ($(CONFIG_ENCODERS),yes)
RC_RTC_OBJS=$(call objs,$(RC_RTC_SRCS))
OBJS-yes += $(RC_RTC_OBJS)
LIBS-yes += $(BUILD_PFX)libvpxrc.a $(BUILD_PFX)libvpxrc_g.a
$(BUILD_PFX)libvpxrc_g.a: $(RC_RTC_OBJS)
endif
ifeq ($(CONFIG_VP9_ENCODER)$(CONFIG_RATE_CTRL),yesyes)
SIMPLE_ENCODE_OBJS=$(call objs,$(SIMPLE_ENCODE_SRCS))
OBJS-yes += $(SIMPLE_ENCODE_OBJS)
LIBS-yes += $(BUILD_PFX)libsimple_encode.a $(BUILD_PFX)libsimple_encode_g.a
$(BUILD_PFX)libsimple_encode_g.a: $(SIMPLE_ENCODE_OBJS)
endif
endif # ifeq ($(CONFIG_EXTERNAL_BUILD),yes)
libvpx.ver: $(call enabled,CODEC_EXPORTS)
@echo " [CREATE] $@"
$(qexec)echo "{ global:" > $@
$(qexec)for f in $?; do awk '{print $$2";"}' < $$f >>$@; done
$(qexec)echo "local: *; };" >> $@
CLEAN-OBJS += libvpx.ver
libvpx.syms: $(call enabled,CODEC_EXPORTS)
@echo " [CREATE] $@"
$(qexec)awk '{print "_"$$2}' $^ >$@
CLEAN-OBJS += libvpx.syms
#
# Rule to make assembler configuration file from C configuration file
#
ifeq ($(ARCH_X86)$(ARCH_X86_64),yes)
ifeq ($(VPX_ARCH_X86)$(VPX_ARCH_X86_64),yes)
# YASM
$(BUILD_PFX)vpx_config.asm: $(BUILD_PFX)vpx_config.h
@echo " [CREATE] $@"
@egrep "#define [A-Z0-9_]+ [01]" $< \
@LC_ALL=C grep -E "#define [A-Z0-9_]+ [01]" $< \
| awk '{print $$2 " equ " $$3}' > $@
else
ADS2GAS=$(if $(filter yes,$(CONFIG_GCC)),| $(ASM_CONVERSION))
$(BUILD_PFX)vpx_config.asm: $(BUILD_PFX)vpx_config.h
@echo " [CREATE] $@"
@egrep "#define [A-Z0-9_]+ [01]" $< \
@LC_ALL=C grep -E "#define [A-Z0-9_]+ [01]" $< \
| awk '{print $$2 " EQU " $$3}' $(ADS2GAS) > $@
@echo " END" $(ADS2GAS) >> $@
CLEAN-OBJS += $(BUILD_PFX)vpx_config.asm
@@ -387,27 +483,61 @@ ifeq ($(CONFIG_UNIT_TESTS),yes)
LIBVPX_TEST_DATA_PATH ?= .
include $(SRC_PATH_BARE)/test/test.mk
LIBVPX_TEST_SRCS=$(addprefix test/,$(call enabled,LIBVPX_TEST_SRCS))
# addprefix_clean behaves like addprefix if the target doesn't start with "../"
# However, if the target starts with "../", instead of adding prefix,
# it will remove "../".
# Using addprefix_clean, we can avoid two different targets building the
# same file, i.e.
# test/../ivfenc.c.d: ivfenc.o
# ivfenc.c.d: ivfenc.o
# Note that the other way to solve this problem is using "realpath".
# The "realpath" is supported by make 3.81 or later.
addprefix_clean=$(patsubst $(1)../%,%,$(addprefix $(1), $(2)))
LIBVPX_TEST_SRCS=$(call addprefix_clean,test/,$(call enabled,LIBVPX_TEST_SRCS))
LIBVPX_TEST_BIN=./test_libvpx$(EXE_SFX)
LIBVPX_TEST_DATA=$(addprefix $(LIBVPX_TEST_DATA_PATH)/,\
$(call enabled,LIBVPX_TEST_DATA))
libvpx_test_data_url=https://storage.googleapis.com/downloads.webmproject.org/test_data/libvpx/$(1)
TEST_INTRA_PRED_SPEED_BIN=./test_intra_pred_speed$(EXE_SFX)
TEST_INTRA_PRED_SPEED_SRCS=$(addprefix test/,$(call enabled,TEST_INTRA_PRED_SPEED_SRCS))
TEST_INTRA_PRED_SPEED_SRCS=$(call addprefix_clean,test/,\
$(call enabled,TEST_INTRA_PRED_SPEED_SRCS))
TEST_INTRA_PRED_SPEED_OBJS := $(sort $(call objs,$(TEST_INTRA_PRED_SPEED_SRCS)))
ifeq ($(CONFIG_ENCODERS),yes)
RC_INTERFACE_TEST_BIN=./test_rc_interface$(EXE_SFX)
RC_INTERFACE_TEST_SRCS=$(call addprefix_clean,test/,\
$(call enabled,RC_INTERFACE_TEST_SRCS))
RC_INTERFACE_TEST_OBJS := $(sort $(call objs,$(RC_INTERFACE_TEST_SRCS)))
endif
SIMPLE_ENCODE_TEST_BIN=./test_simple_encode$(EXE_SFX)
SIMPLE_ENCODE_TEST_SRCS=$(call addprefix_clean,test/,\
$(call enabled,SIMPLE_ENCODE_TEST_SRCS))
SIMPLE_ENCODE_TEST_OBJS := $(sort $(call objs,$(SIMPLE_ENCODE_TEST_SRCS)))
libvpx_test_srcs.txt:
@echo " [CREATE] $@"
@echo $(LIBVPX_TEST_SRCS) | xargs -n1 echo | LC_ALL=C sort -u > $@
CLEAN-OBJS += libvpx_test_srcs.txt
# Attempt to download the file using curl, retrying once if it fails for a
# partial file (18).
$(LIBVPX_TEST_DATA): $(SRC_PATH_BARE)/test/test-data.sha1
@echo " [DOWNLOAD] $@"
$(qexec)trap 'rm -f $@' INT TERM &&\
curl --retry 1 -L -o $@ $(call libvpx_test_data_url,$(@F))
$(qexec)( \
trap 'rm -f $@' INT TERM; \
curl="curl -S -s --retry 1 -L -o $@ $(call libvpx_test_data_url,$(@F))"; \
$$curl; ret=$$?; \
case "$$ret" in \
18) $$curl -C - ;; \
*) exit $$ret ;; \
esac \
)
testdata:: $(LIBVPX_TEST_DATA)
testdata: $(LIBVPX_TEST_DATA)
$(qexec)[ -x "$$(which sha1sum)" ] && sha1sum=sha1sum;\
[ -x "$$(which shasum)" ] && sha1sum=shasum;\
[ -x "$$(which sha1)" ] && sha1sum=sha1;\
@@ -416,7 +546,7 @@ testdata:: $(LIBVPX_TEST_DATA)
echo "Checking test data:";\
for f in $(call enabled,LIBVPX_TEST_DATA); do\
grep $$f $(SRC_PATH_BARE)/test/test-data.sha1 |\
(cd $(LIBVPX_TEST_DATA_PATH); $${sha1sum} -c);\
(cd "$(LIBVPX_TEST_DATA_PATH)"; $${sha1sum} -c);\
done; \
else\
echo "Skipping test data integrity check, sha1sum not found.";\
@@ -435,6 +565,7 @@ gtest.$(VCPROJ_SFX): $(SRC_PATH_BARE)/third_party/googletest/src/src/gtest-all.c
--proj-guid=EC00E1EC-AF68-4D92-A255-181690D1C9B1 \
--ver=$(CONFIG_VS_VERSION) \
--src-path-bare="$(SRC_PATH_BARE)" \
--as=$(AS) \
-D_VARIADIC_MAX=10 \
--out=gtest.$(VCPROJ_SFX) $(SRC_PATH_BARE)/third_party/googletest/src/src/gtest-all.cc \
-I. -I"$(SRC_PATH_BARE)/third_party/googletest/src/include" -I"$(SRC_PATH_BARE)/third_party/googletest/src"
@@ -451,6 +582,7 @@ test_libvpx.$(VCPROJ_SFX): $(LIBVPX_TEST_SRCS) vpx.$(VCPROJ_SFX) gtest.$(VCPROJ_
--proj-guid=CD837F5F-52D8-4314-A370-895D614166A7 \
--ver=$(CONFIG_VS_VERSION) \
--src-path-bare="$(SRC_PATH_BARE)" \
--as=$(AS) \
$(if $(CONFIG_STATIC_MSVCRT),--static-crt) \
--out=$@ $(INTERNAL_CFLAGS) $(CFLAGS) \
-I. -I"$(SRC_PATH_BARE)/third_party/googletest/src/include" \
@@ -473,12 +605,35 @@ test_intra_pred_speed.$(VCPROJ_SFX): $(TEST_INTRA_PRED_SPEED_SRCS) vpx.$(VCPROJ_
--proj-guid=CD837F5F-52D8-4314-A370-895D614166A7 \
--ver=$(CONFIG_VS_VERSION) \
--src-path-bare="$(SRC_PATH_BARE)" \
--as=$(AS) \
$(if $(CONFIG_STATIC_MSVCRT),--static-crt) \
--out=$@ $(INTERNAL_CFLAGS) $(CFLAGS) \
-I. -I"$(SRC_PATH_BARE)/third_party/googletest/src/include" \
-L. -l$(CODEC_LIB) -l$(GTEST_LIB) $^
endif # TEST_INTRA_PRED_SPEED
endif
ifeq ($(CONFIG_ENCODERS),yes)
ifneq ($(strip $(RC_INTERFACE_TEST_OBJS)),)
PROJECTS-$(CONFIG_MSVS) += test_rc_interface.$(VCPROJ_SFX)
test_rc_interface.$(VCPROJ_SFX): $(RC_INTERFACE_TEST_SRCS) vpx.$(VCPROJ_SFX) \
vpxrc.$(VCPROJ_SFX) gtest.$(VCPROJ_SFX)
@echo " [CREATE] $@"
$(qexec)$(GEN_VCPROJ) \
--exe \
--target=$(TOOLCHAIN) \
--name=test_rc_interface \
-D_VARIADIC_MAX=10 \
--proj-guid=30458F88-1BC6-4689-B41C-50F3737AAB27 \
--ver=$(CONFIG_VS_VERSION) \
--as=$(AS) \
--src-path-bare="$(SRC_PATH_BARE)" \
$(if $(CONFIG_STATIC_MSVCRT),--static-crt) \
--out=$@ $(INTERNAL_CFLAGS) $(CFLAGS) \
-I. -I"$(SRC_PATH_BARE)/third_party/googletest/src/include" \
-L. -l$(CODEC_LIB) -l$(RC_RTC_LIB) -l$(GTEST_LIB) $^
endif # RC_INTERFACE_TEST
endif # CONFIG_ENCODERS
endif # CONFIG_MSVS
else
include $(SRC_PATH_BARE)/third_party/googletest/gtest.mk
@@ -519,31 +674,58 @@ $(eval $(call linkerxx_template,$(TEST_INTRA_PRED_SPEED_BIN), \
-L. -lvpx -lgtest $(extralibs) -lm))
endif # TEST_INTRA_PRED_SPEED
endif # CONFIG_UNIT_TESTS
ifeq ($(CONFIG_ENCODERS),yes)
ifneq ($(strip $(RC_INTERFACE_TEST_OBJS)),)
$(RC_INTERFACE_TEST_OBJS) $(RC_INTERFACE_TEST_OBJS:.o=.d): \
CXXFLAGS += $(GTEST_INCLUDES)
OBJS-yes += $(RC_INTERFACE_TEST_OBJS)
BINS-yes += $(RC_INTERFACE_TEST_BIN)
$(RC_INTERFACE_TEST_BIN): $(TEST_LIBS) libvpxrc.a
$(eval $(call linkerxx_template,$(RC_INTERFACE_TEST_BIN), \
$(RC_INTERFACE_TEST_OBJS) \
-L. -lvpx -lgtest -lvpxrc $(extralibs) -lm))
endif # RC_INTERFACE_TEST
endif # CONFIG_ENCODERS
ifneq ($(strip $(SIMPLE_ENCODE_TEST_OBJS)),)
$(SIMPLE_ENCODE_TEST_OBJS) $(SIMPLE_ENCODE_TEST_OBJS:.o=.d): \
CXXFLAGS += $(GTEST_INCLUDES)
OBJS-yes += $(SIMPLE_ENCODE_TEST_OBJS)
BINS-yes += $(SIMPLE_ENCODE_TEST_BIN)
$(SIMPLE_ENCODE_TEST_BIN): $(TEST_LIBS) libsimple_encode.a
$(eval $(call linkerxx_template,$(SIMPLE_ENCODE_TEST_BIN), \
$(SIMPLE_ENCODE_TEST_OBJS) \
-L. -lsimple_encode -lvpx -lgtest $(extralibs) -lm))
endif # SIMPLE_ENCODE_TEST
endif # CONFIG_EXTERNAL_BUILD
# Install test sources only if codec source is included
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(patsubst $(SRC_PATH_BARE)/%,%,\
$(shell find $(SRC_PATH_BARE)/third_party/googletest -type f))
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(LIBVPX_TEST_SRCS)
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(TEST_INTRA_PRED_SPEED_SRCS)
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(RC_INTERFACE_TEST_SRCS)
define test_shard_template
test:: test_shard.$(1)
test-no-data-check:: test_shard_ndc.$(1)
test: test_shard.$(1)
test-no-data-check: test_shard_ndc.$(1)
test_shard.$(1) test_shard_ndc.$(1): $(LIBVPX_TEST_BIN)
@set -e; \
export GTEST_SHARD_INDEX=$(1); \
export GTEST_TOTAL_SHARDS=$(2); \
$(LIBVPX_TEST_BIN)
test_shard.$(1): testdata
.PHONY: test_shard.$(1)
.PHONY: test_shard.$(1) test_shard_ndc.$(1)
endef
NUM_SHARDS := 10
SHARDS := 0 1 2 3 4 5 6 7 8 9
$(foreach s,$(SHARDS),$(eval $(call test_shard_template,$(s),$(NUM_SHARDS))))
endif
endif # CONFIG_UNIT_TESTS
##
## documentation directives
@@ -566,6 +748,7 @@ endif
## Update the global src list
SRCS += $(CODEC_SRCS) $(LIBVPX_TEST_SRCS) $(GTEST_SRCS)
SRCS += $(RC_INTERFACE_TEST_SRCS)
##
## vpxdec/vpxenc tests.
@@ -582,10 +765,10 @@ TEST_BIN_PATH := $(addsuffix /$(TGT_OS:win64=x64)/Release, $(TEST_BIN_PATH))
endif
utiltest utiltest-no-data-check:
$(qexec)$(SRC_PATH_BARE)/test/vpxdec.sh \
--test-data-path $(LIBVPX_TEST_DATA_PATH) \
--test-data-path "$(LIBVPX_TEST_DATA_PATH)" \
--bin-path $(TEST_BIN_PATH)
$(qexec)$(SRC_PATH_BARE)/test/vpxenc.sh \
--test-data-path $(LIBVPX_TEST_DATA_PATH) \
--test-data-path "$(LIBVPX_TEST_DATA_PATH)" \
--bin-path $(TEST_BIN_PATH)
utiltest: testdata
else
@@ -609,7 +792,7 @@ EXAMPLES_BIN_PATH := $(TGT_OS:win64=x64)/Release
endif
exampletest exampletest-no-data-check: examples
$(qexec)$(SRC_PATH_BARE)/test/examples.sh \
--test-data-path $(LIBVPX_TEST_DATA_PATH) \
--test-data-path "$(LIBVPX_TEST_DATA_PATH)" \
--bin-path $(EXAMPLES_BIN_PATH)
exampletest: testdata
else
+2
@@ -25,8 +25,10 @@
release.
- The \ref readme contains instructions on recompiling the sample applications.
- Read the \ref usage "usage" for a narrative on codec usage.
\if samples
- Read the \ref samples "sample code" for examples of how to interact with the
codec.
\endif
- \ref codec reference
\if encoder
- \ref encoder reference
+4 -16
@@ -23,6 +23,7 @@
#include <string.h> /* for memcpy() */
#include "md5_utils.h"
#include "vpx_ports/compiler_attributes.h"
static void byteSwap(UWORD32 *buf, unsigned words) {
md5byte *p;
@@ -145,25 +146,14 @@ void MD5Final(md5byte digest[16], struct MD5Context *ctx) {
#define MD5STEP(f, w, x, y, z, in, s) \
(w += f(x, y, z) + in, w = (w << s | w >> (32 - s)) + x)
#if defined(__clang__) && defined(__has_attribute)
#if __has_attribute(no_sanitize)
#define VPX_NO_UNSIGNED_OVERFLOW_CHECK \
__attribute__((no_sanitize("unsigned-integer-overflow")))
#endif
#endif
#ifndef VPX_NO_UNSIGNED_OVERFLOW_CHECK
#define VPX_NO_UNSIGNED_OVERFLOW_CHECK
#endif
/*
* The core of the MD5 algorithm, this alters an existing MD5 hash to
* reflect the addition of 16 longwords of new data. MD5Update blocks
* the data and converts bytes into longwords for this routine.
*/
VPX_NO_UNSIGNED_OVERFLOW_CHECK void MD5Transform(UWORD32 buf[4],
UWORD32 const in[16]) {
register UWORD32 a, b, c, d;
VPX_NO_UNSIGNED_OVERFLOW_CHECK VPX_NO_UNSIGNED_SHIFT_CHECK void MD5Transform(
UWORD32 buf[4], UWORD32 const in[16]) {
UWORD32 a, b, c, d;
a = buf[0];
b = buf[1];
@@ -244,6 +234,4 @@ VPX_NO_UNSIGNED_OVERFLOW_CHECK void MD5Transform(UWORD32 buf[4],
buf[3] += d;
}
#undef VPX_NO_UNSIGNED_OVERFLOW_CHECK
#endif
+3 -3
View File
@@ -20,8 +20,8 @@
* Still in the public domain.
*/
#ifndef MD5_UTILS_H_
#define MD5_UTILS_H_
#ifndef VPX_MD5_UTILS_H_
#define VPX_MD5_UTILS_H_
#ifdef __cplusplus
extern "C" {
@@ -46,4 +46,4 @@ void MD5Transform(UWORD32 buf[4], UWORD32 const in[16]);
} // extern "C"
#endif
#endif // MD5_UTILS_H_
#endif // VPX_MD5_UTILS_H_
+47 -23
@@ -9,10 +9,11 @@
*/
#include <assert.h>
#include <stdlib.h>
#include <limits.h>
#include <stdio.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include "./rate_hist.h"
@@ -37,12 +38,19 @@ struct rate_hist {
struct rate_hist *init_rate_histogram(const vpx_codec_enc_cfg_t *cfg,
const vpx_rational_t *fps) {
int i;
struct rate_hist *hist = malloc(sizeof(*hist));
struct rate_hist *hist = calloc(1, sizeof(*hist));
if (hist == NULL || cfg == NULL || fps == NULL || fps->num == 0 ||
fps->den == 0) {
destroy_rate_histogram(hist);
return NULL;
}
// Determine the number of samples in the buffer. Use the file's framerate
// to determine the number of frames in rc_buf_sz milliseconds, with an
// adjustment (5/4) to account for alt-refs
hist->samples = cfg->rc_buf_sz * 5 / 4 * fps->num / fps->den / 1000;
hist->samples =
(int)((int64_t)cfg->rc_buf_sz * 5 / 4 * fps->num / fps->den / 1000);
// prevent division by zero
if (hist->samples == 0) hist->samples = 1;
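The widened sample-count computation in this hunk can be sketched in isolation. The helper name `buffer_samples`, the fallback of 1 for a non-positive frame rate, and the clamp to `INT_MAX` are assumptions made for illustration and are not part of rate_hist.c. The change itself casts to `int64_t`; this sketch uses `uint64_t` after rejecting non-positive rates so that the product cannot overflow for any input, which is a substitution rather than the committed code.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper (not in rate_hist.c): number of frames held in buf_ms
 * milliseconds at fps_num/fps_den, with the 5/4 alt-ref adjustment. */
static int buffer_samples(unsigned int buf_ms, int fps_num, int fps_den) {
  uint64_t samples;

  /* Assumption: fall back to one sample for an unusable frame rate.
   * The real function returns NULL when fps->num or fps->den is zero. */
  if (fps_num <= 0 || fps_den <= 0) return 1;

  /* Widen before multiplying so the intermediate product does not wrap
   * in 32-bit arithmetic. */
  samples = (uint64_t)buf_ms * 5 / 4 * (uint64_t)fps_num / (uint64_t)fps_den /
            1000;

  /* Assumption: clamp instead of truncating when the result exceeds int. */
  if (samples > (uint64_t)INT_MAX) return INT_MAX;

  /* Prevent division by zero in later modulo operations. */
  if (samples == 0) return 1;
  return (int)samples;
}
```

With a 6000 ms buffer at 30 fps this yields 225 samples; with a 4,000,000 ms buffer at 1,000,000 fps the 32-bit product would have wrapped, while the widened version saturates.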
@@ -80,7 +88,11 @@ void update_rate_histogram(struct rate_hist *hist,
(uint64_t)cfg->g_timebase.num /
(uint64_t)cfg->g_timebase.den;
int idx = hist->frames++ % hist->samples;
int idx;
if (hist == NULL || cfg == NULL || pkt == NULL) return;
idx = hist->frames++ % hist->samples;
hist->pts[idx] = now;
hist->sz[idx] = (int)pkt->data.frame.sz;
@@ -116,9 +128,14 @@ void update_rate_histogram(struct rate_hist *hist,
static int merge_hist_buckets(struct hist_bucket *bucket, int max_buckets,
int *num_buckets) {
int small_bucket = 0, merge_bucket = INT_MAX, big_bucket = 0;
int buckets = *num_buckets;
int buckets;
int i;
assert(bucket != NULL);
assert(num_buckets != NULL);
buckets = *num_buckets;
/* Find the extrema for this list of buckets */
big_bucket = small_bucket = 0;
for (i = 0; i < buckets; i++) {
@@ -178,38 +195,42 @@ static int merge_hist_buckets(struct hist_bucket *bucket, int max_buckets,
static void show_histogram(const struct hist_bucket *bucket, int buckets,
int total, int scale) {
const char *pat1, *pat2;
int width1, width2;
int i;
if (!buckets) return;
assert(bucket != NULL);
assert(buckets > 0);
switch ((int)(log(bucket[buckets - 1].high) / log(10)) + 1) {
case 1:
case 2:
pat1 = "%4d %2s: ";
pat2 = "%4d-%2d: ";
width1 = 4;
width2 = 2;
break;
case 3:
pat1 = "%5d %3s: ";
pat2 = "%5d-%3d: ";
width1 = 5;
width2 = 3;
break;
case 4:
pat1 = "%6d %4s: ";
pat2 = "%6d-%4d: ";
width1 = 6;
width2 = 4;
break;
case 5:
pat1 = "%7d %5s: ";
pat2 = "%7d-%5d: ";
width1 = 7;
width2 = 5;
break;
case 6:
pat1 = "%8d %6s: ";
pat2 = "%8d-%6d: ";
width1 = 8;
width2 = 6;
break;
case 7:
pat1 = "%9d %7s: ";
pat2 = "%9d-%7d: ";
width1 = 9;
width2 = 7;
break;
default:
pat1 = "%12d %10s: ";
pat2 = "%12d-%10d: ";
width1 = 12;
width2 = 10;
break;
}
@@ -224,9 +245,10 @@ static void show_histogram(const struct hist_bucket *bucket, int buckets,
assert(len <= HIST_BAR_MAX);
if (bucket[i].low == bucket[i].high)
fprintf(stderr, pat1, bucket[i].low, "");
fprintf(stderr, "%*d %*s: ", width1, bucket[i].low, width2, "");
else
fprintf(stderr, pat2, bucket[i].low, bucket[i].high);
fprintf(stderr, "%*d-%*d: ", width1, bucket[i].low, width2,
bucket[i].high);
for (j = 0; j < HIST_BAR_MAX; j++) fprintf(stderr, j < len ? "=" : " ");
fprintf(stderr, "\t%5d (%6.2f%%)\n", bucket[i].count, pct);
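The switch from per-width format strings (`pat1`, `pat2`) to `%*d` with run-time widths can be shown with a small standalone helper. The name `format_bucket_label` and the use of `snprintf` into a caller-supplied buffer are assumptions for illustration; the code in the diff prints directly to `stderr`. One likely motivation, not stated in the diff, is that a literal format string can be checked by the compiler while a format chosen at run time cannot.

```c
#include <stdio.h>

/* Hypothetical helper (not in rate_hist.c): writes one bucket label into out
 * using run-time field widths instead of a format string picked per width. */
static int format_bucket_label(char *out, size_t size, int low, int high,
                               int width1, int width2) {
  if (low == high) {
    /* "%*s" with an empty string pads the second column with spaces. */
    return snprintf(out, size, "%*d %*s: ", width1, low, width2, "");
  }
  /* Each '*' consumes one int argument as the minimum field width. */
  return snprintf(out, size, "%*d-%*d: ", width1, low, width2, high);
}
```

Called with widths 4 and 2, a single-value bucket and a range bucket produce labels of the same length, so the histogram bars that follow stay aligned.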
@@ -259,6 +281,8 @@ void show_rate_histogram(struct rate_hist *hist, const vpx_codec_enc_cfg_t *cfg,
int i, scale;
int buckets = 0;
if (hist == NULL || cfg == NULL) return;
for (i = 0; i < RATE_BINS; i++) {
if (hist->bucket[i].low == INT_MAX) continue;
hist->bucket[buckets++] = hist->bucket[i];
+3 -3
@@ -8,8 +8,8 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef RATE_HIST_H_
#define RATE_HIST_H_
#ifndef VPX_RATE_HIST_H_
#define VPX_RATE_HIST_H_
#include "vpx/vpx_encoder.h"
@@ -37,4 +37,4 @@ void show_rate_histogram(struct rate_hist *hist, const vpx_codec_enc_cfg_t *cfg,
} // extern "C"
#endif
#endif // RATE_HIST_H_
#endif // VPX_RATE_HIST_H_
+36 -13
@@ -8,10 +8,14 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef TEST_ACM_RANDOM_H_
#define TEST_ACM_RANDOM_H_
#ifndef VPX_TEST_ACM_RANDOM_H_
#define VPX_TEST_ACM_RANDOM_H_
#include "third_party/googletest/src/include/gtest/gtest.h"
#include <assert.h>
#include <limits>
#include "gtest/gtest.h"
#include "vpx/vpx_integer.h"
@@ -24,37 +28,56 @@ class ACMRandom {
explicit ACMRandom(int seed) : random_(seed) {}
void Reset(int seed) { random_.Reseed(seed); }
uint16_t Rand16(void) {
uint16_t Rand16() {
const uint32_t value =
random_.Generate(testing::internal::Random::kMaxRange);
return (value >> 15) & 0xffff;
}
int16_t Rand9Signed(void) {
// Use 9 bits: values between 255 (0x0FF) and -256 (0x100).
const uint32_t value = random_.Generate(512);
return static_cast<int16_t>(value) - 256;
int32_t Rand20Signed() {
// Use 20 bits: values between 524287 and -524288.
const uint32_t value = random_.Generate(1048576);
return static_cast<int32_t>(value) - 524288;
}
uint8_t Rand8(void) {
int16_t Rand16Signed() {
// Use 16 bits: values between 32767 and -32768.
return static_cast<int16_t>(random_.Generate(65536));
}
uint16_t Rand12() {
const uint32_t value =
random_.Generate(testing::internal::Random::kMaxRange);
// There's a bit more entropy in the upper bits of this implementation.
return (value >> 19) & 0xfff;
}
uint8_t Rand8() {
const uint32_t value =
random_.Generate(testing::internal::Random::kMaxRange);
// There's a bit more entropy in the upper bits of this implementation.
return (value >> 23) & 0xff;
}
uint8_t Rand8Extremes(void) {
uint8_t Rand8Extremes() {
// Returns a random value near 0 or near 255, to better exercise
// saturation behavior.
const uint8_t r = Rand8();
return r < 128 ? r << 4 : r >> 4;
return static_cast<uint8_t>((r < 128) ? r << 4 : r >> 4);
}
uint32_t RandRange(const uint32_t range) {
// testing::internal::Random::Generate provides values in the range
// testing::internal::Random::kMaxRange.
assert(range <= testing::internal::Random::kMaxRange);
return random_.Generate(range);
}
int PseudoUniform(int range) { return random_.Generate(range); }
int operator()(int n) { return PseudoUniform(n); }
static int DeterministicSeed(void) { return 0xbaba; }
static int DeterministicSeed() { return 0xbaba; }
private:
testing::internal::Random random_;
@@ -62,4 +85,4 @@ class ACMRandom {
} // namespace libvpx_test
#endif // TEST_ACM_RANDOM_H_
#endif // VPX_TEST_ACM_RANDOM_H_
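The value mappings used by `Rand8Extremes` and `Rand20Signed` in the header above can be checked without a random source. The following is a C rendering of just the arithmetic; the function names `map_to_extremes` and `center_20bit` are hypothetical, and the random draw is replaced by an explicit argument.

```c
#include <stdint.h>

/* Hypothetical stand-in for the arithmetic in Rand8Extremes. */
static uint8_t map_to_extremes(uint8_t r) {
  /* r is promoted to int, so r << 4 can exceed 255. The explicit cast makes
   * the truncation to 8 bits visible, as the added static_cast does. */
  return (uint8_t)((r < 128) ? r << 4 : r >> 4);
}

/* Hypothetical stand-in for the arithmetic in Rand20Signed: shifts a value
 * in [0, 1048576) to the signed range [-524288, 524287]. */
static int32_t center_20bit(uint32_t value) {
  return (int32_t)value - 524288;
}
```

For example, an input of 0x7f shifts to 0x7f0 before the cast and truncates to 0xf0, while 0xff maps to 15.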
+10 -10
@@ -8,7 +8,7 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#include <algorithm>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/util.h"
@@ -62,25 +62,25 @@ class ActiveMapRefreshTest
public ::libvpx_test::CodecTestWith2Params<libvpx_test::TestMode, int> {
protected:
ActiveMapRefreshTest() : EncoderTest(GET_PARAM(0)) {}
virtual ~ActiveMapRefreshTest() {}
~ActiveMapRefreshTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(GET_PARAM(1));
cpu_used_ = GET_PARAM(2);
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
::libvpx_test::Y4mVideoSource *y4m_video =
static_cast<libvpx_test::Y4mVideoSource *>(video);
if (video->frame() == 1) {
if (video->frame() == 0) {
encoder->Control(VP8E_SET_CPUUSED, cpu_used_);
encoder->Control(VP9E_SET_AQ_MODE, kAqModeCyclicRefresh);
} else if (video->frame() >= 2 && video->img()) {
vpx_image_t *current = video->img();
vpx_image_t *previous = y4m_holder_->img();
ASSERT_TRUE(previous != NULL);
ASSERT_NE(previous, nullptr);
vpx_active_map_t map = vpx_active_map_t();
const int width = static_cast<int>(current->d_w);
const int height = static_cast<int>(current->d_h);
@@ -122,7 +122,7 @@ TEST_P(ActiveMapRefreshTest, Test) {
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
}
VP9_INSTANTIATE_TEST_CASE(ActiveMapRefreshTest,
::testing::Values(::libvpx_test::kRealTime),
::testing::Range(5, 6));
VP9_INSTANTIATE_TEST_SUITE(ActiveMapRefreshTest,
::testing::Values(::libvpx_test::kRealTime),
::testing::Range(5, 6));
} // namespace
+13 -11
@@ -9,7 +9,7 @@
*/
#include <climits>
#include <vector>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/i420_video_source.h"
@@ -19,24 +19,26 @@ namespace {
class ActiveMapTest
: public ::libvpx_test::EncoderTest,
public ::libvpx_test::CodecTestWith2Params<libvpx_test::TestMode, int> {
public ::libvpx_test::CodecTestWith3Params<libvpx_test::TestMode, int,
int> {
protected:
static const int kWidth = 208;
static const int kHeight = 144;
ActiveMapTest() : EncoderTest(GET_PARAM(0)) {}
virtual ~ActiveMapTest() {}
~ActiveMapTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(GET_PARAM(1));
cpu_used_ = GET_PARAM(2);
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
if (video->frame() == 1) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
encoder->Control(VP8E_SET_CPUUSED, cpu_used_);
encoder->Control(VP9E_SET_AQ_MODE, GET_PARAM(3));
} else if (video->frame() == 3) {
vpx_active_map_t map = vpx_active_map_t();
/* clang-format off */
@@ -62,7 +64,7 @@ class ActiveMapTest
vpx_active_map_t map = vpx_active_map_t();
map.cols = (kWidth + 15) / 16;
map.rows = (kHeight + 15) / 16;
map.active_map = NULL;
map.active_map = nullptr;
encoder->Control(VP8E_SET_ACTIVEMAP, &map);
}
}
@@ -85,7 +87,7 @@ TEST_P(ActiveMapTest, Test) {
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
}
VP9_INSTANTIATE_TEST_CASE(ActiveMapTest,
::testing::Values(::libvpx_test::kRealTime),
::testing::Range(0, 9));
VP9_INSTANTIATE_TEST_SUITE(ActiveMapTest,
::testing::Values(::libvpx_test::kRealTime),
::testing::Range(5, 10), ::testing::Values(0, 3));
} // namespace
+33 -19
@@ -8,11 +8,15 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#include <math.h>
#include <tuple>
#include "gtest/gtest.h"
#include "test/clear_system_state.h"
#include "test/register_state_check.h"
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "test/util.h"
#include "./vpx_dsp_rtcd.h"
#include "vpx/vpx_integer.h"
#include "vpx_config.h"
#include "vpx_dsp/postproc.h"
#include "vpx_mem/vpx_mem.h"
@@ -20,15 +24,17 @@ namespace {
static const int kNoiseSize = 3072;
// TODO(jimbankoski): make width and height integers not unsigned.
typedef void (*AddNoiseFunc)(uint8_t *start, const int8_t *noise,
int blackclamp, int whiteclamp, int width,
int height, int pitch);
class AddNoiseTest : public ::testing::TestWithParam<AddNoiseFunc> {
typedef std::tuple<double, AddNoiseFunc> AddNoiseTestFPParam;
class AddNoiseTest : public ::testing::Test,
public ::testing::WithParamInterface<AddNoiseTestFPParam> {
public:
virtual void TearDown() { libvpx_test::ClearSystemState(); }
virtual ~AddNoiseTest() {}
void TearDown() override { libvpx_test::ClearSystemState(); }
~AddNoiseTest() override = default;
};
double stddev6(char a, char b, char c, char d, char e, char f) {
@@ -44,14 +50,14 @@ TEST_P(AddNoiseTest, CheckNoiseAdded) {
const int height = 64;
const int image_size = width * height;
int8_t noise[kNoiseSize];
const int clamp = vpx_setup_noise(4.4, noise, kNoiseSize);
const int clamp = vpx_setup_noise(GET_PARAM(0), noise, kNoiseSize);
uint8_t *const s =
reinterpret_cast<uint8_t *>(vpx_calloc(image_size, sizeof(*s)));
ASSERT_TRUE(s != NULL);
ASSERT_NE(s, nullptr);
memset(s, 99, image_size * sizeof(*s));
ASM_REGISTER_STATE_CHECK(
GetParam()(s, noise, clamp, clamp, width, height, width));
GET_PARAM(1)(s, noise, clamp, clamp, width, height, width));
// Check to make sure we don't end up having either the same or no added
// noise either vertically or horizontally.
@@ -70,7 +76,7 @@ TEST_P(AddNoiseTest, CheckNoiseAdded) {
memset(s, 255, image_size);
ASM_REGISTER_STATE_CHECK(
GetParam()(s, noise, clamp, clamp, width, height, width));
GET_PARAM(1)(s, noise, clamp, clamp, width, height, width));
// Check to make sure don't roll over.
for (int i = 0; i < image_size; ++i) {
@@ -81,7 +87,7 @@ TEST_P(AddNoiseTest, CheckNoiseAdded) {
memset(s, 0, image_size);
ASM_REGISTER_STATE_CHECK(
GetParam()(s, noise, clamp, clamp, width, height, width));
GET_PARAM(1)(s, noise, clamp, clamp, width, height, width));
// Check to make sure don't roll under.
for (int i = 0; i < image_size; ++i) {
@@ -100,15 +106,15 @@ TEST_P(AddNoiseTest, CheckCvsAssembly) {
uint8_t *const s = reinterpret_cast<uint8_t *>(vpx_calloc(image_size, 1));
uint8_t *const d = reinterpret_cast<uint8_t *>(vpx_calloc(image_size, 1));
ASSERT_TRUE(s != NULL);
ASSERT_TRUE(d != NULL);
ASSERT_NE(s, nullptr);
ASSERT_NE(d, nullptr);
memset(s, 99, image_size);
memset(d, 99, image_size);
srand(0);
ASM_REGISTER_STATE_CHECK(
GetParam()(s, noise, clamp, clamp, width, height, width));
GET_PARAM(1)(s, noise, clamp, clamp, width, height, width));
srand(0);
ASM_REGISTER_STATE_CHECK(
vpx_plane_add_noise_c(d, noise, clamp, clamp, width, height, width));
@@ -121,16 +127,24 @@ TEST_P(AddNoiseTest, CheckCvsAssembly) {
vpx_free(s);
}
INSTANTIATE_TEST_CASE_P(C, AddNoiseTest,
::testing::Values(vpx_plane_add_noise_c));
using std::make_tuple;
INSTANTIATE_TEST_SUITE_P(
C, AddNoiseTest,
::testing::Values(make_tuple(3.25, vpx_plane_add_noise_c),
make_tuple(4.4, vpx_plane_add_noise_c)));
#if HAVE_SSE2
INSTANTIATE_TEST_CASE_P(SSE2, AddNoiseTest,
::testing::Values(vpx_plane_add_noise_sse2));
INSTANTIATE_TEST_SUITE_P(
SSE2, AddNoiseTest,
::testing::Values(make_tuple(3.25, vpx_plane_add_noise_sse2),
make_tuple(4.4, vpx_plane_add_noise_sse2)));
#endif
#if HAVE_MSA
INSTANTIATE_TEST_CASE_P(MSA, AddNoiseTest,
::testing::Values(vpx_plane_add_noise_msa));
INSTANTIATE_TEST_SUITE_P(
MSA, AddNoiseTest,
::testing::Values(make_tuple(3.25, vpx_plane_add_noise_msa),
make_tuple(4.4, vpx_plane_add_noise_msa)));
#endif
} // namespace
+10 -10
@@ -7,7 +7,7 @@
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/i420_video_source.h"
@@ -20,9 +20,9 @@ class AltRefAqSegmentTest
public ::libvpx_test::CodecTestWith2Params<libvpx_test::TestMode, int> {
protected:
AltRefAqSegmentTest() : EncoderTest(GET_PARAM(0)) {}
virtual ~AltRefAqSegmentTest() {}
~AltRefAqSegmentTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(GET_PARAM(1));
set_cpu_used_ = GET_PARAM(2);
@@ -30,9 +30,9 @@ class AltRefAqSegmentTest
alt_ref_aq_mode_ = 0;
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
if (video->frame() == 1) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
encoder->Control(VP8E_SET_CPUUSED, set_cpu_used_);
encoder->Control(VP9E_SET_ALT_REF_AQ, alt_ref_aq_mode_);
encoder->Control(VP9E_SET_AQ_MODE, aq_mode_);
@@ -150,8 +150,8 @@ TEST_P(AltRefAqSegmentTest, TestNoMisMatchAltRefAQ4) {
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
}
VP9_INSTANTIATE_TEST_CASE(AltRefAqSegmentTest,
::testing::Values(::libvpx_test::kOnePassGood,
::libvpx_test::kTwoPassGood),
::testing::Range(2, 5));
VP9_INSTANTIATE_TEST_SUITE(AltRefAqSegmentTest,
::testing::Values(::libvpx_test::kOnePassGood,
::libvpx_test::kTwoPassGood),
::testing::Range(2, 5));
} // namespace
+22 -21
@@ -7,11 +7,12 @@
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/i420_video_source.h"
#include "test/util.h"
#include "vpx_config.h"
namespace {
#if CONFIG_VP8_ENCODER
@@ -24,24 +25,24 @@ class AltRefTest : public ::libvpx_test::EncoderTest,
public ::libvpx_test::CodecTestWithParam<int> {
protected:
AltRefTest() : EncoderTest(GET_PARAM(0)), altref_count_(0) {}
virtual ~AltRefTest() {}
~AltRefTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(libvpx_test::kTwoPassGood);
}
virtual void BeginPassHook(unsigned int /*pass*/) { altref_count_ = 0; }
void BeginPassHook(unsigned int /*pass*/) override { altref_count_ = 0; }
virtual void PreEncodeFrameHook(libvpx_test::VideoSource *video,
libvpx_test::Encoder *encoder) {
if (video->frame() == 1) {
void PreEncodeFrameHook(libvpx_test::VideoSource *video,
libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
encoder->Control(VP8E_SET_ENABLEAUTOALTREF, 1);
encoder->Control(VP8E_SET_CPUUSED, 3);
}
}
virtual void FramePktHook(const vpx_codec_cx_pkt_t *pkt) {
void FramePktHook(const vpx_codec_cx_pkt_t *pkt) override {
if (pkt->data.frame.flags & VPX_FRAME_IS_INVISIBLE) ++altref_count_;
}
@@ -63,8 +64,8 @@ TEST_P(AltRefTest, MonotonicTimestamps) {
EXPECT_GE(altref_count(), 1);
}
VP8_INSTANTIATE_TEST_CASE(AltRefTest,
::testing::Range(kLookAheadMin, kLookAheadMax));
VP8_INSTANTIATE_TEST_SUITE(AltRefTest,
::testing::Range(kLookAheadMin, kLookAheadMax));
#endif // CONFIG_VP8_ENCODER
@@ -75,17 +76,17 @@ class AltRefForcedKeyTestLarge
AltRefForcedKeyTestLarge()
: EncoderTest(GET_PARAM(0)), encoding_mode_(GET_PARAM(1)),
cpu_used_(GET_PARAM(2)), forced_kf_frame_num_(1), frame_num_(0) {}
virtual ~AltRefForcedKeyTestLarge() {}
~AltRefForcedKeyTestLarge() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(encoding_mode_);
cfg_.rc_end_usage = VPX_VBR;
cfg_.g_threads = 0;
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
encoder->Control(VP8E_SET_CPUUSED, cpu_used_);
encoder->Control(VP8E_SET_ENABLEAUTOALTREF, 1);
@@ -100,7 +101,7 @@ class AltRefForcedKeyTestLarge
(video->frame() == forced_kf_frame_num_) ? VPX_EFLAG_FORCE_KF : 0;
}
virtual void FramePktHook(const vpx_codec_cx_pkt_t *pkt) {
void FramePktHook(const vpx_codec_cx_pkt_t *pkt) override {
if (frame_num_ == forced_kf_frame_num_) {
ASSERT_TRUE(!!(pkt->data.frame.flags & VPX_FRAME_IS_KEY))
<< "Frame #" << frame_num_ << " isn't a keyframe!";
@@ -142,11 +143,11 @@ TEST_P(AltRefForcedKeyTestLarge, ForcedFrameIsKey) {
}
}
VP8_INSTANTIATE_TEST_CASE(AltRefForcedKeyTestLarge,
::testing::Values(::libvpx_test::kOnePassGood),
::testing::Range(0, 9));
VP8_INSTANTIATE_TEST_SUITE(AltRefForcedKeyTestLarge,
::testing::Values(::libvpx_test::kOnePassGood),
::testing::Range(0, 9));
VP9_INSTANTIATE_TEST_CASE(AltRefForcedKeyTestLarge,
::testing::Values(::libvpx_test::kOnePassGood),
::testing::Range(0, 9));
VP9_INSTANTIATE_TEST_SUITE(AltRefForcedKeyTestLarge,
::testing::Values(::libvpx_test::kOnePassGood),
::testing::Range(0, 9));
} // namespace
+11
@@ -10,6 +10,9 @@
# The test app itself runs on the command line through adb shell
# The paths are really messed up as the libvpx make file
# expects to be made from a parent directory.
# Ignore this file during non-NDK builds.
ifdef NDK_ROOT
CUR_WD := $(call my-dir)
BINDINGS_DIR := $(CUR_WD)/../../..
LOCAL_PATH := $(CUR_WD)/../../..
@@ -32,7 +35,11 @@ LOCAL_CPP_EXTENSION := .cc
LOCAL_MODULE := gtest
LOCAL_C_INCLUDES := $(LOCAL_PATH)/third_party/googletest/src/
LOCAL_C_INCLUDES += $(LOCAL_PATH)/third_party/googletest/src/include/
LOCAL_EXPORT_C_INCLUDES := $(LOCAL_PATH)/third_party/googletest/src/include/
LOCAL_SRC_FILES := ./third_party/googletest/src/src/gtest-all.cc
LOCAL_LICENSE_KINDS := SPDX-license-identifier-BSD
LOCAL_LICENSE_CONDITIONS := notice
LOCAL_NOTICE_FILE := $(LOCAL_PATH)/../../LICENSE $(LOCAL_PATH)/../../PATENTS
include $(BUILD_STATIC_LIBRARY)
#libvpx_test
@@ -47,6 +54,9 @@ else
LOCAL_STATIC_LIBRARIES += vpx
endif
LOCAL_LICENSE_KINDS := SPDX-license-identifier-BSD
LOCAL_LICENSE_CONDITIONS := notice
LOCAL_NOTICE_FILE := $(LOCAL_PATH)/../../LICENSE $(LOCAL_PATH)/../../PATENTS
include $(LOCAL_PATH)/test/test.mk
LOCAL_C_INCLUDES := $(BINDINGS_DIR)
FILTERED_SRC := $(sort $(filter %.cc %.c, $(LIBVPX_TEST_SRCS-yes)))
@@ -54,3 +64,4 @@ LOCAL_SRC_FILES := $(addprefix ./test/, $(FILTERED_SRC))
# some test files depend on *_rtcd.h, ensure they're generated first.
$(eval $(call rtcd_dep_template))
include $(BUILD_EXECUTABLE)
endif # NDK_ROOT
+6 -5
@@ -3,19 +3,20 @@ Android.mk will build vpx unittests on android.
./libvpx/configure --target=armv7-android-gcc --enable-external-build \
--enable-postproc --disable-install-srcs --enable-multi-res-encoding \
--enable-temporal-denoising --disable-unit-tests --disable-install-docs \
--disable-examples --disable-runtime-cpu-detect --sdk-path=$NDK
--disable-examples --disable-runtime-cpu-detect
2) From the parent directory, invoke ndk-build:
NDK_PROJECT_PATH=. ndk-build APP_BUILD_SCRIPT=./libvpx/test/android/Android.mk \
APP_ABI=armeabi-v7a APP_PLATFORM=android-18 APP_OPTIM=release \
APP_STL=gnustl_static
APP_STL=c++_static
Note: Both adb and ndk-build are available prebuilt at:
https://chromium.googlesource.com/android_tools
Note: Both adb and ndk-build are available at:
https://developer.android.com/studio#downloads
https://developer.android.com/ndk/downloads
3) Run get_files.py to download the test files:
python get_files.py -i /path/to/test-data.sha1 -o /path/to/put/files \
-u http://downloads.webmproject.org/test_data/libvpx
-u https://storage.googleapis.com/downloads.webmproject.org/test_data/libvpx
4) Transfer files to device using adb. Ensure you have proper permissions for
the target
+9 -8
@@ -38,7 +38,7 @@ def get_file_sha(filename):
buf = file.read(HASH_CHUNK)
return sha_hash.hexdigest()
except IOError:
print "Error reading " + filename
print("Error reading " + filename)
# Downloads a file from a url, and then checks the sha against the passed
# in sha
@@ -67,7 +67,7 @@ try:
getopt.getopt(sys.argv[1:], \
"u:i:o:", ["url=", "input_csv=", "output_dir="])
except:
print 'get_files.py -u <url> -i <input_csv> -o <output_dir>'
print('get_files.py -u <url> -i <input_csv> -o <output_dir>')
sys.exit(2)
for opt, arg in opts:
@@ -79,7 +79,7 @@ for opt, arg in opts:
local_resource_path = os.path.join(arg)
if len(sys.argv) != 7:
print "Expects two paths and a url!"
print("Expects two paths and a url!")
exit(1)
if not os.path.isdir(local_resource_path):
@@ -89,7 +89,7 @@ file_list_csv = open(file_list_path, "rb")
# Our 'csv' file uses multiple spaces as a delimiter, python's
# csv class only uses single character delimiters, so we convert them below
file_list_reader = csv.reader((re.sub(' +', ' ', line) \
file_list_reader = csv.reader((re.sub(' +', ' ', line.decode('utf-8')) \
for line in file_list_csv), delimiter = ' ')
file_shas = []
@@ -104,15 +104,16 @@ for row in file_list_reader:
file_list_csv.close()
# Download files, only if they don't already exist and have correct shas
for filename, sha in itertools.izip(file_names, file_shas):
for filename, sha in zip(file_names, file_shas):
filename = filename.lstrip('*')
path = os.path.join(local_resource_path, filename)
if os.path.isfile(path) \
and get_file_sha(path) == sha:
print path + ' exists, skipping'
print(path + ' exists, skipping')
continue
for retry in range(0, ftp_retries):
print "Downloading " + path
print("Downloading " + path)
if not download_and_check_sha(url, filename, sha):
print "Sha does not match, retrying..."
print("Sha does not match, retrying...")
else:
break

Some files were not shown because too many files have changed in this diff.