1007 Commits

Author SHA1 Message Date
Matt Oliver ef79796c8d project: Update for 1.15.1 merge. 2025-06-14 21:52:12 +10:00
Matt Oliver 8186b704b7 Merge commit '39e8b9dcd4696d9ac3ebd4722e012488382f1adb' 2025-06-14 21:02:34 +10:00
Matt Oliver bcecc98a79 project: Update for 1.15.0 merge. 2025-06-14 19:48:26 +10:00
Matt Oliver 70414c8112 Merge commit '9f9b7e9ba2eb9d01640a9e69a3d655866265cf7f' 2025-06-14 19:22:24 +10:00
Jerome Jiang 39e8b9dcd4 Bump VPX_EXT_RATECTRL_ABI_VERSION
Fix the order of the newly added codec control

Bug: webm:384672478
Change-Id: I045c58865399ea9d74c91c1a5521215a0d2032f7
2025-01-10 14:30:59 -05:00
Jerome Jiang 2f1ad02bd5 Add changelog for v1.15.1 release
Bug: webm:384672478
Change-Id: I3346b06eb05e306eb23f7281b65ff7e9c84e7e6b
2025-01-09 14:52:08 -05:00
Jerome Jiang 69da847f66 Bump up version major
Before v1.15.0: c=10, a=1, r=0

Rule #3: source code has changed, increment r:
r=1

Rule #4: interfaces were removed in vpx_tpl.h, set r=0, increment c:
c=11, r=0

Rule #5: no interfaces have been added

Rule #6: interfaces were removed in vpx_tpl.h, set a=0:
a=0

After release: c=11, a=0, r=0

major = c-a = 11
minor = a = 0
patch = r = 0

Bug: webm:384672478
Change-Id: I2e70e7e35c64ece32eaf1dc5625640965483f9b9
2025-01-07 14:17:40 -05:00
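The rule-by-rule bookkeeping in the "Bump up version major" message above can be sketched in C. This is a minimal sketch, assuming the libtool-style current/age/revision scheme that the message walks through. The names `LibtoolVersion`, `SoVersion`, `bump_version` and `to_so_version` are hypothetical and not part of libvpx, and applying rule #4 to added interfaces as well as removed ones is an assumption taken from the usual libtool convention.

```c
#include <assert.h>

/* Hypothetical types for illustration; not part of libvpx. */
typedef struct {
  int current;  /* c */
  int age;      /* a */
  int revision; /* r */
} LibtoolVersion;

typedef struct {
  int major, minor, patch;
} SoVersion;

/* Apply rules #3 to #6 in the order listed in the commit message. */
LibtoolVersion bump_version(LibtoolVersion v, int source_changed,
                            int interfaces_added, int interfaces_removed) {
  assert(v.current >= v.age && v.age >= 0 && v.revision >= 0);
  if (source_changed) v.revision++; /* rule #3 */
  if (interfaces_added || interfaces_removed) { /* rule #4 (assumed scope) */
    v.current++;
    v.revision = 0;
  }
  if (interfaces_added) v.age++;     /* rule #5 */
  if (interfaces_removed) v.age = 0; /* rule #6 */
  return v;
}

/* major = c - a, minor = a, patch = r, as stated in the message. */
SoVersion to_so_version(LibtoolVersion v) {
  SoVersion s;
  s.major = v.current - v.age;
  s.minor = v.age;
  s.patch = v.revision;
  return s;
}
```

Starting from c=10, a=1, r=0 with changed sources and removed interfaces, the sketch ends at c=11, a=0, r=0, which matches the values quoted in the message.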
James Zern 82a0c8a2db configure: add support for darwin24 (macOS 15.x)
Bug: webm:379534940
Change-Id: I8777b6bb8653a31080801e35916df9aa39a4c999
(cherry picked from commit 6e23d972a7a717f2ba3970c82b6b96d350b5bcde)
2024-11-19 22:03:26 +00:00
Matt Oliver bbedc9be7c project: Update template file for WinRT. 2024-11-09 19:38:13 +11:00
Jerome Jiang 9f9b7e9ba2 Changelog: add neon optimization speed up stats
Bug: webm:372498543
Change-Id: I297be5efb602b0181c2b25ff8b50060c10263130
2024-10-23 14:24:57 -04:00
Jerome Jiang 0ba09cc79f Update CHANGELOG and version
Bug: webm:372498543
Change-Id: Ieddfa0b18f8c5e53ab65e04b52b5a601c672ba62
2024-10-22 15:36:11 -04:00
James Zern 3939c5ebb0 vpx_highbd_convolve8_avg_sve2: fix C fallback typo
vpx_highbd_convolve8_c -> vpx_highbd_convolve8_avg_c

Change-Id: I8bc73c59d3e654739ee5c42a295f4ecdee6d7631
(cherry picked from commit 3500e57e52b6af057ed54223e15e560a95df8479)
2024-10-10 13:32:53 -04:00
Jerome Jiang 816a90fe76 Update AUTHORS and .mailmap
Bug: webm:372498543
Change-Id: I5041fd1558c0a36dce10395ec6e836f3d55384dc
2024-10-09 15:18:53 -04:00
Marco Paniconi 192b4a4ce7 rtc-vp9: Always disable svc_use_low_part
Possible fix for issue below. It was only disabled
for screen in a previous change, but we force it off
always to check if it clears the issue.

The speed feature disabled is only used for 3 spatial
layers and at least 2 temporal. The impact on speed is
expected to be small, ~2%, so ok to disable for now and
see if it clears the issue.

Bug: 366146260
Change-Id: If7af006425e1e0ef297b9d6466507ea4c90ddb6f
(cherry picked from commit 09b3d5fc5aa48752f95f4c0c37b0bd4ff55c0ba1)
2024-10-09 15:07:31 -04:00
Marco Paniconi cdd4e35015 vp8: Fix integer overflow in encode_frame_to_data_rate
Integer overflow in encode_frame_to_data_rate()
for the update:
lc->total_target_vs_actual += bits_off_for_this_layer

Fix is to use int64_t for total_target_vs_actual.

Bug: chromium:368114043
Change-Id: I9a01e1a69e26ae748e8ae23d9e1287431510388d
2024-09-20 09:50:41 -07:00
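The fix described in the entry above (widening `total_target_vs_actual` to `int64_t`) can be illustrated with a small sketch. The struct `LayerContextSketch` and the function `add_bits_off` are hypothetical stand-ins, not the actual VP8 layer context or encoder code; only the field name and the `int64_t` type come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-layer state; not the real VP8 struct. */
typedef struct {
  int64_t total_target_vs_actual; /* widened from int, per the commit */
} LayerContextSketch;

/* Accumulate a per-frame delta. With a 64-bit accumulator, the running
 * total can pass INT_MAX without signed 32-bit overflow. */
void add_bits_off(LayerContextSketch *lc, int bits_off_for_this_layer) {
  assert(lc != NULL);
  lc->total_target_vs_actual += bits_off_for_this_layer;
}
```

With a plain `int` accumulator the same update would be undefined behavior once the running total passed `INT_MAX`.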
Wan-Teh Chang aa73610d03 Fix a typo: avg_frame_index => avg_frame_qindex
Change-Id: I8fd9f6f01ae712a9bf3dc9e34fe5f7115a305109
2024-09-19 00:09:36 -07:00
Marco Paniconi 417204d7fd rtc-vp9: Fix to integer overflow in vp9-svc
Divide by 3 instead of multiply by 3, in comparison of
lrc->avg_frame_bandwidth vs lrc->last_avg_frame_bandwidth,
in two functions for reset rc.
Small loss in precision, so acceptable.

Similar change to:
https://chromium-review.googlesource.com/c/webm/libvpx/+/5698570

Bug: chromium:367892770
Change-Id: Ia9ef09a9f6beba930fedd496407cfa7057e39336
2024-09-18 14:33:49 -07:00
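The rearrangement described in the entry above, dividing one side of a comparison by 3 rather than multiplying the other side by 3, can be sketched as follows. The function `exceeds_triple` is hypothetical; the real comparison in the rate control reset code may involve other scale factors, so this only shows the overflow-avoiding idea and its small precision loss.

```c
#include <assert.h>
#include <limits.h>

/* Approximates `value > 3 * reference` without ever forming
 * 3 * reference, which could overflow when reference is near INT_MAX.
 * Integer division truncates, so results can differ from the exact
 * comparison for values just above 3 * reference. */
int exceeds_triple(int value, int reference) {
  assert(value >= 0 && reference >= 0);
  return value / 3 > reference;
}
```

For example, 301 is greater than 3 * 100, but 301 / 3 truncates to 100, so the sketch reports false. This is the kind of precision loss the message calls acceptable.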
James Zern ac68e7f999 aarch64_cpudetect: detect SVE/SVE2 on Windows
PF_ARM_SVE_INSTRUCTIONS_AVAILABLE and PF_ARM_SVE2_INSTRUCTIONS_AVAILABLE
are available in WinSDK 10.0.26100 and recent versions of mingw-w64.

Based on a patch by Martin Storsjö on ffmpeg-devel:
https://ffmpeg.org/pipermail/ffmpeg-devel/2024-September/333611.html

Change-Id: I34b2341a559f95aa400e84d709f3eb36da5dbb7b
2024-09-18 19:43:10 +00:00
James Zern 729b78a127 aarch64_cpudetect: detect I8MM on Windows via SVE-I8MM
There's no direct processor feature constant for I8MM alone, but
there is a flag for SVE-I8MM (added in WinSDK 10.0.26100 and
recent versions of mingw-w64). If SVE-I8MM is available, we can
assume that I8MM is available.

While HW supporting these features isn't yet commonly running
Windows, this at least allows detecting and running the I8MM codepaths
in Windows builds in Wine (possibly running in QEMU).

Based on patch from Martin Storsjö on ffmpeg-devel:
https://ffmpeg.org/pipermail/ffmpeg-devel/2024-September/333609.html

Change-Id: I77117bee8516924fddcdecccae8bab3cf5beed96
2024-09-18 19:43:10 +00:00
James Zern 6dfdc4ee10 tiny_ssim: fix argc check
The program requires a minimum of 2 parameters. Previously the tool
would crash if only one input file was given.

Bug: webm:365481206
Change-Id: I875d81b2db4fcc4338061c03b23bb51b0aad58e4
2024-09-18 19:06:17 +00:00
Marco Paniconi 696c488d35 rtc-vp9: Disable svc_use_low_part for screen
Possible fix for issue below. The speed feature disabled
is only used for 3 spatial layers and at least 2 temporal.
The impact on speed is expected to be small, ~2%, so ok
to disable for now and see if it clears the issue.

Bug: 366146260

Change-Id: I94ab991d583cc2ce758db337abbbb463a65f0767
2024-09-17 23:07:24 +00:00
Jerome Jiang c6de95ce0e Initialize gf_picture in vp9 tpl
Bug: b/365068397
Change-Id: Id267532928353148f916b73feb19de515db14cb9
2024-09-06 13:02:12 -04:00
James Zern 3ba1fada8b vpx_image.h: add lifetime note for img_data
The wrapped storage must exist for the duration of the vpx_image_t
allocation.

Bug: aomedia:363806063
Change-Id: Ic6b79a56b6c07776222d1767490d873d7408ced0
2024-09-03 19:19:09 -07:00
Marco Paniconi fbf63dff1f vp9: clamp the calculation of sb64_target_rate to INT_MAX
Bug: b/361617762
Change-Id: Ie7d2b0973e6de23d6e992ee058cbb94b826fda65
2024-08-28 14:48:57 +00:00
James Zern 507aea8e29 vp9_speed_features.h: fix partition_search_type comment
FIXED_SIZE_PARTITION -> FIXED_PARTITION

Change-Id: I5e6a561042d7dfa87d6f11b033052d340e433440
2024-08-27 17:38:43 -07:00
James Zern 50aa6cca4d README: add security report note
The default template for https://issues.webmproject.org/ is a public bug
report. Security issues can be reported securely using the 'Security
report' template.

Change-Id: Ic7144a6c7a144772b78852d1415a51a570c79d50
2024-08-26 15:30:01 -07:00
Wan-Teh Chang f00fa3ce74 Add macro name as comment for header guard #endif
Change-Id: I948f21f414fc269ad03673636506fa83acf5f5f6
2024-08-23 00:35:45 +00:00
Wan-Teh Chang 35b908f808 Add #ifndef header guard to vpx_version.h
Change-Id: Ief028037a3a56b1f18998298ad594a86cf906bd3
2024-08-22 14:40:16 -07:00
James Zern 2c778f4da6 remove vp9_{highbd_,}resize_frame*()
and examples/resize_util.c. These functions were added in:
  3cd37dfeb Adds a non-normative resize library to vp9 encoder
but never used meaningfully in the library.

This mirrors the change in libaom:
  d10029bb4b Restore function prototype of av1_resize_frame420
except that vp9_resize_frame420() was never exported in the shared
library, so can be deleted along with the rest.

The reasoning for removing examples/resize_util.c is the same: it is not
useful and examples should use the public functions of the libvpx
library.

Change-Id: I386080d3f1a3ef81dfc87fcdf5bbdf459d996f03
2024-08-21 19:22:48 -07:00
James Zern 312a9004c1 remove vpx_ssim_parms_16x16()
The last reference to this function was removed in:
5511968f2 Removed several unused functions.

Change-Id: I644482b4f0c9c4765035adcdc21ec495e3e3a6e6
2024-08-20 17:47:07 -07:00
Yunqing Wang a5ea71f091 Key frame temporal filtering
Added key frame temporal filtering. Enabled it for VOD encoding
with encoder speed < 2.
Minor improvement in prediction.
Added the restriction of using no more than "arnr_max_frames"
frames for temporal filtering.
Key frame temporal filtering is turned off by default for now. To
enable it, set "--enable-keyframe-filtering=1"

Borg result with "--enable-keyframe-filtering=1"
         avg_psnr:  ovr_psnr:   ssim:    vmaf:
hdres2:   -0.762     -0.863    -0.903   -0.680
midres2:  -0.813     -0.753    -0.757   -0.743
lowres2:  -0.492     -0.598    -0.737   -0.881
The impact on the encoder time is minimal.

Change-Id: If6abea3e21efcb96f1978cd9dfaa742c40dc2a56
2024-08-19 17:59:58 +00:00
Jerome Jiang 5d20cc3081 IWYU: include vp9_ext_ratectrl.h for tpl
Change-Id: Ia00dc7f79a69eb73c85fb409418861bef459e863
2024-08-19 11:05:38 -04:00
Jerome Jiang ee2552d903 vp9 ext rc: TPL & final encoding use same QP
Removed codec control VP9E_ENABLE_EXTERNAL_RC_TPL since it is
no longer needed.

Change-Id: I151254ff3f0496c017ddf73c2caf94783ef38f31
2024-08-16 17:16:29 -04:00
Jerome Jiang a69eeb0af2 ext rc: Override encode QP in TPL pass for VBR
Change-Id: I8f32b5847b57313d00401f5596ed62ac7c4817f0
2024-08-16 16:50:21 -04:00
Jerome Jiang d9d6c5e2c9 Remove ext_rc_recode
This flag is always set to 0

Change-Id: I228b3befae478517e7b31228d4a6553af4fd7a27
2024-08-16 13:54:02 -04:00
James Zern 95568674c2 remove redundant && __GNUC__ preproc check
`#if defined(__GNUC__)` is enough if a specific version isn't being
looked for.

Bug: aomedia:356832974
Change-Id: I3fcbecf9d547c6a2d89d7b5456e83ee08ddc6f5e
2024-08-16 16:43:45 +00:00
Yunqing Wang fcd1f39e56 Improve temporal filter prediction and process
Applied 12-tap filter to temporal filter prediction for better
result. Improved the calculation of frames to be used in temporal
filtering.

The overall PSNR gain was -0.511% (lowres), -0.338% (midres), and
-0.288% (hdres).
Encoder time was increased by ~2%, which would be largely reduced
by the following SIMD optimization.

Change-Id: If3ece30f1614beadc99ebf6b4dc3f2d988d3bdb9
2024-08-09 23:05:32 +00:00
Jerome Jiang 13be4a7190 Remove a stale TODO in ext RC
Change-Id: Ie871a476a7a0b04cf88db17da8402dad1c3247f7
2024-08-09 18:02:00 +00:00
Wan-Teh Chang b222d72285 Add the saturate_cast_double_to_int() function
Move the saturate_cast_double_to_int() function in
vp8/encoder/firstpass.c to vpx_dsp/vpx_dsp_common.h so that it can be
used in other files.

Change-Id: I748fea969520542dca68d7a46500d3272f22e16f
2024-08-08 11:42:03 -07:00
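A saturating double-to-int conversion of the kind named in the entry above might look like the following. This is a sketch under assumptions, not the body of the libvpx helper: the name `saturate_double_to_int_sketch` is hypothetical, and the handling of negative values and NaN is a choice made here for illustration.

```c
#include <limits.h>

/* Convert a double to int, clamping to the int range instead of
 * invoking undefined behavior on out-of-range values. */
int saturate_double_to_int_sketch(double d) {
  if (d != d) return 0; /* NaN: returning 0 is an assumption of this sketch */
  if (d >= (double)INT_MAX) return INT_MAX;
  if (d <= (double)INT_MIN) return INT_MIN;
  return (int)d; /* in range: truncates toward zero */
}
```

A plain `(int)d` cast is undefined when `d` is outside the range of `int`, which is what sanitizers such as float-cast-overflow report.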
Jerome Jiang c18b9f7c68 Add min/max q index to ext rc config
Change-Id: I5d152f3b0868e78c6b33fe651c6a40597b42feef
2024-08-08 10:24:29 -04:00
James Zern 634e1f8fb1 vp9_calc_iframe_target_size_one_pass_cbr: clamp final target
to INT_MAX. This matches calc_iframe_target_size() in VP8
(http://crbug.com/1473473). If rc->avg_frame_bandwidth is large even
small kf_boost values will overflow an int.

Change-Id: Iaca5b47fe97793ae70930b3b2c2f42725d2c96fb
2024-08-08 02:37:47 +00:00
James Zern bb95d3652b update libwebm to libwebm-1.0.0.31-10-g3b63004
This fixes a build error seen in gcc 15:
3b63004 mkvparser/mkvparser.cc: add missing <cstdint> include

Bug: aomedia:357622679
Change-Id: I6c4a1795d189f9993d4f2c5c9f0375912bc58f0c
2024-08-06 11:31:25 -07:00
Wan-Teh Chang 428f3104fa Include "gtest/gtest.h" using the shorter path
Rely on the -I or -isystem compiler option to find "gtest/gtest.h". This
makes it easier to build our tests against a copy of gtest outside the
libvpx source tree.

Bug: webm:42330726
Change-Id: I3b189c6345e13b36b236d1eedc6ee091bfa71f48
2024-08-02 22:42:20 +00:00
Jerome Jiang 1865f20e9a Extend border for vp8 loopfilter
Bug: webm:356482713
Change-Id: I149d077a57d55c46fe1924cff4c5cfcf5c7609b0
2024-08-02 14:59:58 -04:00
Wan-Teh Chang 9f06827eeb Run clang-format on three files
Change-Id: I055186d915d4660e848f6d856d7895953aaf76ba
2024-08-02 07:03:40 -07:00
James Zern 0c4af6b4c1 vpx_fdct16x16_avx2: add missing cast
Fixes:
vpx_dsp/x86/fwd_txfm_avx2.c:378:50: error: incompatible pointer types
  passing 'int16_t *' (aka 'short *') to parameter of type
  'tran_low_t *' (aka 'int *') [-Werror,-Wincompatible-pointer-types]

Change-Id: I9f50547c1fc885c24b4b91e4c7d6857d397cceed
2024-08-02 00:14:58 +00:00
James Zern b5451de5c5 vp9_extrc_update_encodeframe_result: normalize decl & def
Fixes compiler warning in visual studio after:
2ab292e9e Remove unused parameters from ext rc callback

vp9\encoder\vp9_ext_ratectrl.c(186): warning C4028: formal parameter 3
different from declaration

Change-Id: I4cfddb3f55fb7191ebaf578851ab3bc2c55106e3
2024-08-02 00:14:58 +00:00
James Zern 4295bf4f0f Update third_party/libwebm to commit f4b07ec
Change-Id: I18ff0e388d3c8b683385d98d76bff3e238488a94
2024-08-01 13:38:00 -07:00
Jerome Jiang 2ab292e9e1 Remove unused parameters from ext rc callback
Bug: b/356424505
Change-Id: I1c684e7f4cc9bb7b916354d391abd1ae168af39f
2024-07-31 22:03:08 +00:00
James Zern 3cc287bbd7 vpx_scale,scale1d_c: add assert(dest_scale != 0)
This fixes a 'division by zero' static analysis report (seen with
clang-14).

Bug: b:328632178
Change-Id: I4c051631ff1a948e8f83a831286e01fc50ff1c1d
2024-07-31 18:25:19 +00:00
James Zern 8db1b663e2 vp9_subexp,remap_prob: add an assert
Fixes a 'Result of operation is garbage or undefined' static analysis
report (seen with clang-14) related to left shifting a negative value.

Bug: b:328632178
Change-Id: I18f0100eca0deac1cac9be0c7e848685d2911fb3
2024-07-30 14:54:01 -07:00
James Zern f987e3514c doxygen: quiet warnings in decoder-only config
Fixes:
warning: explicit link request to 'VP9E_SET_EXTERNAL_RATE_CONTROL' could
not be resolved

Change-Id: If7a0d97412cc8fad3457031fbf29cb447635f4a0
2024-07-30 17:55:49 +00:00
James Zern 85d386599d systemdependent.c: fix warning w/CONFIG_MULTITHREAD=0
fixes:
vp8/common/generic/systemdependent.c: In function
   'vp8_machine_specific_config':
vp8/common/generic/systemdependent.c:63:46: warning: unused parameter
   'ctx' [-Wunused-parameter]
    63 | void vp8_machine_specific_config(VP8_COMMON *ctx) {

Change-Id: I0eeaa0c27ccfa901cc62150eed590f5056eb9238
2024-07-29 13:23:58 -07:00
James Zern cdf8da4c03 vp8: fix OOB access in x->MVcount
Motion vectors are now clamped in
vp8_find_best_sub_pixel_step_iteratively, vp8_find_best_sub_pixel_step,
vp8_find_best_half_pixel_step, vp8_full_search_sad,
vp8_refining_search_sadx4 and vp8_refining_search_sad_c (the rtcd for
other optimizations are redirects to vp8_refining_search_sadx4).

The difference of valid motion vectors may still go beyond the range of
the MVcount array, however, so additional checks are added to
rd_update_mvcount() and update_mvcount().

Note the test source and settings (speed 1 and GOOD quality mode) come
from the issue report; additional coverage is added for realtime. The
realtime path does not trigger the error without the fix, but as it's
similar to the rd path, the same clamp is done to be safe.

Fixes:
vp8/encoder/rdopt.c:1579:5: runtime error: index 17467 out of bounds for
  type 'unsigned int[2047]'

Bug: oss-fuzz:69906
Change-Id: Ia8bd087cfe4475ab09ba711ed806fbcbaa72e552
2024-07-25 15:08:02 -07:00
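The additional range check described in the entry above can be sketched as a guarded histogram update. The array length 2047 comes from the sanitizer report quoted in the message. The constant names, the function `count_mv_diff`, and the way the index is derived from the motion vector difference (halving and offsetting by 1023) are assumptions for illustration and may not match `update_mvcount()` exactly.

```c
#define MV_MAX_SKETCH 1023                     /* assumed center offset */
#define MV_VALS_SKETCH (2 * MV_MAX_SKETCH + 1) /* 2047 entries */

/* Count one motion vector difference, skipping values whose index
 * would fall outside the histogram. Returns 1 if counted, 0 if skipped. */
int count_mv_diff(unsigned int counts[MV_VALS_SKETCH], int mv_diff) {
  const int idx = MV_MAX_SKETCH + mv_diff / 2;
  if (idx < 0 || idx >= MV_VALS_SKETCH) return 0;
  counts[idx]++;
  return 1;
}
```

Under these assumptions a difference of 32888 maps to index 17467, the out-of-bounds index in the report, and is skipped rather than written.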
James Zern f9120b789d vp8,calc_iframe_target_size: clamp kf_boost
cpi->output_framerate may be as large as 10M. Previously this would
cause kf_boost to be ~20M which would overflow an int when multiplied by
values in kf_boost_qadjustment[].

Fixes:
vp8/encoder/ratectrl.c:340:25: runtime error: signed integer overflow:
  19999984 * 220 cannot be represented in type 'int'

Bug: oss-fuzz:69100
Change-Id: I2d77c9d2912412f6265f6a8dc0e6b361b63b8242
2024-07-25 19:43:53 +00:00
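The overflow quoted in the entry above (19999984 * 220) can be avoided by clamping the boost before it is multiplied. The following is a minimal sketch of that idea. The function `clamp_boost_for_multiplier` and the choice of `INT_MAX / max_multiplier` as the limit are assumptions, not the bound actually used in `calc_iframe_target_size()`.

```c
#include <assert.h>
#include <limits.h>

/* Clamp kf_boost so that kf_boost * max_multiplier still fits in an int. */
int clamp_boost_for_multiplier(int kf_boost, int max_multiplier) {
  assert(kf_boost >= 0 && max_multiplier > 0);
  const int limit = INT_MAX / max_multiplier;
  return kf_boost > limit ? limit : kf_boost;
}
```

With a multiplier of 220, any boost above `INT_MAX / 220` is reduced to that limit, so the later product stays representable in an `int`.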
Jingning Han d63ecb4117 Reset the ref_table array for the key frame GOP
Change-Id: Idda6ad9352d4c74dcbe8f2b6e1615d10e958e4c8
2024-07-24 16:02:39 -07:00
Jingning Han f809c987b5 Remove repeated ref_frame assignments
Change-Id: I0daa5a40489ce14582cb6a1c2816df354f1134f9
2024-07-24 16:01:24 -07:00
Bohan f96deb0bb4 Add tpl propagation with updated ref_frame idx
Change-Id: I6fcef44a90fc434e18447964aa1b4585c7f62310
2024-07-24 18:57:29 +00:00
Wan-Teh Chang 3fb0e5d75d Remove unneeded cpi->output_framerate assignment
The assignment "cpi->output_framerate = cpi->framerate;" after the
vp8_new_framerate() call is not needed, because vp8_new_framerate() sets
cpi->framerate and cpi->output_framerate to the same value.

Change-Id: I4de97b43957142d658e0c08ecfc6628844ce453a
2024-07-23 15:22:55 -07:00
Angie Chiang 057e53d759 Small refactoring in vp9_firstpass.c
Change-Id: If5e76b05f584650ff675363e6eb347bedae7728c
2024-07-19 21:38:08 +00:00
James Zern 9a1e8ae7aa README: add link to issue tracker
Change-Id: Ic8bc0167e5d1975e006135e20afacf27ee6badcf
2024-07-18 23:46:57 +00:00
James Zern efe615f804 add repro for crbug.com/352414650
+ fix an additional double -> int overflow warning (chrome's fuzzers do
  not have the float-cast-overflow sanitizer enabled)

Bug: chromium:352414650
Change-Id: I634bb421a74236eac434df138ed71dadf197596a
2024-07-18 13:36:10 -07:00
Marco Paniconi 3219f76cea Remove printf warning statements in set_size_literal()
Bug: b/347890801
Change-Id: I78c8dd0907d54f6cd1d3972ea6c3897f4b0c5adc
2024-07-15 11:33:26 -07:00
Wan-Teh Chang 72018e8c74 Some cleanup in vbr_rate_correction()
The only real change is in the initialization of frame_window. The (int)
cast is moved to the result of VPXMIN(), so that
cpi->twopass.total_stats.count - cpi->common.current_video_frame is
calculated in double.

Change-Id: Ia80f24614af7184b37cfdd99d8a8b1639460f273
2024-07-13 00:16:11 +00:00
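The cast placement described in the entry above can be shown with a short sketch: the subtraction is done in double, the minimum is taken, and only the small result is converted to int. The macro `MIN_SKETCH`, the function `frame_window_sketch`, and the window cap of 16 are assumptions for illustration rather than the exact code in `vbr_rate_correction()`.

```c
#define MIN_SKETCH(a, b) (((a) < (b)) ? (a) : (b))

/* Remaining frames, capped at 16. The cast to int is applied to the
 * result of the minimum, so a very large frame count is never
 * converted to int directly. */
int frame_window_sketch(double total_frame_count,
                        unsigned int current_frame) {
  return (int)MIN_SKETCH(16.0, total_frame_count - (double)current_frame);
}
```

Casting the difference first and taking the minimum afterwards would convert an out-of-range double to int whenever the frame count is very large.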
James Zern 77974ec041 vp9_svc_adjust_avg_frame_qindex: fix int overflow
rc->avg_frame_bandwidth is capped at INT_MAX. Rather than multiply the
value by 3, divide projected_frame_size by 3 to avoid the overflow.
Without rounding this differs slightly from the original, but loss of
precision is acceptable in this case.

Bug: chromium:348440590
Change-Id: Id5960825c79d7c764d257e9b4bd0a1de751878d8
2024-07-11 17:34:57 -07:00
Wan-Teh Chang a40848c80f Do not include vpx_version.h
Replace the VERSION_STRING_NOSP macro by the public API function
vpx_codec_version_str().

Treat vpx_version.h as an absolutely internal header of the libvpx
library.

Change-Id: I86ba8548a62adae91ae7f5caad98169707f3fc64
2024-07-09 16:57:20 -07:00
Angie Chiang 1640ed4089 Turn off frame_stats == NULL error.
This change happens in define_gf_group().
Since this part is not critical for ext_ratectrl,
turn off the error reporting for now.

Change-Id: Ie74aa06a116edb8c5d9e7b29cadbd366232fbc1d
2024-07-09 13:35:00 -07:00
Wan-Teh Chang 066ea57e3d Fix unused function warnings in real-time only
The compare_fp_stats() and compare_fp_stats_md5() functions are not used
when CONFIG_REALTIME_ONLY is equal to 1. Define these functions only if
CONFIG_REALTIME_ONLY is 0 to avoid the -Wunused-function warnings.

Change-Id: Iaae208f67708cfaeee5304b0320ebce63c863f96
2024-07-08 14:31:23 -07:00
Jingning Han 7cc7bbba1f Allow TPL group to reference more frames
Allow the TPL group to use up to 3 reference frames from the
previous GOP. This slightly changes the coding stats in the range
of <0.1%.

STATS_CHANGED

Change-Id: Ieb4e948a783bf8ef9ca78717d56ff750f3f795a4
2024-07-08 17:02:15 +00:00
Wan-Teh Chang 4ac9c4ba32 Fix int cast errors in vp8 on max target bitrate
Fix double-to-int cast overflows in vp8 code caused by setting the
target bitrate to the maximum value (2000000).

Tested: Build libvpx with UndefinedBehaviorSanitizer and then run
./vpxenc husky.yuv -o AV1_husky_2000000_10000000_10000000.webm --good \
  --cpu-used=2 -v -t 0 -w 352 -h 288 --fps=10000000/10000000 \
  --target-bitrate=2000000 --limit=150 --test-decode=fatal --passes=2 \
  --lag-in-frames=25 --min-q=0 --max-q=63 --arnr-maxframes=7 \
  --arnr-strength=5 --kf-max-dist=9999 --undershoot-pct=100 \
  --overshoot-pct=100 --bias-pct=50 --codec=vp8

Note: This is essentially the VP8 version of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/191361.

Bug: 349440066
Change-Id: Ia43e1aad8fcab60ace49da960579081c2c3a5445
2024-07-03 17:09:52 +00:00
Wan-Teh Chang 27c39522f5 vpxenc.c: Fix UBSan integer errors in test_decode
Fix the following UBSan integer errors in test_decode():
vpxenc.c:1589:57: runtime error: implicit conversion from type 'int' of
value -16 (32-bit, signed) to type 'unsigned int' changed the value to
4294967280 (32-bit, unsigned)
vpxenc.c:1590:58: runtime error: implicit conversion from type 'int' of
value -16 (32-bit, signed) to type 'unsigned int' changed the value to
4294967280 (32-bit, unsigned)

Tested: Build libvpx with -fsanitize=integer and then run
./vpxenc husky.yuv -o AV1_husky_2000000_10000000_10000000.webm --good \
  --cpu-used=2 -v -t 0 -w 352 -h 288 --fps=10000000/10000000 \
  --target-bitrate=2000000 --limit=150 --test-decode=fatal --passes=2 \
  --lag-in-frames=25 --min-q=0 --max-q=63 --arnr-maxframes=7 \
  --arnr-strength=5 --kf-max-dist=9999 --undershoot-pct=100 \
  --overshoot-pct=100 --bias-pct=50 --codec=vp8

Bug: 349440066
Change-Id: Ice2f0e7176ffec664856559e2c02bd51113c4d74
2024-07-03 16:26:46 +00:00
Wan-Teh Chang a396ac214d Fix unsigned int overflow in init_rate_histogram()
Tested: Build libvpx with -fsanitize=integer and then run
./vpxenc husky.yuv -o AV1_husky_2000000_10000000_10000000.webm --good \
  --cpu-used=2 -v -t 0 -w 352 -h 288 --fps=10000000/10000000 \
  --target-bitrate=2000000 --limit=150 --test-decode=fatal --passes=2 \
  --lag-in-frames=25 --min-q=0 --max-q=63 --min-gf-interval=4 \
  --max-gf-interval=22 --arnr-maxframes=7 --arnr-strength=5 \
  --kf-max-dist=9999 --aq-mode=0 --undershoot-pct=100 \
  --overshoot-pct=100 --bias-pct=50

This unsigned integer overflow seems to be caused by
g_timebase.num=1000000.

Note: This is a port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/191401.

Bug: 349440066
Change-Id: I924fa9c653400764dd7320938b88b4ea40f38172
2024-07-02 15:25:11 -07:00
Wan-Teh Chang af599a0c5f Fix further overflow issue in VBR.
This patch fixes some additional cases where under extreme conditions
some of the VBR adjustment variables can wrap.

As this happens on a per frame level the extra saturation checks should
not be an issue for performance.

Note: This CL is a port of the following libaom CLs:
https://aomedia-review.googlesource.com/c/aom/+/190521
https://aomedia-review.googlesource.com/c/aom/+/190888

Change-Id: I87c4ecca10f39767002f7d90d0f43b19c7150832
2024-06-28 21:29:04 -07:00
Wan-Teh Chang ac117ca7f9 Remove static from vars in parse_stream_params()
Those variables in parse_stream_params() don't need to be function-scope
static variables.

Change-Id: I5e0b0f78deb0aa8b4f95dcd2352d89342b9d528a
2024-06-27 17:30:01 -07:00
Angie Chiang 2693255a25 Let vp9_ext_ratectrl get the key frame decision
BUG = b/347936295

Change-Id: Ic152d3f6873ebe02977c1ede6a5663bd5f9be363
2024-06-21 11:21:18 -07:00
Marco Paniconi 253d6365e3 rtc-vp9: Allow scene detection for all speeds
Current code was disallowing scene detection for
speeds >= 8, to avoid any encode_time increase
(see comment in the code).

But we can expect the cost to be small even at speed 8,9,
and that concern on encode_time was from some time ago
before 8 and 9 were further optimized. And this is
needed for content with scene changes (see issue attached).
So allow scene detection now for all RTC speed settings (speed >= 5).

Bug: b/346846607
Change-Id: I678dbb88ff1399ed89b2bf9770ae9427e3044fc4
2024-06-18 16:58:00 +00:00
James Zern f07ca82f7a set_analyzer_env.sh: remove -fno-strict-aliasing
The last reference to the flag in configure was removed in:
fad70a358 Remove -fno-strict-aliasing flag

The library should be expected to function without this flag; it's built
and tested elsewhere without it.

Bug: webm:570, webm:603
Change-Id: Icf85fd9bd5c9cb0c81d6eecf10fba07807f48b4a
2024-06-14 12:16:33 -07:00
James Zern d6ae3ea465 rtcd.pl: add license header to generated files
Bug: aomedia:3525
Change-Id: I614056558fb5439b448342e0c01e53bd8da85585
2024-06-13 11:54:43 -07:00
Angie Chiang 68deb7ee20 Add missing header in vp9_firstpass.c
Change-Id: I675fa2b74b567e47f2a8fe2a7e4b4d3e77880d13
2024-06-12 14:29:08 -07:00
Angie Chiang ff67a4f209 Fix typo of received again
Change-Id: I6df009ec0423c2ef244399107c968ae1255337e5
2024-06-12 14:17:00 -07:00
Angie Chiang 277a5cdaa4 Remove redundant setting of max_layer_depth.
Change-Id: Ide2b6852339471b8e82109c846ba24fe7dc94aaa
2024-06-12 12:30:41 -07:00
Angie Chiang 2ca6e875c3 Typo recieved -> received
Change-Id: I140b5c2a5cefc346b3961dad09fd145d85d44d17
2024-06-12 10:28:12 -07:00
James Zern fb01e53c98 configure: add -c to ASFLAGS for android + AS=clang
The GNU Assembler was removed in r24. clang's internal assembler works,
but `-c` is necessary to avoid linking.

Bug: webm:1856
Change-Id: I61f80cf78657d3b71d5e73c5b2510575533ca5ea
2024-06-11 22:55:22 +00:00
James Zern b0c9d0c6fe configure: remove unused NM & RANLIB variables
+ update list in README

Change-Id: I363e9bc36b2e160de43d0fbcba4700297a582549
2024-06-11 22:55:22 +00:00
Angie Chiang ed95b102c4 Move ext_rc_define_gf_group_structure
Move the function into define_gf_group().
define_gf_group() has a lot of settings that might cause
performance drop if skipped.

Imitate define_gf_group_structure()'s behavior, which adds
an extra overlay frame at the end of gf_group whenever
alt_ref is used.

After this change, we can feed the baseline decision through
webmrc and get the same result as baseline.

This CL is tested with city_cif.yuv using ffmpeg

BUG = b/345528565

Change-Id: Ib61f0a0a72251f8662fb4072e0cfd7f456a243b3
2024-06-11 20:19:04 +00:00
James Zern 271b3f0bf0 tiny_ssim: mv read error checks closer to assignment
Quiets some spurious -Wmaybe-uninitialized warnings with gcc 14.1.0.

In function 'calc_plane_error16',
    inlined from 'main' at ../tools/tiny_ssim.c:464:5:
../tools/tiny_ssim.c:37:12: warning: 'v[0]' may be used uninitialized
  [-Wmaybe-uninitialized]
   37 |   if (orig == NULL || recon == NULL) {
      |            ^
In function 'calc_plane_error16',
    inlined from 'main' at ../tools/tiny_ssim.c:462:5:
../tools/tiny_ssim.c:37:12: warning: 'u[0]' may be used uninitialized
  [-Wmaybe-uninitialized]
   37 |   if (orig == NULL || recon == NULL) {
      |            ^
In function 'calc_plane_error',
    inlined from 'main' at ../tools/tiny_ssim.c:461:5:
../tools/tiny_ssim.c:61:12: warning: 'y[0]' may be used uninitialized
  [-Wmaybe-uninitialized]
   61 |   if (orig == NULL || recon == NULL) {

To reduce confusion, read_input_file() is changed to return an int as
previously it would only return (size_t)-1/0/1 (and now returns 0/1).

Change-Id: I2344048ecc2bd233891ffcef08002ee98d6d262a
2024-06-10 16:05:06 -07:00
Matt Oliver c1cc9ebd58 project: Update for 1.14.1 merge. 2024-06-08 22:45:23 +10:00
Matt Oliver 68a5066df4 Merge commit '12f3a2ac603e8f10742105519e0cd03c3b8f71dd' 2024-06-08 20:02:59 +10:00
James Zern a2508b5711 configure: disable runtime cpu detect w/armv7*-darwin
The default behavior changed in:
148d1085f Refactor and extend run-time CPU feature detection on Arm

This fixes build errors with these targets as there is no runtime cpu
detection defined for them.

Change-Id: Ie6b0bae1fc3e244d7dfcc823f60c3e466ccade79
2024-06-07 10:59:16 -07:00
Wan-Teh Chang ec129c190a Document the internal maximum of rc_target_bitrate
Both VP8 and VP9 internally cap the target bitrate to the smaller of the
uncompressed bitrate and 1000000 kilobits per second.

Change-Id: I4008ce09b5e709e75111800341d015e41eb1da42
2024-06-05 17:13:48 -07:00
Wan-Teh Chang b401a1ff2e Remove unnecessary double cast for cpi->framerate
cpi->framerate is already of the double type.

Change-Id: Ia9211b699e25b1c603585a40370a1ed66e7cbf03
2024-06-05 22:57:48 +00:00
Marco Paniconi faf12bdb83 vp9: round for framerate and _min/max_gf_interval()
Fixes for the comments in:
https://chromium-review.googlesource.com/c/webm/libvpx/+/5598161

Change-Id: Ib7db69649c848098cd3f6e4a88233d333e84f628
2024-06-05 14:21:58 -07:00
James Zern 713e0faca0 vp9: round avg_frame_bandwidth result
in vp9_rc_update_framerate() and in functions in vp9_svc_layercontext.c.

This matches the code in VP8 and AV1 as discussed in
https://chromium-review.googlesource.com/c/webm/libvpx/+/5566050/2/vp8/encoder/onyx_if.c

Change-Id: I084f8002f8f6c8efffc511566910b3f3df47ba4e
2024-06-05 00:14:34 +00:00
Marco Paniconi 60807f0aba Use round for RC calculations in cyclic_refresh
Same as the fix in libaom:
https://aomedia-review.googlesource.com/c/aom/+/190881

Bug: aomedia:3579

Change-Id: Idb4026943a970189e6cd47a29e54e16623595e31
2024-06-04 14:50:58 -07:00
Angie Chiang 9d734db169 Rename gop_size by show_frame_count
Change-Id: Id95fbeaa0ceeb10c077bfd628f45fe880b42b3de
2024-06-03 07:35:49 -07:00
Wan-Teh Chang fd84dccd51 Fix high target data rate overflow.
This change fixes issues that can occur if the user specifies a very
high target data rate or rate per frame.

Fixes some issues with overflow of int variables used to hold bitrate
values (rate per second, rate per frame etc).

Note: This CL is a port of the following libaom CLs:
https://aomedia-review.googlesource.com/c/aom/+/190381
https://aomedia-review.googlesource.com/c/aom/+/190462

All the changes were ported to VP9. For VP8, only the new type of
cpi->bytes (equivalent to ppi->total_bytes in libaom) was ported.

Change-Id: I438dd46efd5a134389b893ffae1f8a2381207906
2024-05-31 16:00:25 -07:00
Jingning Han ffe9c9a457 Handle ARF and GF gop cases
Allow the inference scheme to cover GOPs with and without ARFs.

Change-Id: I68518791e96d7d5b92355c34360bbb74f2ecc436
2024-05-31 09:25:51 -07:00
Jingning Han ddf3c281e6 Remove a redundant condition in firstpass.c
Remove a redundant condition to trigger ext_rc gop structure
function call.

Change-Id: Ia3f135c67982b5539b9a2e8a74ba13edd9b5e46f
2024-05-30 18:56:47 +00:00
Jerome Jiang b5ba2274a0 Merge tag 'v1.14.1' into main-merge-1.14.1
2024-05-21 v1.14.1 "Venetian Duck"

  This release includes enhancements and bug fixes.

  - Upgrading:
    This release is ABI compatible with the previous release.

  - Enhancement:
    Improved the detection of compiler support for AArch64 extensions,
    particularly SVE.

    Added vpx_codec_get_global_headers() support for VP9.

  - Bug fixes:

    Added buffer bounds checks to vpx_writer and vpx_write_bit_buffer.
    Fix to GetSegmentationData() crash in aq_mode=0 for RTC rate control.
    Fix to alloc for row_base_thresh_freq_fac.
    Free row mt memory before freeing cpi->tile_data.
    Fix to buffer alloc for vp9_bitstream_worker_data.
    Fix to VP8 race issue for multi-thread with psnr_calc.
    Fix to uv width/height in vp9_scale_and_extend_frame_ssse3.
    Fix to integer division by zero and overflow in calc_pframe_target_size().
    Fix to integer overflow in vpx_img_alloc() & vpx_img_wrap()(CVE-2024-5197).
    Fix to UBSan error in vp9_rc_update_framerate().
    Fix to UBSan errors in vp8_new_framerate().
    Fix to integer overflow in vp8 encodeframe.c.
    Handle EINTR from sem_wait().

Change-Id: Ic5e274fdc35c9141591a65e825bf012d2cca3caa
2024-05-30 11:35:52 -04:00
Jerome Jiang 12f3a2ac60 Update CHANGELOG
Bug: webm:1854
Change-Id: I3242d7fd58838aa8c4103ae07a67deb9dcc7dd37
2024-05-29 16:00:23 -04:00
Jerome Jiang be3ea68f9e Update CHANGELOG for fixes to ubsan errors
Bug: webm:1854
Change-Id: I81050a6a69721062078e818ca3ce23994749f711
2024-05-29 12:10:08 -04:00
Wan-Teh Chang 1dbb3b28e8 Fix some UBSan errors in vp8_new_framerate()
Fix some UBSan errors in the calculations of cpi->av_per_frame_bandwidth
and cpi->min_frame_bandwidth in vp8_new_framerate() and in the
calculation of cpi->per_frame_bandwidth in encode_frame_to_data_rate().

A port of the VP9 changes in
https://chromium-review.googlesource.com/c/webm/libvpx/+/4944271 and
https://chromium-review.googlesource.com/c/webm/libvpx/+/5565157 to VP8.
Similar to the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/190462.

Bug: aomedia:3509
Change-Id: I77b0e0b2f9fe667428daa9c4ceec0a35aafbfa81
(cherry picked from commit 25540b3c12)
2024-05-28 17:08:30 +00:00
Wan-Teh Chang c60622ebac Fix a UBSan error in vp9_rc_update_framerate()
Fix a UBSan error in the calculation of rc->min_frame_bandwidth in
vp9_rc_update_framerate().

A follow-up to
https://chromium-review.googlesource.com/c/webm/libvpx/+/4944271.
Similar to the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/190462.

Bug: aomedia:3509
Change-Id: I36168a6d00cd81e60ae19a7d74c21f2e6c2f0caf
(cherry picked from commit 1f65facb63)
2024-05-24 18:13:42 +00:00
Wan-Teh Chang 25540b3c12 Fix some UBSan errors in vp8_new_framerate()
Fix some UBSan errors in the calculations of cpi->av_per_frame_bandwidth
and cpi->min_frame_bandwidth in vp8_new_framerate() and in the
calculation of cpi->per_frame_bandwidth in encode_frame_to_data_rate().

A port of the VP9 changes in
https://chromium-review.googlesource.com/c/webm/libvpx/+/4944271 and
https://chromium-review.googlesource.com/c/webm/libvpx/+/5565157 to VP8.
Similar to the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/190462.

Bug: aomedia:3509
Change-Id: I77b0e0b2f9fe667428daa9c4ceec0a35aafbfa81
2024-05-23 18:05:54 -07:00
Wan-Teh Chang 495c4b596c Add a #endif comment for CONFIG_VP9_HIGHBITDEPTH
Change-Id: Idc388e722e2579ce6935b52b1786038bdf2d5d47
2024-05-23 23:50:28 +00:00
James Zern f3e064e1d8 {aarch*,arm}_cpudetect: align define with comment
ANDROID_USE_CPU_FEATURES_LIB -> VPX_USE_ANDROID_CPU_FEATURES

Change-Id: I2d425cf3cd28219e570efb0c442b33f1a64447ae
2024-05-23 23:48:22 +00:00
Wan-Teh Chang 1f65facb63 Fix a UBSan error in vp9_rc_update_framerate()
Fix a UBSan error in the calculation of rc->min_frame_bandwidth in
vp9_rc_update_framerate().

A follow-up to
https://chromium-review.googlesource.com/c/webm/libvpx/+/4944271.
Similar to the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/190462.

Bug: aomedia:3509
Change-Id: I36168a6d00cd81e60ae19a7d74c21f2e6c2f0caf
2024-05-23 14:38:50 -07:00
Wan-Teh Chang db4d6a5f54 Fix a typo in the CpuSpeedTest.TestTuneScreen test
Change the second rc_2pass_vbr_minsection_pct to
rc_2pass_vbr_maxsection_pct.

This copy-and-paste error was introduced in
https://chromium-review.googlesource.com/c/webm/libvpx/+/332653.

Change-Id: If2c61cd2ce0a6808643b8e80a27f054f7339e0fd
2024-05-22 16:04:40 -07:00
Angie Chiang 6c079c8beb Account for gop_decision->use_alt_ref
The change is in ext_rc_define_gf_group_structure()

Bug: b/339314081

Change-Id: I03576a0407105ced3e7cff4c33986e9a9a83b77f
2024-05-22 10:46:39 -07:00
Jerome Jiang 18da85657c Update AUTHOR, version and CHANGELOG
Bug: webm:1854
Change-Id: I0801e9b685d395c7556e2269601f4c01ab310661
2024-05-22 11:35:53 -04:00
Wan-Teh Chang 61c4d556bd Fix a bug in alloc_size for high bit depths
I introduced this bug in commit 2e32276:
https://chromium-review.googlesource.com/c/webm/libvpx/+/5446333

I changed the line

  stride_in_bytes = (fmt & VPX_IMG_FMT_HIGHBITDEPTH) ? s * 2 : s;

to three lines:

  s = (fmt & VPX_IMG_FMT_HIGHBITDEPTH) ? s * 2 : s;
  if (s > INT_MAX) goto fail;
  stride_in_bytes = (int)s;

But I didn't realize that `s` is used later in the calculation of
alloc_size.

As a quick fix, undo the effect of s * 2 for high bit depths after `s`
has been assigned to stride_in_bytes.

Bug: chromium:332382766
Change-Id: I53fbf405555645ab1d7254d31aadabe4f426be8c
(cherry picked from commit 74c70af016)
2024-05-21 18:43:46 +00:00
Wan-Teh Chang 5193ce7167 Apply stride_align to byte count, not pixel count
A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/188962.

stride_align is documented to be the "alignment, in bytes, of each row
in the image (stride)."

Change-Id: I2184b50dc3607611f47719319fa5adb3adcef2fd
(cherry picked from commit 7d37ffacc6)
2024-05-21 18:43:07 +00:00
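The distinction drawn in the commit above can be illustrated with a minimal sketch, assuming a power-of-two alignment. The helper names (`align_up`, `stride_bytes_aligned`, `stride_pixels_aligned`) are hypothetical and are not libvpx functions; the actual change lives in img_alloc_helper().

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to a multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t value, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & ~(align - 1);
}

/* Hypothetical helper: convert the row width to bytes first, then align
 * the byte count. This matches the documented meaning of stride_align. */
static uint64_t stride_bytes_aligned(uint64_t width_px,
                                     uint64_t bytes_per_sample,
                                     uint64_t stride_align) {
  return align_up(width_px * bytes_per_sample, stride_align);
}

/* For contrast: align the pixel count first, then convert to bytes. */
static uint64_t stride_pixels_aligned(uint64_t width_px,
                                      uint64_t bytes_per_sample,
                                      uint64_t stride_align) {
  return align_up(width_px, stride_align) * bytes_per_sample;
}
```

With 5 samples of 2 bytes each and a 16-byte alignment, aligning the byte count gives a 16-byte stride, while aligning the pixel count gives 32 bytes. For 1-byte samples the two approaches agree.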
Wan-Teh Chang 5a83437ffc Avoid wasted calc of stride_in_bytes if !img_data
Change-Id: If1ddde5e894a06359f15486a2cee054a2f0cb1a2
(cherry picked from commit 8b2f8baee5)
2024-05-21 18:42:17 +00:00
Wan-Teh Chang 9d7054c0cb Avoid integer overflows in arithmetic operations
A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/188823.

Impose maximum values on the input parameters so that we can perform
arithmetic operations without worrying about overflows.

Also change the VpxImageTest.VpxImgAllocHugeWidth test to write to the
first and last samples in the first row of the Y plane, so that the test
will crash if there is unsigned integer overflow in the calculation of
stride_in_bytes.

Bug: chromium:332382766
Change-Id: I54cec6c9e26377abaa8a991042ba277ff70afdf3
(cherry picked from commit 06af417e79)
2024-05-21 18:30:51 +00:00
Wan-Teh Chang c5640e3300 Fix integer overflows in calc of stride_in_bytes
A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/188761.

Fix unsigned integer overflows in the calculation of stride_in_bytes in
img_alloc_helper() when d_w is huge.

Change the type of stride_in_bytes from unsigned int to int because it
will be assigned to img->stride[VPX_PLANE_Y], which is of the int type.

Test:
. ../libvpx/tools/set_analyzer_env.sh integer
../libvpx/configure --enable-debug --disable-optimizations
make -j
./test_libvpx --gtest_filter=VpxImageTest.VpxImgAllocHugeWidth

Bug: chromium:332382766
Change-Id: I3b39d78f61c7255e10cbf72ba2f4975425a05a82
(cherry picked from commit 2e32276277)
2024-05-21 18:30:29 +00:00
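A minimal sketch of the kind of overflow-safe stride calculation the commit above describes, assuming a power-of-two stride_align. `checked_stride_in_bytes` is a hypothetical helper written for illustration; it is not the code in img_alloc_helper().

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: computes the row stride in bytes for a plane that
 * is width_px samples wide. Returns 0 on success, or -1 if the result
 * would not fit in an int. */
static int checked_stride_in_bytes(unsigned int width_px, int high_bitdepth,
                                   unsigned int stride_align,
                                   int *stride_out) {
  uint64_t s;
  assert(stride_align != 0 && (stride_align & (stride_align - 1)) == 0);
  /* Do the arithmetic in 64 bits so that a huge width cannot wrap. */
  s = (uint64_t)width_px * (high_bitdepth ? 2u : 1u);
  s = (s + stride_align - 1) & ~((uint64_t)stride_align - 1);
  /* The stride is stored in an int, so reject anything larger. */
  if (s > INT_MAX) return -1;
  *stride_out = (int)s;
  return 0;
}
```

The range check happens before the narrowing cast, so the value stored through `stride_out` is always a valid int.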
Wan-Teh Chang f60da3e3ea Add test/vpx_image_test.cc
Ported from test/aom_image_test.cc in libaom commit 04d6253.

Change-Id: I56478d0a5603cfb5b65e644add0918387ff69a00
(cherry picked from commit 3dbab0e664)
2024-05-21 18:29:47 +00:00
Wan-Teh Chang b9c4f1951f Make img_alloc_helper() fail on VPX_IMG_FMT_NONE
If fmt is VPX_IMG_FMT_NONE, currently img_alloc_helper() allocates a
single plane because VPX_IMG_FMT_NONE (0) is not a planar format (the
VPX_IMG_FMT_PLANAR bit is not set in VPX_IMG_FMT_NONE).

Although this seems correct, the problem is that most of the code in
libvpx assumes planar formats and is likely to dereference a null
pointer when it uses img->planes[1]. Also, VPX_IMG_FMT_NONE isn't really
a valid image format. So it is safer to make img_alloc_helper() fail if
fmt is VPX_IMG_FMT_NONE.

Change-Id: I05b47f4b5eceb631a02384b2cce1c2f6fdca8673
(cherry picked from commit d3a946de8c)
2024-05-21 18:28:09 +00:00
Marco Paniconi 5b4cfe88e4 vp9-rtc: Fix integer overflow in key frame target size
The integer overflow happens
in vp9_calc_iframe_target_size_one_pass_cbr(), when
calculating the target size for L1T3 encoding.

The input target bitrate(kbps) is very large, so it gets set
to INT_MAX (before being multiplied by 1000 to convert to bps),
and avg_frame_bandwidth is then set to (INT_MAX / lc->framerate),
which when multiplied by (16 + kf_boost) can exceed INT_MAX.
Fix is to cast the operands to int64_t and final result to int.

Bug: chromium:340918567
Change-Id: Ic00094b22c1f12ca988c0cb1fcaed473e1f8ed2b
2024-05-16 11:51:47 -07:00
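The fix described above (widen the operands to int64_t, then narrow the final result) can be sketched as follows. `scaled_target` is a hypothetical helper; the division by 16 via a right shift and the clamp to INT_MAX are assumptions made for this illustration and are not taken from the commit message.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: scales a per-frame bandwidth by (16 + kf_boost) / 16
 * without overflowing int. */
static int scaled_target(int avg_frame_bandwidth, int kf_boost) {
  int64_t target;
  assert(avg_frame_bandwidth >= 0 && kf_boost >= 0);
  /* Widen before multiplying so the product is computed in 64 bits. */
  target = ((int64_t)(16 + kf_boost) * (int64_t)avg_frame_bandwidth) >> 4;
  /* Assumption for this sketch: saturate rather than wrap on narrowing. */
  if (target > INT_MAX) target = INT_MAX;
  return (int)target;
}
```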
Deepa K G ea0cd1a38d Fix error handling in vp9_pack_bitstream()
In a multi-threaded scenario, when the bitstream
buffer allocated is insufficient, the main thread
called 'longjmp' without waiting for the completion
of workers. In this patch, 'longjmp' is called by
the main thread after joining other worker threads.

This resolves the assertion failure as reported in
Bug: webm:1847

Bug: webm:1844

Change-Id: I399c76087b65e7b8d9a9fa4f12d784408243d648
(cherry picked from commit 611d9ba0a5)
2024-05-14 15:04:00 -04:00
Wan-Teh Chang 58955cf5f5 Perform bounds checks in vpx_write_bit_buffer
Add the `size` and `error` members to the vpx_write_bit_buffer struct.
Add the vpx_wb_init() and vpx_wb_has_error() functions.

Instances of the vpx_write_bit_buffer struct are only allocated in the
vp9_pack_bitstream() function. So vp9_pack_bitstream() is the only
function outside vpx_dsp/bitwriter_buffer.* that needs updating.

This CL completes the work of adding output buffer bounds checks to
vp9/encoder/vp9_bitstream.c.

Bug: webm:1844
Change-Id: I6b362be572852ee51d96023b35bfb334faada7e1
(cherry picked from commit d790001fd5)
2024-05-14 13:04:03 -04:00
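A minimal sketch of a bit writer that carries a `size` and an `error` member, in the spirit of the commit above. The `demo_` names and the exact layout are hypothetical; this is not the vpx_write_bit_buffer implementation in vpx_dsp/bitwriter_buffer.*.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a write bit buffer with bounds checking. */
struct demo_write_bit_buffer {
  uint8_t *buf;      /* output buffer */
  size_t size;       /* capacity of buf in bytes */
  size_t bit_offset; /* next bit position to write */
  int error;         /* set once a write would go past the end */
};

static void demo_wb_init(struct demo_write_bit_buffer *wb, uint8_t *buf,
                         size_t size) {
  assert(wb != NULL && buf != NULL);
  wb->buf = buf;
  wb->size = size;
  wb->bit_offset = 0;
  wb->error = 0;
}

static int demo_wb_has_error(const struct demo_write_bit_buffer *wb) {
  return wb->error;
}

/* Writes one bit, most significant bit first. Records an error instead of
 * writing out of bounds; once set, the error is sticky. */
static void demo_wb_write_bit(struct demo_write_bit_buffer *wb, int bit) {
  const size_t byte = wb->bit_offset / 8;
  const int shift = 7 - (int)(wb->bit_offset % 8);
  if (wb->error) return;
  if (byte >= wb->size) {
    wb->error = 1;
    return;
  }
  if (shift == 7) wb->buf[byte] = 0; /* first bit of a fresh byte */
  wb->buf[byte] |= (uint8_t)((bit & 1) << shift);
  wb->bit_offset++;
}
```

A sticky error flag lets the caller write a whole header unconditionally and check for overflow once at the end, instead of testing the result of every single write.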
Wan-Teh Chang 3bfd83a70c Perform bounds checks in vpx_writer
In the vpx_writer struct, change the buffer_end field to the size field.
Change vpx_stop_encode() to return true on success, false on failure
(output buffer full).

In write_compressed_header(), remove the assertion
assert(header_bc.pos <= 0xffff). The caller (vp9_pack_bitstream()) will
check that condition.

In vp9_pack_bitstream(), the variable "first_part_size" is renamed
"compressed_hdr_size".

Bug: webm:1844
Change-Id: I4ed6ab905a707ad44d875e53036d5a42523a65d0
(cherry picked from commit 73703c188b)
2024-05-14 13:04:03 -04:00
James Zern 34d3114348 vp9_pack_bitstream: remove a dead store
Fixes a static analysis warning:
Value stored to 'data_size' is never read

Bug: webm:1844
Change-Id: Ia27181b1051bb2c3a6bc4a4c2549df8b0525e889
(cherry picked from commit 9f73377821)
2024-05-14 13:04:03 -04:00
Wan-Teh Chang b1cb83ca01 Add the buffer_end field to the vpx_writer struct
The buffer_end field will allow bounds checking when vpx_writer writes
to the output buffer. This CL sets up the plumbing to pass the output
buffer size from vp9_pack_bitstream() to vpx_start_encode(), which
initializes the vpx_writer struct. vpx_writer doesn't use the output
buffer size in bounds checks yet, but the code in vp9_bitstream.c does.

Bug: webm:1844
Change-Id: I995e469ab453c02d740f54b46e0b08c7f2eb1a2e
(cherry picked from commit e387187438)
2024-05-14 10:20:16 -04:00
Wan-Teh Chang ac433759d1 Pass output buffer size to vp9_pack_bitstream()
Set up the plumbing to pass the size of the output buffer `dest` to
vp9_pack_bitstream(). The output buffer is the cx_data buffer in the
encoder_encode() function in vp9/vp9_cx_iface.c, and its size is
cx_data_sz.

In this CL vp9_pack_bitstream() ignores the `dest_size` parameter.

Bug: webm:1844
Change-Id: I53c80280143d409cf16f87c4d6deec3d9338aea3
(cherry picked from commit d48577579b)
2024-05-14 10:19:55 -04:00
Deepa K G 611d9ba0a5 Fix error handling in vp9_pack_bitstream()
In a multi-threaded scenario, when the bitstream
buffer allocated is insufficient, the main thread
called 'longjmp' without waiting for the completion
of workers. In this patch, 'longjmp' is called by
the main thread after joining other worker threads.

This resolves the assertion failure as reported in
Bug: webm:1847

Bug: webm:1844

Change-Id: I399c76087b65e7b8d9a9fa4f12d784408243d648
2024-05-14 01:14:15 +05:30
Wan-Teh Chang b1cf64c40b vpx_decoder.h: Change "size member" to "sz member"
That member of vpx_codec_stream_info_t is named "sz", not "size".

Change-Id: I6cc878709d9dae37b9911cf746ba248a06ec1b1a
2024-05-13 17:03:10 +00:00
Wan-Teh Chang 498097b15b vpx_dec_fuzzer.cc: Initialize stream_info.sz
stream_info.sz should be initialized to sizeof(stream_info).

Bug: oss-fuzz:68912
Change-Id: I0cc0fcdfc93b7188a834ee1896f0bb4cf8c32fa9
2024-05-13 16:59:39 +00:00
Angie Chiang 5913401ebb Add vp9_ratectrl.h header to vp9_firstpass.c
KF_STD/GF_ARF_STD are used in vp9_firstpass.c
and defined in vp9_ratectrl.h

Change-Id: I5a6e42faa23e5f50630926e336daef37055fd195
2024-05-13 01:07:24 +00:00
Wan-Teh Chang db25581967 Assert a vpx_img_set_rect call always succeeds
The vpx_img_set_rect() call at the end of img_alloc_helper() always
succeeds, so assert its return value is equal to 0.

A port of the changes to aom/src/aom_image.c in the libaom CLs
https://aomedia-review.googlesource.com/c/aom/+/90307 and
https://aomedia-review.googlesource.com/c/aom/+/190011.

Bug: webm:1850
Change-Id: I559820db245a596b4aed2042bfa7ebe7dd2d69b7
2024-05-10 23:58:23 +00:00
James Zern 1a3cd4922b vpx_dec_fuzzer: add vpx_codec_peek_stream_info coverage
Change-Id: I511539292cb8c2098c81f5fe3d711b9739482ffa
2024-05-09 17:19:33 -07:00
Jerome Jiang e934e35515 vp9 rc: also run tpl for GOPs without ARF
Tested with ffmpeg integration end to end test.

Bug: b/338393251
Change-Id: I4048036d35f8ab64c07305b838d091f765f64a8d
2024-05-09 15:19:35 -04:00
Hirokazu Honda bff1fe63ea vp9 rc: Fix GetSegmentationData() crash in aq_mode=0
cpi_->cyclic_refresh is nullptr if aq_mode is 0, in other words, the
rate controller runs in non adaptive quantization mode. This CL fixes
the crash in GetSegmentationData() in non aq mode.

Bug: b/259487065
Test: video encoding on ChromeOS

Change-Id: I503b30d15c697c8dd1da203b3c7361b91c428e87
(cherry picked from commit 1d007eafa3)
2024-05-08 19:07:00 -04:00
Marco Paniconi 08a79efb18 vp9: Fix to alloc for row_base_thresh_freq_fac
Issue happens for real-time nonrd pickmode.
Due to speed feature: sf->adaptive_rd_thresh_row_mt,
enabled for speed >= 8, and for speed >= 7 svc only.

Issue occurs where resolution (sb_rows) changes and
row_base_thresh_freq_fac needs to be re-allocated.

Fix is to add sb_rows to TileDataEnc and check for
re-alloc of row_base_thresh_freq_fac.

Bug: b:331108922
Change-Id: I1a1ca94c14f343200c180725e4cb8d91d3c55b83
(cherry picked from commit 3f8f19372b)
2024-05-08 19:07:00 -04:00
Wan-Teh Chang 669654dda0 Free row mt memory before freeing cpi->tile_data
In vp9_init_tile_data(), call vp9_row_mt_mem_dealloc(cpi) to free the
row mt memory in cpi->tile_data before freeing cpi->tile_data.

Bug: b:331086799, b:331108729
Change-Id: Idc79984ce7e0110e6858139b2ed286492a2e8622
(cherry picked from commit 34277e53ad)
2024-05-08 19:07:00 -04:00
James Zern 41fd571e7a encode_api_test.cc: assert encoder is initialized
Before proceeding with Encode(). This avoids some static analysis
warnings about uninitialized `cfg_` members.

Change-Id: Ib67b278d6706ab1034219e8c1ad9ba0c5b574ba8
(cherry picked from commit 108f5128e2)
2024-05-08 19:07:00 -04:00
James Zern 8433fe6393 vpx_ext_ratectrl.h,cosmetics: Correspondent -> Corresponds
+ add some doxygen autolinks

Change-Id: Ifceb7d9e89d31037d0b690b1d661cebcd6fa67b8
2024-05-08 14:07:50 -07:00
Wan-Teh Chang 1e5823f682 Handle EINTR from sem_wait()
sem_wait() may be interrupted by a signal and fail with EINTR:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/sem_wait.html

Retry the sem_wait() call if it fails with EINTR.

This finishes the fix started in
https://chromium-review.googlesource.com/c/webm/libvpx/+/5299569. As a
speculative fix, that CL fixed only the sem_wait(&cpi->h_event_end_lpf)
calls responsible for bug chromium:324459561. ClusterFuzz verified the
fix, so this CL extends it to the other sem_wait() calls.

Note that sem_wait() calls like the following do not need this fix,
because the while (1) loop retries the sem_wait() call if it fails:

  while (1) {
    if (vpx_atomic_load_acquire(&cpi->b_multi_threaded) == 0) break;

    if (sem_wait(&cpi->h_event_start_lpf) == 0) {
      ...
    }
  }

Bug: chromium:324459561
Change-Id: I0f0612616eee37fb3da68049e49b3e86927b5e24
(cherry picked from commit d4959f9825)
2024-05-07 18:29:13 +00:00
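The retry described above can be sketched as a small wrapper around the POSIX sem_wait() call. `sem_wait_retry` is a hypothetical name chosen for this illustration; how libvpx structures its own retry is not shown in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical wrapper: waits on sem, retrying if a signal interrupts the
 * call. Returns 0 on success, or -1 with errno set for any other error. */
static int sem_wait_retry(sem_t *sem) {
  int ret;
  assert(sem != NULL);
  do {
    ret = sem_wait(sem);
  } while (ret == -1 && errno == EINTR);
  return ret;
}
```

Errors other than EINTR are still returned to the caller, so a genuinely invalid semaphore does not turn into an infinite loop.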
James Zern 108f5128e2 encode_api_test.cc: assert encoder is initialized
Before proceeding with Encode(). This avoids some static analysis
warnings about uninitialized `cfg_` members.

Change-Id: Ib67b278d6706ab1034219e8c1ad9ba0c5b574ba8
2024-05-03 22:00:56 +00:00
Yunqing Wang 314ee14b64 Fix a rare memory overflow bug
In very rare cases (e.g. encoding with very high bit rate), the
allocated token memory isn't enough, which causes a buffer overflow
and then an encoder failure. This is fixed by using the aligned
number of blocks while allocating this buffer.

BUG=b/328803779

Change-Id: I5437cce13398206bf9982d57f35d6f9da17b187f
2024-05-03 21:01:23 +00:00
James Zern 9f73377821 vp9_pack_bitstream: remove a dead store
Fixes a static analysis warning:
Value stored to 'data_size' is never read

Bug: webm:1844
Change-Id: Ia27181b1051bb2c3a6bc4a4c2549df8b0525e889
2024-05-03 20:59:31 +00:00
James Zern a0c4e53665 configure: Do more elaborate test of whether SVE can be compiled
This is a port of the change in libaom:
https://aomedia-review.googlesource.com/c/aom/+/189761
5ccdc66ab6 cpu.cmake: Do more elaborate test of whether SVE can be compiled

For Windows targets, Clang will successfully compile simpler
SVE functions, but if the function requires backing up and restoring
SVE registers (as part of the AAPCS calling convention), Clang
will fail to generate unwind data for this function, resulting
in an error.

This issue is tracked upstream in Clang in
https://github.com/llvm/llvm-project/issues/80009.

Check whether the compiler can compile such a function, and
disable SVE if it is unable to handle that case.

Change-Id: I8550248abd6a7876bd8ecf6ba66bc70518133566
(cherry picked from commit 35f0262c5e)
2024-05-03 17:46:53 +00:00
James Zern e44918bd4e VP9: add vpx_codec_get_global_headers() support
This returns the contents of CodecPrivate described in:
https://www.webmproject.org/docs/container/#vp9-codec-feature-metadata-codecprivate

The value for 4:2:0 is 1 (colocated) to match the default given for the
codec parameter string:
https://www.webmproject.org/vp9/mp4/#codecs-parameter-string

Bug: b:332052663
Change-Id: Ie50dd8d76e2d7389ac01bf4dbec801f9c8ea0e21
(cherry picked from commit 63b9c2c0e2)
2024-05-02 15:26:47 -07:00
James Zern 35f0262c5e configure: Do more elaborate test of whether SVE can be compiled
This is a port of the change in libaom:
https://aomedia-review.googlesource.com/c/aom/+/189761
5ccdc66ab6 cpu.cmake: Do more elaborate test of whether SVE can be compiled

For Windows targets, Clang will successfully compile simpler
SVE functions, but if the function requires backing up and restoring
SVE registers (as part of the AAPCS calling convention), Clang
will fail to generate unwind data for this function, resulting
in an error.

This issue is tracked upstream in Clang in
https://github.com/llvm/llvm-project/issues/80009.

Check whether the compiler can compile such a function, and
disable SVE if it is unable to handle that case.

Change-Id: I8550248abd6a7876bd8ecf6ba66bc70518133566
2024-05-02 15:21:07 -07:00
James Zern 3e713e39ae vp9_ethread_test: move 'best' mode to a Large test
This mode is used infrequently and is quite slow. This shifts the tests
to nightly to speed up the presubmit.

Change-Id: I3020887e0ca0150d7cbea9cc726649c11f94d56c
2024-05-02 22:20:00 +00:00
Angie Chiang 6db3f6e576 Add several utility functions to set gf_group
Use the utility functions and set gf_group_size in
ext_rc_define_gf_group_structure()

Avoid using gop_decision->update_type to keep the logic simple
for now.

Also simplify the interface.

Change-Id: I78fd5892e6f9731d50d6e5da97598b46c70a1dde
2024-05-02 21:41:30 +00:00
Wan-Teh Chang f65aff7b99 Remove vpx_ports/msvc.h
The vpx_ports/msvc.h header provides snprintf() and round() for MSVC
older than Visual Studio 2015 and Visual Studio 2013, respectively.

Since configure now requires vs14 (Visual Studio 2015) or later, it is
safe to remove vpx_ports/msvc.h.

Change-Id: I2fe4c41eaa126f4cf17639c11895f1e464294c76
2024-05-02 20:01:11 +00:00
James Zern 8372a5cfe1 vpx_ext_ratectrl.h: make rate_ctrl_log_path const
Change-Id: I499d77b25ca3dcdbd3c72fb319f9023e9a2823b0
2024-05-02 09:59:34 -07:00
Jerome Jiang 847b3548b4 Better format comments for vpx_ext_ratectrl.h
For vpx_rc_type_t: comment for each enum is moved to where it is
defined.

Change-Id: Ic1e2097ed381e7d71746792e0d517106db882685
2024-05-02 10:01:00 -04:00
Jerome Jiang 1c77f7fc0e Fix comments in vpx_ext_ratectrl.h
Added file level descriptor

Added comments for vpx_rc_ref_frame_t

Change-Id: Ifb000650821eab719b6e0fd003a00027ea132b9f
2024-05-02 10:01:00 -04:00
Wan-Teh Chang c0db981eaa Include <stdio.h> or <cstdio> for *printf()
Change-Id: Ifc0537fe5ae1223418fb68da5583cc72ae2c32a8
2024-05-02 02:26:16 +00:00
James Zern e9be4f607b encode_api_test.cc: apply iwyu
add missing <cstdio> and <cstdlib> and delete some unused headers.

Change-Id: I6c66368f557e6df896bffb2aa90228811f14f027
2024-05-02 02:25:43 +00:00
James Zern 7a0089dc08 vpx_ext_ratectrl.h: fix doxygen comments
fixes a few warnings about undocumented members update_type,
update_ref_index and ref_frame_list.

Change-Id: I668c61f6a511ba9e6c0907f6dafb0be614678e60
2024-05-01 13:24:59 -07:00
Angie Chiang f93e6aa333 Print gop_index in ENCODE_FRAME_RESULT
Change-Id: Icb522110dd2a7f87212ec0e7fc2638245008365f
2024-05-01 18:26:02 +00:00
James Zern b61b272208 vp9_rdopt.c: make init_frame_mv static
fixes a -Wmissing-prototypes warning

Change-Id: Ie380f9e4211ffab461f15dfe84184b8769d4f7bd
2024-04-26 12:54:08 -07:00
James Zern 63b9c2c0e2 VP9: add vpx_codec_get_global_headers() support
This returns the contents of CodecPrivate described in:
https://www.webmproject.org/docs/container/#vp9-codec-feature-metadata-codecprivate

The value for 4:2:0 is 1 (colocated) to match the default given for the
codec parameter string:
https://www.webmproject.org/vp9/mp4/#codecs-parameter-string

Bug: b:332052663
Change-Id: Ie50dd8d76e2d7389ac01bf4dbec801f9c8ea0e21
2024-04-25 15:20:18 -07:00
Angie Chiang 3015c41f06 Add VPX_RC_NONE
Change-Id: I8ca4caa7ffc4e9f8590ad8d02de0348b88c45254
2024-04-19 22:38:33 +00:00
James Zern 6f5839f986 vp9_encoder.c: fix printf format string
Replace %ld with %zu for `size_t`. Added in:
fd28f6f3c Add rate_ctrl_log_path

Fixes:
vp9\encoder\vp9_encoder.c(5748,15): warning C4477: 'fprintf' : format
  string '%ld' requires an argument of type 'long', but variadic
  argument 2 has type 'size_t'

Change-Id: I36fa9c7a9e14d4a2d9ef51a7f5c55de71bb34518
2024-04-19 10:55:20 -07:00
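A minimal sketch of the format-string fix above. `format_size` and the "size=" prefix are hypothetical and are not taken from vp9_encoder.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats a byte count into out. Returns what
 * snprintf returns: the number of characters that would be written,
 * excluding the terminating null. */
static int format_size(char *out, size_t out_size, size_t n) {
  assert(out != NULL && out_size > 0);
  /* %zu is the C99 conversion for size_t. %ld expects a long, which is
   * 32 bits wide with MSVC on 64-bit Windows while size_t is 64 bits. */
  return snprintf(out, out_size, "size=%zu", n);
}
```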
James Zern 2b88a07bc9 vpx_image_test.cc: add missing stdint include
fixes clang-tidy warning:
no header providing "uint16_t" is directly included

Change-Id: Ic71045ce6f88659ecd22243d473a3b6dc8c827dd
2024-04-18 12:50:01 -07:00
Angie Chiang fd28f6f3cc Add rate_ctrl_log_path
Change-Id: I4dc25c9ce4103cf3de44cff4d63e8ff8c82f35c0
2024-04-17 19:43:37 -07:00
Jerome Jiang 85dafa9c61 Initialize frame_mv in rd pick inter
Bug: b/334626386
Change-Id: Ie480a08f09c1b212b4163a5f6eb191c35510236f
2024-04-16 14:51:27 -04:00
Wan-Teh Chang 976134c50d Add 10 and 12b ranges to vpx_color_range_t comment
Add note about undefined behavior in vpx_codec_encode() description.

A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/158001
by Yannis Guyon <yguyon@google.com>.

Bug: webm:1850
Change-Id: Ia90f0bfd8265e35e9f33c17400c1c065d7915b77
2024-04-13 03:20:57 +00:00
Wan-Teh Chang 89efe85cd4 Clarify comment about buf_align in vpx_img_wrap.
If img_data is not NULL, img_alloc_helper ignores buf_align, so
vpx_img_wrap can set buf_align to any placeholder value.

A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/90362.

Bug: webm:1850
Change-Id: I42bc45aecf822a9314caf23058fe123d0574dc20
2024-04-13 03:19:57 +00:00
Wan-Teh Chang 3f4055b05b Introduce local vars uv_x,uv_y in vpx_img_set_rect
Port the changes to aom/src/aom_image.c in the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/56643. The changes
related to `border` are not ported.

Bug: webm:1850
Change-Id: Ie81fffe0c84e912da880ffca245ae27cd71cf348
2024-04-13 03:19:00 +00:00
Wan-Teh Chang 74c70af016 Fix a bug in alloc_size for high bit depths
I introduced this bug in commit 2e32276:
https://chromium-review.googlesource.com/c/webm/libvpx/+/5446333

I changed the line

  stride_in_bytes = (fmt & VPX_IMG_FMT_HIGHBITDEPTH) ? s * 2 : s;

to three lines:

  s = (fmt & VPX_IMG_FMT_HIGHBITDEPTH) ? s * 2 : s;
  if (s > INT_MAX) goto fail;
  stride_in_bytes = (int)s;

But I didn't realize that `s` is used later in the calculation of
alloc_size.

As a quick fix, undo the effect of s * 2 for high bit depths after `s`
has been assigned to stride_in_bytes.

Bug: chromium:332382766
Change-Id: I53fbf405555645ab1d7254d31aadabe4f426be8c
2024-04-12 15:48:04 -07:00
Wan-Teh Chang 7d37ffacc6 Apply stride_align to byte count, not pixel count
A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/188962.

stride_align is documented to be the "alignment, in bytes, of each row
in the image (stride)."

Change-Id: I2184b50dc3607611f47719319fa5adb3adcef2fd
2024-04-11 16:46:13 -07:00
Wan-Teh Chang 8b2f8baee5 Avoid wasted calc of stride_in_bytes if !img_data
Change-Id: If1ddde5e894a06359f15486a2cee054a2f0cb1a2
2024-04-11 15:59:44 -07:00
Wan-Teh Chang 06af417e79 Avoid integer overflows in arithmetic operations
A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/188823.

Impose maximum values on the input parameters so that we can perform
arithmetic operations without worrying about overflows.

Also change the VpxImageTest.VpxImgAllocHugeWidth test to write to the
first and last samples in the first row of the Y plane, so that the test
will crash if there is unsigned integer overflow in the calculation of
stride_in_bytes.

Bug: chromium:332382766
Change-Id: I54cec6c9e26377abaa8a991042ba277ff70afdf3
2024-04-11 10:29:38 -07:00
Wan-Teh Chang 2e32276277 Fix integer overflows in calc of stride_in_bytes
A port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/188761.

Fix unsigned integer overflows in the calculation of stride_in_bytes in
img_alloc_helper() when d_w is huge.

Change the type of stride_in_bytes from unsigned int to int because it
will be assigned to img->stride[VPX_PLANE_Y], which is of the int type.

Test:
. ../libvpx/tools/set_analyzer_env.sh integer
../libvpx/configure --enable-debug --disable-optimizations
make -j
./test_libvpx --gtest_filter=VpxImageTest.VpxImgAllocHugeWidth

Bug: chromium:332382766
Change-Id: I3b39d78f61c7255e10cbf72ba2f4975425a05a82
2024-04-10 20:47:45 -07:00
Wan-Teh Chang 3dbab0e664 Add test/vpx_image_test.cc
Ported from test/aom_image_test.cc in libaom commit 04d6253.

Change-Id: I56478d0a5603cfb5b65e644add0918387ff69a00
2024-04-10 18:15:00 -07:00
Matt Oliver f4d13145a2 project: Update for 1.14.0 merge. 2024-04-06 22:50:00 +11:00
Matt Oliver 9e260c493d Merge commit '602e2e8979d111b02c959470da5322797dd96a19' 2024-04-06 22:13:57 +11:00
Wan-Teh Chang 8762f5efb2 Define the MAX_NUM_THREADS macro in vp9_ethread.h
The MAX_NUM_THREADS macro is unrelated to the VPxWorkerInterface, so it
doesn't need to be defined in vpx_util/vpx_thread.h.

The VP8 code doesn't seem to depend on MAX_NUM_THREADS, so VP8 can use
64 directly in the range check of its g_threads option. Move the
definition of the MAX_NUM_THREADS macro to vp9/encoder/vp9_ethread.h and
use it in VP9 code only.

Change-Id: Ibf788ca2496c743a2ac0498fefaab8a3c181228d
2024-04-04 20:27:32 +00:00
Chun-Min Chang 0752960c6a Add missing header for EBUSY on mingw
The `error: use of undeclared identifier 'EBUSY'` in
vpx_util/vpx_pthread.h was found in Mozilla's bug 1886318 [1]. This
patch addresses the issue by adding the `<errno.h>` header to introduce
the `EBUSY` identifier, resolving the problem.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1886318#c1

Change-Id: Ic417dafebf5ab160060dd29f692fa9c40d8db05a
2024-04-04 10:10:37 -07:00
Wan-Teh Chang 6445da1b40 Fix GCC -Wmissing-braces warnings
warning: missing braces around initializer [-Wmissing-braces]

Bug: webm:1846
Change-Id: I007a68d09f48d4199ecd948136e69f9cf5f219f5
2024-04-04 00:03:35 +00:00
Casey Smalley 2bafeadd3e Add missing configuration includes
The Google C++ style guide dictates that you should "include what you
use" with respect to symbols. This CL adds vpx_config.h includes to unit
tests that rely on config flags but previously got the header only indirectly.

Change-Id: Ia70a512cebe6c104d2d64afbed3cde8a405c68df
2024-04-03 20:20:44 +00:00
Casey Smalley 588beb020b Unit test config changes for Chromium
This CL will help run libvpx tests under Chromium against its partition
allocator. The allocator does not support single allocations above
3.998GiB. Because of this, tests related to large video sizes that
Chromium is configured for are expected to fail.

Chromium also only supports the CONFIG_REALTIME_ONLY option, so
some changes are scoped behind this flag.

Change-Id: I80e8743c0619ce502688109ce0be01cb252d5f92
2024-04-03 20:20:44 +00:00
Wan-Teh Chang 05a4c855be Compare ctx->pending_cx_data with NULL
ctx->pending_cx_data is a pointer. It looks nicer to compare
ctx->pending_cx_data with NULL than with 0.

Change-Id: I18815907b3d75551abfc603cb3c5c0297dceed23
2024-04-03 02:53:45 +00:00
Hirokazu Honda 1d007eafa3 vp9 rc: Fix GetSegmentationData() crash in aq_mode=0
cpi_->cyclic_refresh is nullptr if aq_mode is 0, in other words, the
rate controller runs in non adaptive quantization mode. This CL fixes
the crash in GetSegmentationData() in non aq mode.

Bug: b/259487065
Test: video encoding on ChromeOS

Change-Id: I503b30d15c697c8dd1da203b3c7361b91c428e87
2024-04-02 23:00:50 +00:00
Wan-Teh Chang 976cedd643 Set priv->cx_data_sz to 0 if cx_data alloc fails
Change-Id: I6553cd7b09270b4d60ccd7199d499e03c22b3936
2024-04-02 22:53:17 +00:00
Wan-Teh Chang 419f36e8ed encoder_encode: Assert pending_cx_data_sz is valid
In encoder_encode(), assert ctx->pending_cx_data_sz is not too big
before the memmove() call.

Change-Id: Icd1e95f6d751b0bf67386d0d99218b256bc91ebd
2024-04-02 21:52:03 +00:00
Wan-Teh Chang dc74cf923b Don't use VPX_CODEC_CORRUPT_FRAME in set_frame_size
VPX_CODEC_CORRUPT_FRAME is a decoder error. It is strange for
vpx_codec_encode() to fail with this error. In set_frame_size(), change
VPX_CODEC_CORRUPT_FRAME to VPX_CODEC_ERROR.

The use of VPX_CODEC_CORRUPT_FRAME was originally added in
commit 1ed56a46b3.

Change-Id: Iee92ed4cfca5061289b278ece2ba475cf98fec06
2024-04-02 17:56:48 +00:00
Gerda Zsejke More bf932674a8 Add SVE2 implementation of vpx_highbd_convolve8_avg
Add SVE2 implementation of vpx_highbd_convolve8_avg function and the
corresponding tests as well.

Change-Id: I2ff707da55d11b1d5376eb0a7ec85c343a2709c2
2024-04-02 13:32:51 +02:00
Gerda Zsejke More 9274c2bbf0 Merge horiz. and vert. passes in HBD SVE2 2D 4tap convolution
The current SVE2 approach to 2D convolution is:
1) Filter horizontally, storing to an intermediate buffer.
2) Filter vertically and store the final output.

This patch merges the two phases for high bitdepth 2D convolution for
filter sizes smaller than or equal to 4 to avoid storing to and
re-loading from the intermediate buffer.

This approach is not beneficial when applying an 8tap filter in the
convolution.

Change-Id: Ie090eb79f1cbf182300d9343ae63069396ef3956
2024-04-02 13:29:48 +02:00
Jingning Han 43d12d5079 Update yv12_mb initialization
BUG=webm:1846

Change-Id: If8475d46397f04ef769f3e4647de5c2d4b6760a4
2024-03-30 00:02:02 +00:00
Jerome Jiang 5396643be6 Add invalid value to gop decision enums
These invalid value definitions are necessary to initialize
the gop decision in external RC so libvpx can tell which is populated
and which is not

Bug: b/329483680
Change-Id: I06bbb41fa59d0fb95296aebd0d05a703ec953b81
2024-03-29 21:20:42 +00:00
Wan-Teh Chang 5f5dfb3303 Assert the return value of read_tx_mode() is < 5
Coverity somehow thinks the return value of read_tx_mode() is between 0
and 7 (inclusive).

Hopefully this will fix Coverity CID 1584457: Out-of-bounds access in
read_coef_probs().

Change-Id: I49fbddf6fd6861bc9def9dfa91eaaaa4aefe5710
2024-03-29 11:22:41 -07:00
Jingning Han ccefddef33 Initialize yv12_mb array
This array will be partially configured and used in later rate
distortion optimization search.

BUG=webm:1846

Change-Id: I83daba341c56767187031edb1c10d4528a4257a3
2024-03-29 09:10:23 -07:00
Wan-Teh Chang d790001fd5 Perform bounds checks in vpx_write_bit_buffer
Add the `size` and `error` members to the vpx_write_bit_buffer struct.
Add the vpx_wb_init() and vpx_wb_has_error() functions.

Instances of the vpx_write_bit_buffer struct are only allocated in the
vp9_pack_bitstream() function. So vp9_pack_bitstream() is the only
function outside vpx_dsp/bitwriter_buffer.* that needs updating.

This CL completes the work of adding output buffer bounds checks to
vp9/encoder/vp9_bitstream.c.

Bug: webm:1844
Change-Id: I6b362be572852ee51d96023b35bfb334faada7e1
2024-03-28 14:05:47 -07:00
Jerome Jiang d5501945fc vp9 rc: override GF_GROUP decisions using ext RC
Bug: b/329483680
Change-Id: I2e02673f1bca56bfa24545b4e25d5e3fd3b0e863
2024-03-28 17:33:54 +00:00
Marco Paniconi 3f8f19372b vp9: Fix to alloc for row_base_thresh_freq_fac
Issue happens for real-time nonrd pickmode.
Due to speed feature: sf->adaptive_rd_thresh_row_mt,
enabled for speed >= 8, and for speed >= 7 svc only.

Issue occurs where resolution (sb_rows) changes and
row_base_thresh_freq_fac needs to be re-allocated.

Fix is to add sb_rows to TileDataEnc and check for
re-alloc of row_base_thresh_freq_fac.

Bug: b:331108922
Change-Id: I1a1ca94c14f343200c180725e4cb8d91d3c55b83
2024-03-28 16:47:45 +00:00
Wan-Teh Chang 73703c188b Perform bounds checks in vpx_writer
In the vpx_writer struct, change the buffer_end field to the size field.
Change vpx_stop_encode() to return true on success, false on failure
(output buffer full).

In write_compressed_header(), remove the assertion
assert(header_bc.pos <= 0xffff). The caller (vp9_pack_bitstream()) will
check that condition.

In vp9_pack_bitstream(), the variable "first_part_size" is renamed
"compressed_hdr_size".

Bug: webm:1844
Change-Id: I4ed6ab905a707ad44d875e53036d5a42523a65d0
2024-03-27 09:07:03 -07:00
Wan-Teh Chang 5bea4606dd Fix a typo in comment: "it" -> "is"
Change-Id: I5d36c5198c67cbb2f424901ec045d0620fea2f04
2024-03-27 14:48:55 +00:00
Wan-Teh Chang 34277e53ad Free row mt memory before freeing cpi->tile_data
In vp9_init_tile_data(), call vp9_row_mt_mem_dealloc(cpi) to free the
row mt memory in cpi->tile_data before freeing cpi->tile_data.

Bug: b:331086799, b:331108729
Change-Id: Idc79984ce7e0110e6858139b2ed286492a2e8622
2024-03-26 23:38:09 +00:00
Marco Paniconi c84d23c6cd vp9: fix to integer overflow test
failure for the 16k test: issue introduced
in: c29e637283

Bug: b/329088759, b/329674887, b/329179808

Change-Id: I88e8a36b7f13223997c3006c84aec9cfa48c0bcf
(cherry picked from commit 19832b1702)
2024-03-26 19:22:17 +00:00
Marco Paniconi b6847dcf72 Fix to buffer alloc for vp9_bitstream_worker_data
The code was using bitstream_worker_data when the buffer had not
been allocated with a large enough size. This is because
the existing condition was to only re-alloc the
bitstream_worker_data when current dest_size was larger
than the current frame_size. But under resolution change
where frame_size is increased, beyond the current dest_size,
we need to allow re-alloc to the new size.

The existing condition to re-alloc when dest_size is
larger than frame_size (which is not required) is kept
for now.

Also increase the dest_size to account for image format.

Added tests, for both ROW_MT=0 and 1, that reproduce
the failures in the bugs below.

Note: this issue only affects the REALTIME encoding path.

Bug: b/329088759, b/329674887, b/329179808

Change-Id: Icd65dbc5317120304d803f648d4bd9405710db6f
(cherry picked from commit c29e637283)
2024-03-26 19:21:31 +00:00
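Note on b6847dcf72: the core of the fix, re-allocating when the required size grows beyond the current dest_size, can be sketched as below. This is a simplified illustration under assumptions: `worker_buf` and `ensure_capacity()` are hypothetical names, and the sketch only grows the buffer (it does not reproduce the extra re-alloc condition that the commit says is kept for now).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-worker output buffer (illustration only). */
struct worker_buf {
  uint8_t *dest;
  size_t dest_size; /* bytes currently allocated for dest */
};

/* Makes sure dest can hold at least `needed` bytes.
 * Returns 0 on success, -1 if the allocation fails. */
static int ensure_capacity(struct worker_buf *wb, size_t needed) {
  if (wb->dest != NULL && wb->dest_size >= needed) return 0;
  /* Old contents are not needed, so free and allocate afresh. */
  free(wb->dest);
  wb->dest = malloc(needed);
  wb->dest_size = (wb->dest != NULL) ? needed : 0;
  return (wb->dest != NULL) ? 0 : -1;
}
```

Calling a helper like this before every frame covers a resolution increase: the larger frame size simply triggers a new allocation.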
Wan-Teh Chang 08781b2e51 Add high bit depths, 4:2:2, 4:4:4 to VP9Encoder
A port of the changes to vp9_encoder_fuzz_test.cc in
https://chromium-review.googlesource.com/c/chromium/src/+/5292940.

Change-Id: Ie143ffd9cffbd6a8639812c72e85c9a017aa554e
(cherry picked from commit 8c36d36bcc)
2024-03-26 19:20:29 +00:00
Gerda Zsejke More d2ba3a22b4 Add 2D-specific highbd SVE2 horizontal convolution function
2D 8-tap convolution filtering is performed in two passes -
horizontal and vertical. The horizontal pass must produce enough
input data for the subsequent vertical pass - 3 rows above and 4 rows
below, in addition to the actual block height.

At present, all highbd SVE horizontal convolution algorithms process
4 rows at a time, but this means we end up doing at least 1 row too
much work in the 2D first pass case where we need h + 7, not h + 8
rows of output.

This patch adds an additional SVE2 path that processes h + 7 rows of
data exactly, saving the work of the unnecessary extra row.

Change-Id: I2f5d39ad737dbd7eccb08dd2b51586c6710119b8
2024-03-26 19:00:26 +00:00
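Note on d2ba3a22b4: the row arithmetic in the message can be written out as below. The helper names are hypothetical and only restate the numbers given above (3 rows above, 4 rows below, kernels that work in groups of 4 rows).

```c
/* Rows the horizontal pass must output so that an 8-tap vertical pass
 * can filter a block of height h: 3 rows above + h + 4 rows below. */
static int rows_needed_for_2d(int h) { return h + 3 + 4; }

/* Rows actually produced by a kernel that always processes 4 rows
 * at a time: the row count rounded up to a multiple of 4. */
static int rows_in_groups_of_4(int rows) { return (rows + 3) / 4 * 4; }
```

For a block height of 8 this gives 15 rows needed but 16 rows computed, which is the one wasted row the dedicated h + 7 path avoids.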
Gerda Zsejke More cd9d72c065 Add SVE2 implementation of vpx_highbd_convolve8
Add SVE2 implementation of vpx_highbd_convolve8 function. Add the
corresponding tests as well.

Change-Id: I783cc083f1bce5f13ce721bc191b34c48033f5ae
2024-03-26 19:00:26 +00:00
Jerome Jiang 219d7e6a0c Fix several clang-tidy complaints
Change-Id: I78721d6b7ed692ad9363b5cac4e3324a3136d5b6
(cherry picked from commit 4c2435c33e)
2024-03-26 17:42:20 +00:00
Wan-Teh Chang c9bd573531 Replace "cpi->common." with "pc->"
If a local variable "pc" is defined as &cpi->common, replace
"cpi->common." with "pc->".

Also replace a memcpy() call with a struct assignment.

Change-Id: I6f4f12e69d9989beaa6e04c83d93230e7d726278
2024-03-25 14:46:50 -07:00
Wan-Teh Chang 4f579df337 Declare VP9BitstreamWorkerData dest_size as size_t
Declare the dest_size member of the VP9BitstreamWorkerData struct as
size_t instead of int.

Fix the following MSVC warning:
vp9\encoder\vp9_bitstream.c(1031,37): warning C4267: '=':
conversion from 'size_t' to 'int', possible loss of data

Change-Id: Idab5ad5d4bf4d1e4754f011a3073c9a89da29f55
2024-03-23 11:20:22 -07:00
Wan-Teh Chang e387187438 Add the buffer_end field to the vpx_writer struct
The buffer_end field will allow bounds checking when vpx_writer writes
to the output buffer. This CL sets up the plumbing to pass the output
buffer size from vp9_pack_bitstream() to vpx_start_encode(), which
initializes the vpx_writer struct. vpx_writer doesn't use the output
buffer size in bounds checks yet, but the code in vp9_bitstream.c does.

Bug: webm:1844
Change-Id: I995e469ab453c02d740f54b46e0b08c7f2eb1a2e
2024-03-23 02:15:42 +00:00
James Zern 9137f7fa4b rtcd.pl: add empty specialize() check
This was added in libaom in:
5ddac0aac8 RTCD defs: Remove empty specialize statements once and for all.
https://aomedia-review.googlesource.com/c/aom/+/9062

Change-Id: I9c8fb0c8e4bd4dc9373d8533ab083dff816e7cbe
2024-03-22 09:45:42 -07:00
Wan-Teh Chang d48577579b Pass output buffer size to vp9_pack_bitstream()
Set up the plumbing to pass the size of the output buffer `dest` to
vp9_pack_bitstream(). The output buffer is the cx_data buffer in the
encoder_encode() function in vp9/vp9_cx_iface.c, and its size is
cx_data_sz.

In this CL vp9_pack_bitstream() ignores the `dest_size` parameter.

Bug: webm:1844
Change-Id: I53c80280143d409cf16f87c4d6deec3d9338aea3
2024-03-21 18:59:58 -07:00
James Zern cab4f31e1d encodeframe.c: remove some unused includes
clears some clang-tidy warnings

Change-Id: I82c1d212126b9c7b010b6bc8ac32d92453f6d376
2024-03-21 01:59:48 +00:00
James Zern 3c58cb1bc2 VP8_COMMON: remove unused cpu_caps member
This was set, but not read; rtcd covers this.

Change-Id: I1d8b8f8d8ed9e7bc56c3734cb96b79b937b5e20c
2024-03-20 13:14:35 -07:00
Wan-Teh Chang 6e879c6173 Save encode_tiles_buffer_alloc_size() result
Avoid calling encode_tiles_buffer_alloc_size() twice by saving its
return value in a local variable.

Change-Id: I3050f9cf7c3520f7edc80abf66620ba233fadad8
2024-03-20 20:03:57 +00:00
James Zern afc8b452b8 aarch64_cpudetect: add missing include
clears clang-tidy warning for HAS_*

Change-Id: I1b21326480b7c5c3be18c055f848071df0076915
2024-03-20 18:24:37 +00:00
James Zern 55d4b736b2 vpx_scaled_convolve8_neon: add missing include
clears clang-tidy warnings for types and constants in vpx_filter.h

Change-Id: I1f3f843b9ab6fd0ad038e33a048e8708cbd2a950
2024-03-20 18:24:37 +00:00
Jerome Jiang 6358ef6261 vp9 rc: Add ref frame list for each frame in GOP
Bug: b/329483680
Change-Id: I24573c25e70b41b7af243c473f28fa1b290cc373
2024-03-20 12:17:52 -04:00
Jerome Jiang 18059190d7 Remove vpx_rc_gop_info_t
Not being used anywhere any more. This was used to pass
GOP info to ML models.

Bug: b/617172914
Change-Id: Ibcfa00bc8215b73f43d9a11edbe00b6a2d7fb137
2024-03-20 16:17:23 +00:00
Jerome Jiang 6641e9e03c Add update type and ref update idx to gop decision
Bug: b/329483680
Change-Id: Ifc82ad79415400bbec4efe6ab9b78496d5f73ee7
2024-03-20 16:17:23 +00:00
Jerome Jiang 458e1c6875 Remove TPL IO functions
These are not used in libvpx.

Bug: webm:1837
Change-Id: Ic234b2dcd47d6614030b8a066c921e4285af5e99
2024-03-19 19:05:24 +00:00
Marco Paniconi 19832b1702 vp9: fix to integer overflow test
failure for the 16k test: issue introduced
in: c29e637283

Bug: b/329088759, b/329674887, b/329179808

Change-Id: I88e8a36b7f13223997c3006c84aec9cfa48c0bcf
2024-03-17 10:07:34 -07:00
Marco Paniconi c29e637283 Fix to buffer alloc for vp9_bitstream_worker_data
The code was using bitstream_worker_data when the buffer had not
been allocated with a large enough size. This is because
the existing condition was to only re-alloc the
bitstream_worker_data when current dest_size was larger
than the current frame_size. But under resolution change
where frame_size is increased, beyond the current dest_size,
we need to allow re-alloc to the new size.

The existing condition to re-alloc when dest_size is
larger than frame_size (which is not required) is kept
for now.

Also increase the dest_size to account for image format.

Added tests, for both ROW_MT=0 and 1, that reproduce
the failures in the bugs below.

Note: this issue only affects the REALTIME encoding path.

Bug: b/329088759, b/329674887, b/329179808

Change-Id: Icd65dbc5317120304d803f648d4bd9405710db6f
2024-03-15 21:43:28 +00:00
Wan-Teh Chang 7fb8ceccf9 Restrict ranges of duration,deadline to UINT32_MAX
Bug: webm:1828
Change-Id: I3b1d208cf300d7c4c5584681183d45b1e97c7380
2024-03-15 01:11:50 +00:00
Wan-Teh Chang bc5a22eb60 Replace timestamp_ratio by oxcf->g_timebase_in_ts
Fix a TODO comment in encoder_init().

Change-Id: Id737142202c807a3f538fdf50612e77ca790990c
2024-03-14 14:32:44 -07:00
Wan-Teh Chang 6c0bf97a98 Detect integer overflows related to pts & duration
A port of the following two libaom CLs:
https://aomedia-review.googlesource.com/c/aom/+/187902
https://aomedia-review.googlesource.com/c/aom/+/188161

Bug: webm:1828
Change-Id: Id25039b000c3d04e7a4c8d71579a6932e9fd65ef
2024-03-14 11:14:25 -07:00
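Note on 6c0bf97a98: the log does not show the actual checks, so the following is only a generic sketch of how a timestamp multiplication can be tested for overflow before it is performed. `scale_pts()` and its parameters are hypothetical and are not taken from libvpx or libaom.

```c
#include <stdint.h>

/* Computes pts * num into *out if the product fits in int64_t.
 * Returns 1 on success, 0 if the inputs are out of range or the
 * product would overflow. Assumes pts >= 0 and num > 0 are the only
 * valid inputs (an assumption made for this sketch). */
static int scale_pts(int64_t pts, int64_t num, int64_t *out) {
  if (pts < 0 || num <= 0) return 0;
  /* Divide instead of multiplying so the check itself cannot overflow. */
  if (pts > INT64_MAX / num) return 0;
  *out = pts * num;
  return 1;
}
```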
Wan-Teh Chang 8c36d36bcc Add high bit depths, 4:2:2, 4:4:4 to VP9Encoder
A port of the changes to vp9_encoder_fuzz_test.cc in
https://chromium-review.googlesource.com/c/chromium/src/+/5292940.

Change-Id: Ie143ffd9cffbd6a8639812c72e85c9a017aa554e
2024-03-13 13:45:41 -07:00
Jonathan Wright ad1d0ece31 Disable SVE2 if compiler doesn't support arm_neon_sve_bridge.h
SVE and SVE2 code paths in libvpx require intrinsics from
arm_neon_sve_bridge.h. SVE is disabled if the compiler does not
support this header. This patch conditionally disables SVE2 in the
same way.

Also gate the check for arm_neon_sve_bridge.h on whether SVE is
enabled in the first place. The check isn't necessary if the user has
explicitly disabled SVE. (Explicitly disabling SVE already disables
SVE2 since the former is a pre-requisite for the latter.)

Change-Id: Ibb21f09e8b2470d1ce5d98b71b101f5b7f7dbcdc
2024-03-13 18:38:10 +00:00
James Zern c1494fa57e neon: fix -Woverflow warnings
with signed char values, 128 -> -128

Change-Id: Iec2257729d7878459794d6a3d6bc3f745d39e97c
2024-03-13 18:36:14 +00:00
Wan-Teh Chang daa33cca37 Remove return statement after vpx_internal_error()
In encoder_encode(), remove the return statement after a
vpx_internal_error() call because setjmp() has been called at that
point.

Change-Id: Ib8ebbfbacb21097ce7f1b4e3bf53004bbe88a42b
2024-03-13 00:57:33 +00:00
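Note on daa33cca37: the reason the return statement is dead code can be sketched as below. This is an illustration under assumptions, not the libvpx source: `error_info`, `internal_error()` and `encode()` are hypothetical stand-ins, and the sketch assumes the error routine jumps back to the setjmp() site whenever the setjmp flag is set.

```c
#include <setjmp.h>

/* Hypothetical error context (illustration only). */
struct error_info {
  int setjmp_called; /* nonzero while jmp holds a valid jump target */
  int error_code;
  jmp_buf jmp;
};

/* Stand-in for an internal error routine: records the code and, if a
 * jump target is armed, does not return to its caller. */
static void internal_error(struct error_info *info, int code) {
  info->error_code = code;
  if (info->setjmp_called) longjmp(info->jmp, 1);
}

static int encode(struct error_info *info, int bad_input) {
  if (setjmp(info->jmp)) {
    /* Reached via longjmp(): reset the flag before returning. */
    info->setjmp_called = 0;
    return info->error_code;
  }
  info->setjmp_called = 1;
  if (bad_input) {
    internal_error(info, 8);
    /* Not reached: control resumed at setjmp() above, so a return
     * statement here would be dead code. */
  }
  info->setjmp_called = 0; /* reset on the normal path as well */
  return 0;
}
```

Both exits clear the flag, so a later error raised outside this function cannot jump into a stack frame that no longer exists.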
Wan-Teh Chang 0ba7b50338 Ignore the pts parameter when flushing the encoder
Change-Id: I8380a4a7ffcbf7a6f183d02d363473273b47f064
2024-03-12 11:02:30 -07:00
Wan-Teh Chang 7f6ba04e87 Move the local variable sd to the innermost scope
Change-Id: Ie6bfda6247ba408b9dbcf0b94fa95dbca0c57adb
2024-03-11 17:07:39 -07:00
James Zern cf1b7a65ff VP9RateControlRtcConfig: relocate some initializations
use default member initializers for all members for consistency.

Change-Id: I1956163c995d94aadbde38b4edaf21dc722e50c4
2024-03-11 18:50:00 +00:00
James Zern 0af7244971 ratectrl_rtc.h: remove use of vp9_zero()
This is an internal define that shouldn't be exposed in this header.

Change-Id: I43b793ab18c19ffab8bcc71fcd7097216989ca5a
2024-03-11 18:50:00 +00:00
James Zern ca7fd396e7 ratectrl_rtc.h: move some includes to .cc
This is the first step in removing the use of internal headers in a
public header.

Change-Id: Ia71b0b16a01037baa72942fc8ee7aeb4ffc04b86
2024-03-11 18:50:00 +00:00
James Zern 0ba67bb93d *ratectrl_rtc.h: remove unneeded 'public:'
in struct VP8RateControlRtcConfig and struct VP9RateControlRtcConfig;
structs default to public access.

Change-Id: Icdc5b44fb4c7297b0cb3c6cde8bec33ea5cee18c
2024-03-11 18:50:00 +00:00
James Zern cd88d25c53 vp8_ratectrl_rtc.cc: fix include order
vp8/vp8_ratectrl_rtc.h should come first as it's implemented in this
module. Split the rest of the groups on C/C++/vpx bounds.

Change-Id: If6bbbd8f3adf3766fa36fbc53ae06c9f6f76ebe9
2024-03-11 18:48:31 +00:00
Gerda Zsejke More 5391609fbe Add SVE2 implementation of vpx_highbd_convolve8_avg_vert
Add SVE2 implementation of vpx_highbd_convolve8_avg_vert function.
Add the corresponding tests as well.

Change-Id: I20ca19e09a1686bb00c0b51bf756ddab0adbc2c0
2024-03-11 18:43:35 +00:00
Gerda Zsejke More 45ea306dad Add SVE implementation of vpx_highbd_convolve8_avg_horiz
Add SVE implementation of vpx_highbd_convolve8_avg_horiz function.
Add the corresponding tests as well.

Change-Id: If13793fa653834dfdfeddfee60b80129eea85dd7
2024-03-11 18:43:35 +00:00
Gerda Zsejke More 2c3a9b69e7 Add SVE2 implementation of vpx_highbd_convolve8_vert
Add SVE2 implementation of vpx_highbd_convolve8_vert function. Add
the corresponding tests as well.

Change-Id: I289ac79d4493935217feaa4fd2fa0b8ef9a62972
2024-03-11 18:43:35 +00:00
Gerda Zsejke More 282e9aa0eb Add Arm SVE2 build flags and run-time CPU feature detection
Add 'sve2' arch options to the configure, build and unit test files -
adding appropriate conditional options where necessary. Arm SIMD
extensions are treated as supersets in libvpx, so disable SVE2 if
SVE is unavailable.

Change-Id: Icdec2aace357e36fba77c77cd8b70da1e5427fce
2024-03-11 18:43:35 +00:00
Wan-Teh Chang a87978a53d VP8: Always reset the setjmp flag before returning
Always reset the setjmp flag to 0 before returning from the function
where setjmp() was called.

Change-Id: I80bf39ef1769f656f53c6c6657c06e34489750f4
2024-03-11 16:45:59 +00:00
Wan-Teh Chang f51417671e Include system headers first
Change-Id: Ia096dacb3dd102829196e5ebd1bc148cf2ea2f93
2024-03-09 18:27:18 +00:00
James Zern 03c7f6a108 libs.doxy_template: remove DOT_TRANSPARENT
This was deprecated in 1.9.5 [1]. It is now enabled by default. For
earlier versions of doxygen this will set the value to false, but I
don't believe we were relying on this functionality.

[1]: https://www.doxygen.nl/manual/changelog.html#log_1_9_5

Change-Id: I75f576d35ca86636761cf70fda0dd0ad37f71d71
2024-03-09 02:33:16 +00:00
Wan-Teh Chang f46d99bcf7 Clear dangling ptr in vp8_remove_decoder_instances
Change-Id: I80c7d41c4675305efbbfbaddd45b42122979b318
2024-03-09 01:02:58 +00:00
Wan-Teh Chang ec06dcc314 Subtract pts_offset from pts after calling setjmp
This allows us to call vpx_internal_error() if the relative pts would be
negative.

Change-Id: I9ca314c4e32bb2c17bbe20ede6ea854bf9701ade
2024-03-08 14:07:38 -08:00
James Zern 99e887c09e vp8/encoder/encodeframe.c: sort includes
Change-Id: I30a8117754e8168a3f6fe37c4ea459475ad1b9aa
2024-03-07 19:57:44 -08:00
Wan-Teh Chang a6647c9cab Add vp8_ prefix to sem_* macro names
The sem_* macros do not behave exactly like the POSIX sem_* functions.
Add the vp8_ prefix to the sem_* macro names to make it clear that they
are not the POSIX sem_* functions. Another reason for adding the vp8_
prefix is that we need to wrap sem_wait() (to handle EINTR) on the Unix
platforms that have real sem_wait() function.

Handle EINTR in the Unix (non-Apple) definition of vp8_sem_wait().

Change-Id: I3df02a30f851d41691a55cf7a84aa2ff054bba9c
2024-03-08 01:46:59 +00:00
Jonathan Wright 6b6916be0a Refactor standard bitdepth Neon scaled convolve
Tidy up the standard bitdepth Armv8.0 Neon implementation of scaled
convolution.

Change-Id: I9e48e773b4a4b252b9254a22af23c8e834407b8a
2024-03-08 00:39:58 +00:00
Jonathan Wright 9b94b7bd01 Optimize Arm Neon implementation of transpose_u8_8x8()
Operate on 128-bit vectors to reduce the total number of instructions
by two.

Change-Id: I252e67831ccbb51adcfe5caaadb3205d3eb11b79
2024-03-08 00:39:58 +00:00
James Zern fa64af7bbf vp8/encoder/encodeframe.c: add missing include
Based on a clang-tidy warning:
  `no header providing "sem_wait" is directly included`
Though this may not clear it entirely, it's the closest that can be
done given the platform-dependent includes and implementation in
vp8/common/threading.h

Change-Id: I19984f820f3f380e58deef40563a2f0c66187748
2024-03-07 13:53:56 -08:00
James Zern 1f066bf77c build/make/Android.mk: update configure/build comments
set --target to the more modern aarch64-android-gcc and remove an
incorrect comment regarding realtime-only.

Change-Id: I5f6c9de9fcd96a60817e37fc6f6505725ddea6b9
2024-03-07 00:40:17 +00:00
George Steed b0e26cdcfd aarch64_cpudetect.c: Avoid unused variable warning
When dot-product and SVE support are disabled the hwcap variable is
currently unused. Fix this by wrapping it in an #ifdef matching the
conditions where it is needed.

Change-Id: I1c2e302d861c6c726b314e374f07d4fafe17ffc7
2024-03-06 18:52:38 +00:00
Jerome Jiang 148c7f65f0 IWYU: fix clang-tidy complaints
Include vp9_firstpass.h for KF_UPDATE

Change-Id: Ie1805a2201f3c42c7d3a0102e4eaa0378cca315e
2024-03-06 10:11:46 -05:00
Daniel Cheng b207d1c9bd Only #define __builtin_prefetch if it doesn't exist.
libvpx's check for conditionally defining __builtin_prefetch is broken,
since clang-cl defines __builtin_prefetch on Win ARM64: in addition, it
supports up to 3 arguments, with the latter 2 being optional. This
causes build breaks when paired with other libraries, like Abseil, which
do perform the conditional test correctly.

The real fix here is to define something like VPX_PREFETCH rather than
trying to #define an implementation-reserved name, which is undefined
behavior.

Bug: 328105513
Change-Id: Ibe14d9ce34306654bd20e560973f76c3b40036ee
2024-03-04 18:11:26 -08:00
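Note on b207d1c9bd: the "real fix" described above could look roughly like the sketch below. VPX_PREFETCH is only the name suggested in the message; this definition is an assumption for illustration and does not exist in the commit. `sum_with_prefetch()` is a hypothetical caller.

```c
/* Project-prefixed prefetch macro: never redefines the reserved name
 * __builtin_prefetch, only wraps it when the compiler provides it. */
#if defined(__has_builtin)
#if __has_builtin(__builtin_prefetch)
#define VPX_PREFETCH(addr) __builtin_prefetch(addr)
#endif
#endif
/* Older GCC versions lack __has_builtin but do have the builtin. */
#if !defined(VPX_PREFETCH) && defined(__GNUC__)
#define VPX_PREFETCH(addr) __builtin_prefetch(addr)
#endif
/* Fallback: evaluate the argument and do nothing. */
#ifndef VPX_PREFETCH
#define VPX_PREFETCH(addr) ((void)(addr))
#endif

/* Sums an array while hinting at data a few elements ahead. */
static int sum_with_prefetch(const int *v, int n) {
  int i, sum = 0;
  for (i = 0; i < n; ++i) {
    if (i + 16 < n) VPX_PREFETCH(&v[i + 16]);
    sum += v[i];
  }
  return sum;
}
```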
Jerome Jiang a571299b07 vp9 ext rc: Do motion search on key frame in TPL
Bug: b/327254742
Change-Id: I7448c09994441c89c36420e780cd2641c6f1aa5a
2024-03-04 21:54:11 +00:00
Jonathan Wright 9d8d71b41b Refactor Arm Neon transpose_concat_*() to not need lookup table
Refactor the transpose_concat_*() helper function used in the Arm Neon
DotProd and I8MM vertical convolution implementations to not use TBL
instructions. Using vzip* to achieve the same outcome (with the same
number of instructions) avoids needing/loading the lookup indices and
also increases performance on little (in-order) Arm Cortex cores.

Change-Id: Iff62a44f8a9bf0ee239d5bb36be8424cab0dbca5
2024-03-04 20:24:03 +00:00
Jonathan Wright 5a8e2f705e Cosmetic: Remove 'vpx_' prefix from static Neon functions
Tidy up some of the naming in Arm Neon convolution functions.

Change-Id: I9cfd925dbcb754bdf9fe0860a46a1c9dca2c7f9a
2024-03-04 20:24:03 +00:00
Cheng Chen f394f2be74 Delete "public" from struct definitions
Struct by default is public.

Change-Id: I87dc164d6a63fcc950c6e513901fc2826e53a8ae
2024-02-29 14:16:55 -08:00
Wan-Teh Chang d4959f9825 Handle EINTR from sem_wait()
sem_wait() may be interrupted by a signal and fail with EINTR:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/sem_wait.html

Retry the sem_wait() call if it fails with EINTR.

This finishes the fix started in
https://chromium-review.googlesource.com/c/webm/libvpx/+/5299569. As a
speculative fix, that CL fixed only the sem_wait(&cpi->h_event_end_lpf)
calls responsible for bug chromium:324459561. ClusterFuzz verified the
fix, so this CL extends it to the other sem_wait() calls.

Note that sem_wait() calls like the following do not need this fix,
because the while (1) loop retries the sem_wait() call if it fails:

  while (1) {
    if (vpx_atomic_load_acquire(&cpi->b_multi_threaded) == 0) break;

    if (sem_wait(&cpi->h_event_start_lpf) == 0) {
      ...
    }
  }

Bug: chromium:324459561
Change-Id: I0f0612616eee37fb3da68049e49b3e86927b5e24
2024-02-28 21:06:36 +00:00
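Note on d4959f9825: for call sites that are not already inside a retry loop, the EINTR handling can be sketched as a small wrapper. `sem_wait_nointr()` is a hypothetical name used for illustration; the commit's actual code is not shown in this log.

```c
#include <errno.h>
#include <semaphore.h>

/* Waits on sem, retrying when the wait is interrupted by a signal.
 * Returns 0 on success, or -1 with errno set for any other failure. */
static int sem_wait_nointr(sem_t *sem) {
  int ret;
  while ((ret = sem_wait(sem)) != 0 && errno == EINTR) {
    /* Interrupted by a signal handler: try again. */
  }
  return ret;
}
```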
George Steed 793c0b9196 Only enable AArch64 extensions if the compiler supports them
We already have some logic in the configure.sh file to selectively
disable code dependent on particular architecture extensions, however we
do not yet have anything to check that the compiler being supplied
recognises and can compile code using these extensions.

This commit adds compiler "-march=..." flag tests to the existing
extension-disable loop so that we now correctly disable extensions that
are not supported by the compiler. For AArch64 this loop also needs to
move below the existing compiler/OS handling to ensure that prefixes
like $CROSS are handled correctly before running compiler tests.

Bug: webm:1841
Change-Id: I936b911c4b0ebf03abc34b7532b2bb4568129f57
(cherry picked from commit fa50b26848)
2024-02-28 02:46:29 +00:00
Gerda Zsejke More 3ac1316c46 Require Arm Neon-SVE bridge header for enabling SVE
Disable SVE feature if arm_neon_sve_bridge header is not supported
by the compiler.

Change-Id: I3f78be2dd95b37b8d51b9f1fceca1f9701535eca
(cherry picked from commit 6ea3b51ec2)
2024-02-28 02:45:59 +00:00
Wan-Teh Chang b7b5d0a568 Use the value param in Win32 version of sem_init
Name the three parameters of sem_init() as sem, pshared, value. See
https://pubs.opengroup.org/onlinepubs/9699919799/functions/sem_init.html.

Pass the `value` parameter to CreateSemaphore() as the second
(lInitialCount) parameter:
https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-createsemaphorea

Remove unneeded parentheses around semaphore_wait(*sem).

Change-Id: I1735c94adb511ca539159dfea19421595ec15d24
2024-02-27 13:44:23 -08:00
Wan-Teh Chang 7b9843099c Handle EINTR from sem_wait(&cpi->h_event_end_lpf)
sem_wait() may be interrupted by a signal and fail with EINTR:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/sem_wait.html

Retry the sem_wait(&cpi->h_event_end_lpf) call if it fails with EINTR.

Bug: chromium:324459561
Change-Id: Icc957e8b9f21f25ec3c95e22cab502af417443f2
(cherry picked from commit d63efe0679)
2024-02-27 12:11:58 -08:00
Marco Paniconi 4c80888a71 vp8: Fix to race issue for multi-thread with psnr_calc
Added a unit test which triggers the data race in the
bug below, when only C code is forced.

The data race is between the loopfilter and variance
computation from generate_psnr_packet calculation.
Proposed fix is to move the wait for loopfilter thread to
finish up before entering generate_psnr_packet().

Bug: b/266833179.

Change-Id: Id2871c53274be0f404e65601c9a5c98aaead0c72
(cherry picked from commit 756b29a776)
2024-02-27 19:57:23 +00:00
George Steed fa50b26848 Only enable AArch64 extensions if the compiler supports them
We already have some logic in the configure.sh file to selectively
disable code dependent on particular architecture extensions, however we
do not yet have anything to check that the compiler being supplied
recognises and can compile code using these extensions.

This commit adds compiler "-march=..." flag tests to the existing
extension-disable loop so that we now correctly disable extensions that
are not supported by the compiler. For AArch64 this loop also needs to
move below the existing compiler/OS handling to ensure that prefixes
like $CROSS are handled correctly before running compiler tests.

Bug: webm:1841
Change-Id: I936b911c4b0ebf03abc34b7532b2bb4568129f57
2024-02-27 19:53:03 +00:00
Gerda Zsejke More 3646b12927 Specialise highbd_convolve8_horiz_sve for 4-tap filter
Add SVE implementation for vpx_highbd_convolve8_horiz that specialises
for 4-tap filters. This way we avoid a lot of redundant work to
multiply and add zero, given that some of the 8-tap filters are
zero-padded, so they are effectively 4-tap filters.

Change-Id: Ib5e0377f924df1d893e9436f443fcbe7d196ea27
2024-02-27 19:38:09 +00:00
Gerda Zsejke More c78f1ef4a0 Rename dot_neon_sve_bridge header file
Rename dot_neon_sve_bridge.h to vpx_neon_sve_bridge.h in order to
reflect that other instructions can be implemented in the header
file. In a subsequent patch, the usage of vtbl with Neon-SVE bridge
intrinsics will be added.

Change-Id: I8f71aad2b7fb4932c9554badf041a80aca58c7cf
2024-02-27 19:38:09 +00:00
Jonathan Wright 2bc1012c53 Remove redundant code for neon_dotprod 2D convolution
Remove the 4-tap Neon DotProd path for the horizontal pass of 2D
convolution since it has been made redundant by the horizontal-
vertical merged implementation. Also move the 8-tap path closer to
where it is used and call it explicitly rather than the filter-
agnostic wrapper.

Change-Id: I1861dc88a67a759c3e8deb0b471ec447a62063f2
2024-02-26 22:28:05 +00:00
Jonathan Wright baece7460d Merge h. and v. passes in 4-tap SBD Neon DotProd 2D convolution
The current SBD Neon DotProd approach to 2D convolution is:
1) Filter horizontally, storing to an intermediate buffer.
2) Filter vertically and store the final output.

This patch merges the two phases for 4-tap standard bitdepth 2D
convolution to avoid storing to and re-loading from the intermediate
buffer - giving a 10-25% speedup depending on block size. Merging the
passes for 8-tap filters does not have the same benefit, so keep the
existing implementation.

Change-Id: Ic6008836d1a499ee2cd957b9db194fca5671ccb4
2024-02-26 22:28:05 +00:00
Jonathan Wright d191c5f984 Remove redundant code for neon_i8mm 2D convolution
Remove the 4-tap Neon i8mm path for the horizontal pass of 2D
convolution since it has been made redundant by the horizontal-
vertical merged implementation. Also move the 8-tap path closer to
where it is used and call it explicitly rather than the filter-
agnostic wrapper.

Change-Id: Icddecb7e133656c54aa5e79536b49759715b6fcb
2024-02-26 20:59:41 +00:00
Jonathan Wright cef5b0da97 Merge h. and v. passes in 4-tap SBD Neon i8mm 2D convolution
The current SBD Neon i8mm approach to 2D convolution is:
1) Filter horizontally, storing to an intermediate buffer.
2) Filter vertically and store the final output.

This patch merges the two phases for 4-tap standard bitdepth 2D
convolution to avoid storing to and re-loading from the intermediate
buffer - giving a 5-40% speedup depending on block size. Merging the
passes for 8-tap filters does not have the same benefit, so keep the
existing implementation.

Change-Id: Ic8ec2822681176ef879dcaf8424d8d91c5e8d2df
2024-02-26 20:59:41 +00:00
James Zern a3209600f2 codec_factory.h: fix -Wpedantic warnings
With either CONFIG_VP8=0 or CONFIG_VP9=0. Fixes a warning about an extra
';' outside of a function due to VP[89]_INSTANTIATE_TEST_SUITE() being
defined to nothing.

Change-Id: I1878d7596e39c5166efbe96450a733efc08665ea
2024-02-26 20:52:35 +00:00
Jerome Jiang b5578f1283 Add inter/intra_pred_err to VpxTplBlockStats
inter/intra_cost in VP9 TPL is calculated with SATD
which should be close enough to be used as inter/intra_pred_err

Bug: b/326262148
Change-Id: Ic0fd08708fcf3640398fc22a1a6bb6f449b2a9b8
2024-02-26 12:27:38 -05:00
Jerome Jiang ff9591f8df vp9 ext rc: assign srcrf_dist/rate instead
Bug: b/326262148
Change-Id: I3af0b5d28c58447862eb11d5b10afa8a32d82ada
2024-02-26 17:27:11 +00:00
James Zern 5433b943a4 resize_test.cc: fix warning w/CONFIG_VP9=0
fixes -Wunused-but-set-variable

Change-Id: Id9431342745baa1492f5da0e32d09372e10fdcd2
2024-02-24 00:53:28 +00:00
James Zern fca3d1755f fix void param declarations
These should be funcname(void) rather than funcname(). Quiets some
-Wstrict-prototypes warnings.

Change-Id: I68705fe53f4438c9584e7040c39cecec859af27c
2024-02-22 15:13:03 -08:00
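Note on fca3d1755f: the difference between the two declaration forms is shown below. `get_default_threads()` is a hypothetical function used only for illustration.

```c
/* Flagged by -Wstrict-prototypes: in C before C23, empty parentheses
 * declare a function with unspecified parameters, not "no parameters",
 * so the compiler cannot reject a call that passes arguments:
 *
 *   int get_default_threads();
 */

/* Preferred: (void) is a real prototype, so a call such as
 * get_default_threads(3) is rejected at compile time. */
static int get_default_threads(void) { return 4; }
```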
James Zern 5e90a97fa2 tokenize.h: remove undefined vp8_tokenize_initialize()
It was removed in:
f039a85fd Make global data const

Change-Id: Ib5aa35500c3ee7caf1ec216e0351c32ef373f5f2
2024-02-22 13:33:26 -08:00
Marco Paniconi 79284f4c84 vp8: add uv_delta_q support to external RC
Bug: b/321137490

Change-Id: Id9ccf8e80d693b296b224846094fc7c0f71c5d0a
2024-02-22 17:31:40 +00:00
James Zern 1659e73b0b vp9_context_tree.h: add name to union
Anonymous unions are not supported in C99, they were added in C11:
https://en.cppreference.com/w/c/language/union

Fixes -Wpedantic warning:
vp9/encoder/vp9_context_tree.h:93:4: warning: ISO C99 doesn’t support
  unnamed structs/unions [-Wpedantic]

Change-Id: Ibd29d6deca35d81ea886e80e9f44575c73ecd96d
2024-02-21 23:15:19 +00:00
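Note on 1659e73b0b: naming the union member is what makes the declaration valid C99. The sketch below uses a hypothetical `node` struct, not the struct from vp9_context_tree.h.

```c
/* C11 only (anonymous union, shown as a comment):
 *
 *   struct node { int is_leaf; union { int value; int child; }; };
 *   n.value = 1;
 */

/* C99-compatible: the union is a named member, accessed through u. */
struct node {
  int is_leaf;
  union {
    int value; /* used when is_leaf is nonzero */
    int child; /* used when is_leaf is zero */
  } u;
};

static int node_payload(const struct node *n) {
  return n->is_leaf ? n->u.value : n->u.child;
}
```

The cost of the change is the extra `.u` at every access site.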
James Zern 5d022e45ec vp9_rdopt,skip_iters: normalize use of const
Fixes a -Wpedantic warning:
vp9/encoder/vp9_rdopt.c:1988:20: warning: invalid use of pointers to
  arrays with different qualifiers in ISO C before C2X [-Wpedantic]

Change-Id: I581e21d7e59c0bae0e44056a3b3f049c5a4e7cf2
2024-02-20 13:56:59 -08:00
Gerda Zsejke More 9c0c5144e7 Add SVE implementation of vpx_highbd_convolve8_horiz
Add SVE implementation of vpx_highbd_convolve8_horiz function. Add
the corresponding tests as well.

Change-Id: I0b2815831daf203e167ea5289307087ce53ff9da
2024-02-20 19:14:56 +00:00
Jonathan Wright 7e9da9702c Use Armv8.0 Neon 4-tap vertical convolution for all arch levels
The new Armv8.0 Neon implementation of 4-tap vertical convolution is
faster than Armv8.4 DotProd and Armv8.6 I8MM implementations. This
patch removes the DotProd and I8MM implementations in favour of using
the Armv8.0 version everywhere.

Change-Id: I126470fd4862d8bb116153e90bb2e4f2f2dba1e4
2024-02-20 15:22:35 +00:00
Jonathan Wright 9f7a70bdf2 Further accelerate Armv8.0 Neon 4-tap convolution
Refactor Armv8.0 Neon 4-tap convolution functions to operate on 8-bit
types directly, rather than first widening to 16-bit.

2-tap (bilinear) filter values are always positive, but 4-tap filter
values are negative on the outer edges (taps 0 and 3), with taps 1
and 2 having much greater positive values to compensate. To use
instructions that operate on 8-bit types we also need the types to be
unsigned. In the convolution kernel, subtracting the products of taps
0 and 3 from the products of taps 1 and 2 always works since 2-tap
filters are 0-padded.

Co-authored-by: Hari Limaye <hari.limaye@arm.com>

Change-Id: I87b32e2ef8cbd21eebb8cd2642e8826b704905b1
2024-02-20 15:22:22 +00:00
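Note on 9f7a70bdf2: the sign handling described above can be illustrated with a scalar sketch. This is not the Neon code. `convolve4_px()` is a hypothetical helper, and it assumes 7-bit filter precision (taps summing to 128) and that taps 0 and 3 are never positive while taps 1 and 2 are never negative.

```c
#include <stdint.h>

/* Filters one pixel with a 4-tap kernel using only unsigned 8-bit
 * tap magnitudes: the outer products are subtracted, not added. */
static uint8_t convolve4_px(const uint8_t s[4], const int8_t taps[4]) {
  const uint8_t a0 = (uint8_t)(-taps[0]); /* |tap 0|, tap 0 <= 0 */
  const uint8_t a1 = (uint8_t)taps[1];
  const uint8_t a2 = (uint8_t)taps[2];
  const uint8_t a3 = (uint8_t)(-taps[3]); /* |tap 3|, tap 3 <= 0 */
  int32_t sum = (int32_t)(a1 * s[1] + a2 * s[2]) -
                (int32_t)(a0 * s[0] + a3 * s[3]);
  if (sum < 0) sum = 0;  /* negative results clamp to 0 after rounding */
  sum = (sum + 64) >> 7; /* round and drop the 7 fractional bits */
  if (sum > 255) sum = 255;
  return (uint8_t)sum;
}
```

A zero-padded 2-tap filter has taps 0 and 3 equal to 0, so the subtraction removes nothing and the same kernel serves both filter types.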
Wan-Teh Chang 4340382bb0 Move THREADFN macro definitions to vpx_pthread.h
The THREADFN and THREAD_EXIT_SUCCESS macros are used to define the
thread start routines passed to our implementation of pthread_create(),
so it makes sense to define these macros in vpx_util/vpx_pthread.h. This
also allows the VP8 and VP9 code to share the macro definitions.

Replace the THREAD_FUNCTION macro by THREADFN. They have the same
definition.

Change-Id: I79a7476e43652667af6a8da7ad7ce346b1b6b024
2024-02-16 09:46:39 -08:00
Wan-Teh Chang 4e384da53d Delete a duplicate definition of thread_sleep()
There are two identical definitions of thread_sleep() for Win32. Delete
the first one.

Change-Id: I617e180e3459a24fbafec5b060bbdcd4fcee8128
2024-02-15 20:31:25 -08:00
Wan-Teh Chang 3316c11240 Delete unused macro definitions
Change-Id: Ic12d5b9ec9b18743e8ef67d132ed3bbbc90c7fa6
2024-02-15 17:04:32 -08:00
Wan-Teh Chang d63efe0679 Handle EINTR from sem_wait(&cpi->h_event_end_lpf)
sem_wait() may be interrupted by a signal and fail with EINTR:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/sem_wait.html

Retry the sem_wait(&cpi->h_event_end_lpf) call if it fails with EINTR.

Bug: chromium:324459561
Change-Id: Icc957e8b9f21f25ec3c95e22cab502af417443f2
2024-02-16 00:14:40 +00:00
Jerome Jiang e1da3834ba Add base qp to ext rc config
Change-Id: I0eb7e0dbe3d1784c4408fdddf763d2b64c90fbb5
2024-02-15 13:08:46 -05:00
Peter Kasting e92dd05124 Add VPX_WORKER_STATUS_ to values of global-scope status enum.
This helps prevent name clashes if code e.g. #includes headers from both
libvpx and libaom.

Bug: none
Change-Id: Ifc9e7ac4862dc04a399e7777d2636e1453627970
2024-02-14 20:37:40 -08:00
Peter Kasting 4c0cf7458c Split pthread wrapper to vpx_pthread.h.
Also does a bit of cleanup to the THREAD macros as suggested in review.

Bug: none
Change-Id: I1fbfacf99b2439ac1147e346e53d72d7ee39c298
2024-02-14 18:54:43 +00:00
Jerome Jiang b01d61c9af Remove unused signals for get_encodeframe_decision
Bug: b/323234722
Change-Id: Iab5c27b232552f924b05fdd7fa1cd6792e04faed
2024-02-14 10:46:18 -05:00
Jerome Jiang 591c787436 vp9 ext rc: Remove initializer for gop_decision
Change-Id: Ie4ebcc39ab8c34631395ce81e2916c766c3a7f13
2024-02-13 23:21:18 +00:00
Jonathan Wright 455cb26998 Optimize Arm Neon USDOT narrowing sequences in convolve kernels
Currently we use two rounds of complex right-shift operations to
narrow and pack results from the dot-product convolution kernels.
This patch refactors these sequences to use one "simple" right-shift
and one complex right-shift - reducing the latency by 4 cycles on
modern out-of-order Arm CPUs.

Change-Id: I3fd38560bb14d85826e417f40d35f11165ab80da
2024-02-13 19:58:25 +00:00
Jonathan Wright 939bcd4026 Optimize Arm Neon SDOT narrowing sequences in convolve kernels
Currently we use two rounds of complex right-shift operations to
narrow and pack results from the dot-product convolution kernels.
This patch refactors these sequences to use one "simple" right-shift
and one complex right-shift - reducing the latency by 4 cycles on
modern out-of-order Arm CPUs.

Change-Id: I908147ed65a87157009363782399ff398406cdf9
2024-02-13 19:58:25 +00:00
Jerome Jiang a64bf87fb9 Fix gop decision and gop index in TPL pass
- Initialize gop_decision
- Initialize GF group for a new one
- GF group index for key frame special treatment is not needed any more
  when key frame is decided by the RC

Bug: b/323050877
Change-Id: Iaf36ea4f671b833f3ba4c524b9799a3093412dfa
2024-02-13 18:01:20 +00:00
Peter Kasting 8cf26c1284 Backport thread-related changes from libaom.
This ports changes that touched aom_thread.[c,h] from the time after
libaom copied libvpx' sources, where they hadn't already been present.
The goal here is to unify the two repos' thread implementations in hopes
of ultimately sharing one.

The list of commits is approximately as follows; however, I made a few
other changes as necessary where noted.
https://aomedia-review.googlesource.com/c/aom/+/64044
  Edited other hook func return values similarly.
https://aomedia-review.googlesource.com/c/aom/+/71321
https://aomedia-review.googlesource.com/c/aom/+/71327
https://aomedia-review.googlesource.com/c/aom/+/71436
https://aomedia-review.googlesource.com/c/aom/+/72481
  Also removed conflicting MAX_NUM_THREADS definition of 80. I think
  this was incorrect as the relevant array was indexed by variables that
  were in turn controlled by the global config values that were clamped
  to <=64.
https://aomedia-review.googlesource.com/c/aom/+/102621
  Also removed a pre-XP handling block in
  vp8/common/generic/systemdependent.c.
https://aomedia-review.googlesource.com/c/aom/+/102601
  MAX_DECODE_THREADS was the only relevant piece.
https://aomedia-review.googlesource.com/c/aom/+/102741
https://aomedia-review.googlesource.com/c/aom/+/109025
https://aomedia-review.googlesource.com/c/aom/+/160961
https://aomedia-review.googlesource.com/c/aom/+/169684
  Also removed OS/2 support from elsewhere.
https://aomedia-review.googlesource.com/c/aom/+/169823
https://aomedia-review.googlesource.com/c/aom/+/170022
https://aomedia-review.googlesource.com/c/aom/+/169685
https://aomedia-review.googlesource.com/c/aom/+/173761
https://aomedia-review.googlesource.com/c/aom/+/174842

Bug: none
Change-Id: I91462873a57e9efa120288d1bd8af3a6c09d423d
2024-02-12 18:45:00 -08:00
Jonathan Wright 491c16a9f3 Merge horiz. and vert. passes in HBD Neon 2D avg convolution
The current Neon approach to 2D convolution is:
1) Filter horizontally, storing to an intermediate buffer.
2) Filter vertically, average with the dst block and store the final
   output.

This patch merges the two phases for high bitdepth 2D convolution to
avoid the storing and re-loading from the intermediate buffer. This
provides a small gain (<5%) for large block sizes but the benefit
increases for small block sizes - as the proportion of compute to
memory access decreases. These effects are amplified further when
considering little (in-order) core performance.

Change-Id: I84f1cafcfbbfa48b2cfe4b20881da9c4bc3b56ac
2024-02-12 13:22:19 +00:00
Jonathan Wright 364326c37f Merge horiz. and vert. passes in HBD Neon 2D convolution
The current Neon approach to 2D convolution is:
1) Filter horizontally, storing to an intermediate buffer.
2) Filter vertically and store the final output.

This patch merges the two phases for high bitdepth 2D convolution to
avoid the storing and re-loading from the intermediate buffer. This
provides a small gain (<5%) for large block sizes but the benefit
increases for small block sizes - as the proportion of compute to
memory access decreases. These effects are amplified further when
considering little (in-order) core performance.

Change-Id: I8ec13fb9edd642fdb927bf5394a3c2a349d22a29
2024-02-12 12:58:45 +00:00
Jonathan Wright 58731e2b7a Specialise highbd Neon 2D horiz convolution for 4-tap filters
Add a highbd Neon implementation of the horizontal portion of 2D
convolution specialised for executing with 4-tap filters. This new
path is also used when executing with bilinear (2-tap) filters.

Change-Id: I513e35c4f8857bc89e0def5e9402bc31ddd46440
2024-02-09 17:05:47 +00:00
Jonathan Wright 3127962e71 Specialise highbd Neon vert convolution for 4-tap filters
Add a highbd Neon implementation of vertical convolution specialised
for executing with 4-tap filters. This new path is also used when
executing with bilinear (2-tap) filters.

Change-Id: I30469c7b8e6ccff31d96588a3e4c21b401f1ed09
2024-02-09 15:37:51 +00:00
Jonathan Wright 70b14bf4dc Specialise highbd Neon horiz convolution for 4-tap filters
Add a highbd Neon implementation of horizontal convolution specialised
for executing with 4-tap filters. This new path is also used when
executing with bilinear (2-tap) filters.

Change-Id: Icabeea295af3e0bbeda755168996668cb960b0de
2024-02-09 15:37:51 +00:00
Jonathan Wright b1c9bbeaae Remove unneeded assert in vpx_filter.h
Filter tap reporting was made more granular recently[1] to enable Arm
Neon optimizations that specialise convolution implementations
according to the filter size. This patch removes an assert that
should have been removed during that change - it no longer serves any
purpose to assert that the filter being used is a no-op filter.

This change is a pre-requisite for some highbd Neon convolution
changes that specialise implementations according to filter size.
(Without this change a convolve-copy test would fail should we
interrogate the size of the filter.)

[1] https://chromium-review.googlesource.com/c/webm/libvpx/+/5063929

Change-Id: I2a71680d27134535e6c0663b1668ba1b150b1a6f
2024-02-09 15:29:18 +00:00
Jonathan Wright 00135942da Add 2D-specific highbd Neon horizontal convolution function
2D 8-tap convolution filtering is performed in two passes -
horizontal and vertical. The horizontal pass must produce enough
input data for the subsequent vertical pass - 3 rows above and 4 rows
below, in addition to the actual block height.

At present, all highbd Neon horizontal convolution algorithms process
4 rows at a time, but this means we end up doing at least 1 row too
much work in the 2D first pass case where we need h + 7, not h + 8
rows of output.

This patch adds an additional Neon path that processes h + 7 rows of
data exactly, saving the work of the unnecessary extra row.

Change-Id: Id6658b4e9e774effc760ff131e188b6907a57676
2024-02-09 10:38:12 +00:00
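The row arithmetic in the commit above can be checked with a short sketch. The helper names are hypothetical; the only facts taken from the commit message are the 3 rows above, the 4 rows below, and the 4-rows-at-a-time processing.

```c
#include <assert.h>

/* Extra rows an 8-tap vertical filter needs around a block of height h,
 * as stated in the commit message. */
enum { EXTRA_ROWS_ABOVE = 3, EXTRA_ROWS_BELOW = 4 };

/* Rows the horizontal first pass must actually produce: h + 7. */
static int rows_required(int h) {
  return h + EXTRA_ROWS_ABOVE + EXTRA_ROWS_BELOW;
}

/* Rows produced when the first pass only works in groups of 4:
 * the required count rounded up to the next multiple of 4. */
static int rows_in_groups_of_4(int h) {
  return (rows_required(h) + 3) & ~3;
}
```

For a block height of 8 the first pass needs 15 rows, while a 4-row loop would compute 16, which is the single wasted row the commit removes.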
Jonathan Wright 08eb51bc1a Call scalar impl. immediately from HBD Neon 2D convolution
Call scalar C implementation of 2D convolution immediately if scaling
is required - instead of entering the Neon functions for the
horizontal and vertical passes and then falling back to the scalar
implementation. This has the benefit of being able to allocate a
smaller intermediate buffer.

Change-Id: Icacdd5f3a1401395951b613da1cd6932955bd0f8
2024-02-09 10:37:19 +00:00
Jonathan Wright ef3fd00c27 Refactor Neon highbd 2D convolution definitions and merge files
There's no reason for these files to be separate, and merging them
will make life easier in subsequent commits adding a horizontal pass
specialised for the first pass of 2D.

Also perform some refactoring for 2D convolution definitions:
- Add a comment deriving the intermediate buffer height.
- Align the intermediate buffers to 32 bytes.

Change-Id: Ib92524396e6f9c58295339de54d08d894ace3bd1
2024-02-08 17:49:45 +00:00
Jonathan Wright 72cc21e3a2 Refactor SBD Armv8.0 Neon horizontal convolve8 paths
Mostly a cosmetic change:
1) Remove forward declarations.
2) Remove excessive prefetches.

Change-Id: I88d42d8f9ee828c6c4095ffaec8e0333d776a4a1
2024-02-08 09:44:33 +00:00
Jonathan Wright 81ce6067cc Refactor SBD Armv8.0 Neon vertical convolve8 paths
Mostly a cosmetic change:
1) Remove forward declarations.
2) Remove excessive prefetches - some of which were wrong, prefetching
   data that had just been loaded.

Change-Id: I17d8accc2abf3a9b2050603f859fce588a1f7178
2024-02-08 09:44:19 +00:00
James Zern e32f9d4139 configure: remove profile from CONFIG_LIST
CONFIG_PROFILE is unused currently. The option can still be selected
because it is in the CMDLINE_SELECT list and interpreted by configure
directly.

Bug: webm:1835
Change-Id: Id9667289113335a10018803f578b255967bd60b1
2024-02-08 02:49:44 +00:00
James Zern 8408251f47 README,cosmetics: break some long lines / fix whitespace
+ normalize configure commands

Change-Id: Id21ebca85e7e8e4df128e986d4f4ec33c7f1483f
2024-02-07 19:24:45 +00:00
Jonathan Wright 96b64eaac5 Refactor Neon highbd_convolve8 kernels
Move narrowing shift and max value clipping into the 4-pixel-output
kernel. As well as cleaning up the code quite a bit, this also
improves performance by 5-10% as it eliminates the implied top /
bottom register shuffling of the previous approach.

Also clean up the formatting and magic numbers in the 8-pixel-output
kernel.

Change-Id: I77a5e9e317ef4097f187330d4b32973022ba573f
2024-02-07 00:35:02 +00:00
Jonathan Wright 18bc7ffe59 Optimize vpx_highbd_convolve8_horiz_avg_neon
Avoid transposes before and after convolution kernels by using extra
loads.

Change-Id: Iddee4752d7f9ed644502176ed863742fa77fe5a6
2024-02-07 00:34:19 +00:00
Jonathan Wright 04c8813a2c Optimize vpx_highbd_convolve8_horiz_neon
Avoid transposes before and after convolution kernels by using extra
loads.

Change-Id: I20c622fcb208e83534d604af50a58ba5ac472264
2024-02-07 00:33:55 +00:00
Wan-Teh Chang a0f3eb8ce4 Delete a useless clamp(q, x, y) call
In https://chromium-review.googlesource.com/c/webm/libvpx/+/71356, the
statement
  clamp(q, active_best_quality, active_worst_quality);
was added to rc_pick_q_and_bounds_two_pass() (recently renamed
vp9_rc_pick_q_and_bounds_two_pass()).

The result of the clamp() call is not used, so the clamp() call has no
side effect.

Fix Coverity CID 1577645 Useless call:
  side_effect_free: Calling
  clamp(q, active_best_quality, active_worst_quality) is only useful for
  its return value, which is ignored.

Change-Id: I014c3e4caf2bc999fe480000acc4e49e7ad15aaf
2024-02-06 23:31:27 +00:00
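Why the clamp() call above was useless can be shown with a side-effect-free clamp. This is a sketch under assumptions: `clamp_int`, `pick_q_discarded`, and `pick_q_assigned` are hypothetical names, and the assigned variant is included only for contrast (the commit deletes the call; it does not start using the result).

```c
#include <assert.h>

/* A pure clamp: returns the clamped value and modifies nothing. */
static int clamp_int(int value, int low, int high) {
  return value < low ? low : (value > high ? high : value);
}

/* The pattern flagged by Coverity: the return value is discarded,
 * so q leaves the function exactly as it entered. */
static int pick_q_discarded(int q, int best, int worst) {
  (void)clamp_int(q, best, worst); /* no effect on q */
  return q;
}

/* For contrast only: the result is stored, so q is really clamped. */
static int pick_q_assigned(int q, int best, int worst) {
  q = clamp_int(q, best, worst);
  return q;
}
```

Because the discarded call never changed q, deleting it preserves the encoder's existing behaviour.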
Jerome Jiang f5e1a0ab7e Include headers to fix clang-tidy complaints
Change-Id: I7fd2a10b4775e7e7fca49339832c257d84d99e33
2024-02-06 22:26:21 +00:00
Jonathan Wright 6c00356485 Refactor vpx_highbd_convolve8_avg_vert_neon
Various bits of tidying up to make the code more compact:
- Use appropriate load/store helper functions from mem_neon.h.
- Remove variable forward declarations.
- Use != 0 instead of > 0 in loop termination tests.
- Remove excessive prefetches.

Change-Id: I114cf4d2a34f02acc130558d125d2c191c6c5992
2024-02-06 21:24:27 +00:00
Jonathan Wright 01edfb3df4 Refactor vpx_highbd_convolve8_vert_neon
Various bits of tidying up to make the code more compact:
- Use/create appropriate mem_neon.h load/store helper functions.
- Remove variable forward declarations.
- Use != 0 instead of > 0 in loop termination tests.
- Remove excessive prefetches.

Change-Id: Ida7d3c4a3fe084600417f196baa26501c6e2d45a
2024-02-06 21:24:27 +00:00
Jonathan Wright de7883604f Init using 0-vector instead of load-broadcast in mem_neon.h
Initialise result vectors of mem_neon.h helpers with vdup_n_<type>(0)
instead of load-broadcast of the first loaded elements. The former is
more easily optimized by modern compilers.

Change-Id: If967e2bb55523670c3e433dd66d060665e13b4f2
2024-02-05 21:40:36 +00:00
Jonathan Wright a7a853c3a2 Remove stride width == 4 tests in mem_neon.h helpers
This condition is only ever true in unit tests. It does not benefit
real usage scenarios.

Change-Id: I0c1b09b0b371cfe99ba1e26aba57740a67434070
2024-02-05 21:40:36 +00:00
Jonathan Wright 4084250ccd Align intermediate buffers for 2D Neon convolutions
Align the intermediate buffers to 32 bytes and always use a stride of
64, regardless of the actual data block width.

Change-Id: I738eaa711168bc8231d8ac54d9e5e5e87b62e703
2024-02-05 21:40:36 +00:00
Zoltan Kuscsik c6a8fa27b7 Added documentation on PGO for optimization analysis
tools/README.pgo.md: documentation added

Bug: webm:1835
Change-Id: Iad72ca63fd143a1c36c7347f723578d11158e81b
2024-02-05 21:12:01 +00:00
Zoltan Kuscsik 7eec109a83 Add profile guided optimization support
Tested on x64/ARM64.

To generate a new profile:
$ export CC=clang
$ export CXX=clang++
$ ./libvpx/configure --enable-encode_perf_tests --enable-profile

Using the profile:
$ make clean
$ llvm-profdata merge -o perf.profdata default_xxx_0.profraw
$ ./libvpx/configure --use-profile=perf.profdata

Bug:webm:1835
Change-Id: I8ab53fef1f8e2cc98c3b0f5c0f50eece5466965d
2024-02-05 21:12:01 +00:00
Gerda Zsejke More 58fb0f1d27 Add SVE implementation of vp9_block_error_fp
Add SVE implementation of vp9_block_error_fp function. Add the
corresponding tests as well.

Change-Id: I81f4b11bd2f1d0b9f377553bb9298d735308da30
2024-02-05 21:09:18 +00:00
Gerda Zsejke More a9d91d7a0a Add SVE implementation of vp9_block_error function
Add SVE implementation of vp9_block_error function. Add the
corresponding tests as well.

Change-Id: Iebba73ee845855be939e120326e1005237230c2a
2024-02-05 21:09:18 +00:00
Gerda Zsejke More 075569f3a5 Add SVE implementation of vpx_sum_squares_2d_i16
Add SVE implementation of vpx_sum_squares_2d_i16 function. Add the
corresponding test as well.

Change-Id: If3b31c9882e2b7bed0106011efb0bb5522de7008
2024-02-05 21:09:18 +00:00
Jerome Jiang 1258773dc2 Ext rc: Remove max_frame_size from frame decision
Add rdmult to the frame decision as RC can return this information, and
we may want to use it in the future.

Bug: b/323234722
Change-Id: I8ddb7038073d89af1ef84932448b1abaf1937cee
2024-02-05 14:21:29 +00:00
Wan-Teh Chang a9bd789d24 Delete #if USE_PARTIAL_COPY code
The USE_PARTIAL_COPY macro was added in
https://chromium-review.googlesource.com/c/webm/libvpx/+/51505 and
the location of #if USE_PARTIAL_COPY was slightly adjusted in
https://chromium-review.googlesource.com/c/webm/libvpx/+/73600.

Delete the unused function vp9_copy_and_extend_frame_with_rect().

Change-Id: I160b312177ba2fabbea2638172af37f8144d60b1
2024-02-02 18:16:09 +00:00
James Zern fecaf72c30 vp9_scale_and_extend_frame_ssse3: fix uv width/height
Use uv_crop_(width|height). This fixes an issue with 1 to 2 scaling from
1x1 where the unrounded value would go to zero, resulting in a heap
overflow. This path is only executed when the library is built without
--enable-vp9-highbitdepth.

Bug: b:319964497
Change-Id: I9cb6632f864ec54c045608af86aede20657d6253
(cherry picked from commit 7ad5f4f695)
2024-02-02 01:35:18 +00:00
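The 1x1 edge case mentioned in the commit above comes down to how a subsampled chroma dimension is derived. The sketch below is illustrative only: the helper names are hypothetical and this is not the code in vp9_scale_and_extend_frame_ssse3. It assumes 4:2:0 style subsampling by one bit.

```c
#include <assert.h>

/* Unrounded chroma dimension: plain right shift by the subsampling
 * factor. A luma dimension of 1 collapses to 0. */
static int chroma_dim_unrounded(int luma_dim, int subsampling) {
  return luma_dim >> subsampling;
}

/* Rounded-up chroma dimension: every luma sample is still covered
 * by at least one chroma sample. */
static int chroma_dim_rounded(int luma_dim, int subsampling) {
  return (luma_dim + subsampling) >> subsampling;
}
```

A zero width or height fed into a scaling loop is the kind of value that can lead to out-of-bounds accesses, which is consistent with the heap overflow the commit reports.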
James Zern d10bdbcc40 encode_api_test,RandomPixelsVp8: fix stack overflow
Observed when built using Visual Studio 2019.

Move 720P image allocation to the heap.

Bug: webm:1831
Change-Id: I4e343af08d2f282618ad1b328a39d7dba5e79654
(cherry picked from commit 43e1c8bf10)
2024-02-02 01:35:04 +00:00
Marco Paniconi 4f94206a53 vp8: Fix to integer division by zero and overflow
This can happen in the setting of the frame
target size for delta frames, for non-CBR mode
(end_usage != USAGE_STREAM_FROM_SERVER) and with
temporal layers.

In calc_pframe_target_size(): the percent_high
(factor to adjust the target_size) may end up dividing
bits_off_target by total_byte_count. The total_byte_count
is define per layer for temporal layers, so it will be zero
for delta frames if the enhancement layer has never been
encoded before.

Since percent_high is capped to over_shoot_pct, the proposed
fix is to apply this cap if total_byte_count is zero.
Also this CL fixes a few integer overflow issues in setting
the layer target_bandwidth, the rescale function, and in
setting target_bits_per_mb.

Unittest is added by Wan-Teh which triggers this issue.

Bug: chromium:1514684

Change-Id: I091158e720ece75d7ab9b7c4d18d30a5783102ab
(cherry picked from commit 43bd567950)
2024-02-02 01:34:49 +00:00
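The guard described in the commit above can be sketched as follows. This is a hedged illustration, not the vp8 source: `percent_high_guarded` is a hypothetical helper, and the ratio formula inside it is an assumption chosen only to show a division by total_byte_count. What is taken from the commit is the idea that the cap (over_shoot_pct) is used directly when the divisor is zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute an adjustment percentage that is
 * capped at over_shoot_pct, without dividing by zero. */
static int percent_high_guarded(int64_t bits_off_target,
                                int64_t total_byte_count,
                                int over_shoot_pct) {
  int64_t pct;
  if (total_byte_count == 0) {
    /* Layer never encoded yet: apply the cap instead of dividing. */
    return over_shoot_pct;
  }
  /* Illustrative ratio only; the real expression may differ. */
  pct = (100 * bits_off_target) / (total_byte_count * 8);
  if (pct > over_shoot_pct) pct = over_shoot_pct;
  if (pct < 0) pct = 0;
  return (int)pct;
}
```

The 64-bit intermediate also keeps the multiplication from overflowing a 32-bit int, which is in the spirit of the other overflow fixes the commit mentions.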
Marco Paniconi 9b913654e8 Fix to integer overflow in vp8 encodeframe.c
Unit test added.

Bug:webm:1831

Change-Id: Ib85f4f0fbdbebc0b49555f206a36376cea687df6
(cherry picked from commit 193b151195)
2024-02-02 01:34:23 +00:00
Wan-Teh Chang 105bc8ff18 Make encoder know frame size increase from config
Equivalent to the change to av1_change_config() in the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/182413.

Because we call alloc_compressor_data() only if
cm->mi_alloc_size < new_mi_size, this change won't cause
alloc_compressor_data() to be called unnecessarily, unlike the libaom
bug https://crbug.com/aomedia/3526.

Bug: b:317105128
Change-Id: I8a772a1d5c4766846641a6d541a6d861bf76c60f
(cherry picked from commit aef73b22cb)
2024-02-02 01:26:45 +00:00
Jonathan Wright c35f3e9e35 Cosmetic: Refactor Arm Neon i8mm convolution functions
Tweak some comments and remove forward declarations.

Change-Id: I6eb01621cee838f29981853ee1ef615947e05563
2024-02-02 00:24:06 +00:00
Jonathan Wright 224f2dc82a Refactor Arm Neon DotProd convolution functions
This change was intended to be cosmetic in that it tweaks some
comments, removes forward declarations and moves some constant
declarations into the kernels where they're used. However, it also
adds some performance for 8-tap vertical convolution paths as it
appears removing forward declarations also removes some false loop-
carried dependencies that the compiler wasn't able to figure out.

Change-Id: Ic58658b10fbe8378062920199819359d2df008de
2024-02-02 00:24:06 +00:00
Jerome Jiang 3b1039c822 Rewrite ext RC test
The updated test will validate the QP / frame type / ARF settings by the
rate controller and callbacks, making sure the callbacks are working as
expected.

Removed the old tests that verify the signals from the encoder, which
are not needed any more.

Change-Id: Ida3c484e2ac520f3e81358d7cbf7918abfdaca54
2024-02-01 12:29:35 -05:00
Jerome Jiang 9aefcb317a Ext RC: remove gop_info parameter
This is not used any more.

Bug: b/323234722

Change-Id: I74bbed38a4a23f2aec8e05413754565e67437e9e
2024-01-31 21:08:07 -05:00
Jerome Jiang 91bc8ec56a vp9: Set VPX_FRAME_IS_INVISIBLE for no show frame
Detect if the frame is arf without parsing bitstream.

Change-Id: I3dd70369ef156508624c45591302b682b1785fa8
2024-01-31 21:07:07 -05:00
Jerome Jiang 861981b135 Allow external RC to control key frame
Disable some tests because they rely on vpx_rc_gop_info_t
which isn't populated when the callback is used for key frame

This parameter will be deleted / cleaned up in the follow-up.

Bug: b/323050877
Change-Id: If1c0476eac8d324c8d5a460bfc9afdb6d93aacdf
2024-01-31 22:17:06 +00:00
Jerome Jiang 56b67113d0 Move vp9_estimate_qp_gop to vp9_tpl_model.c
vp9_estimate_qp_gop is only used for TPL
Rename to vp9_estimate_tpl_qp_gop

Change-Id: I87246f72e90174bf3ba3bf8e1061f5f31edfddff
2024-01-30 19:49:31 -05:00
Jerome Jiang 8630b18323 Fix gf group index used in TPL pass for WebM RC
Change-Id: Ia781caf2de550015822ef5c954b36314ba8a2942
2024-01-30 23:19:00 +00:00
James Zern 7ad5f4f695 vp9_scale_and_extend_frame_ssse3: fix uv width/height
Use uv_crop_(width|height). This fixes an issue with 1 to 2 scaling from
1x1 where the unrounded value would go to zero, resulting in a heap
overflow. This path is only executed when the library is built without
--enable-vp9-highbitdepth.

Bug: b:319964497
Change-Id: I9cb6632f864ec54c045608af86aede20657d6253
2024-01-30 20:50:56 +00:00
James Zern ef09c2e1a4 vp9_encoder.c: make vp9_svc_twostage_scale static
Change-Id: Id7fa33df3f45e46968869ec15a6892c79aac263c
2024-01-30 20:50:33 +00:00
James Zern adac06ace7 vp9_scale_references: condense hbd #if
Change-Id: I6f1d85885b6bdced3925f86da0a60421a6058e91
2024-01-30 20:50:16 +00:00
Jonathan Wright 6cf6e1f082 Simplify Armv8.4 DotProd correction constant computation
Simplify the computation of the Armv8.4 DotProd convolution
correction constant. Summing 128 * filter_tap[0,7] is always the same
as 128 * 128 since the filter taps always sum to 128.

Change-Id: I227ba47ae47bed8304a695a2395bcc85f33c245c
2024-01-30 11:02:37 +00:00
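The identity behind the simplified correction constant above can be verified numerically. The sketch assumes the usual reason such a constant exists: pixels are shifted by 128 into the signed 8-bit range before a signed dot product, and the shift is undone by adding 128 times the sum of the taps. All function names are hypothetical, and the tap values are just an example set that sums to 128.

```c
#include <assert.h>
#include <stdint.h>

/* Correction computed the long way: sum of 128 * tap over all 8 taps. */
static int32_t correction_from_taps(const int16_t taps[8]) {
  int32_t sum = 0;
  for (int i = 0; i < 8; ++i) sum += 128 * taps[i];
  return sum;
}

/* Reference convolution on unsigned pixels. */
static int32_t convolve_unsigned(const uint8_t px[8], const int16_t taps[8]) {
  int32_t sum = 0;
  for (int i = 0; i < 8; ++i) sum += (int32_t)px[i] * taps[i];
  return sum;
}

/* Convolution on pixels shifted into the signed range, followed by the
 * constant correction 128 * 128 (valid when the taps sum to 128). */
static int32_t convolve_shifted(const uint8_t px[8], const int16_t taps[8]) {
  int32_t sum = 0;
  for (int i = 0; i < 8; ++i) sum += ((int32_t)px[i] - 128) * taps[i];
  return sum + 128 * 128;
}
```

Since the taps always sum to 128, the per-filter summation is redundant and the constant 16384 can be used directly.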
Jonathan Wright fd6b80b153 Move Neon dotprod and i8mm convolution kernels into .c files
Move the convolution kernels using Armv8.4 dotprod and Armv8.6 i8mm
instructions into the respective .c files. These kernels are only used
in the respective .c files so it isn't useful for them to be declared
in a header.

This change also removes the need for feature-macro guarding - which
wasn't being done correctly for MSVC (since Microsoft's Arm
architecture feature macros are named differently to those defined by
GNU-compliant compilers.)

Bug: webm:1838
Change-Id: I495fca2a982c34978b6c9102f144bb9c45352a9a
2024-01-29 15:46:06 +00:00
Jonathan Wright 189c135d5d Merge Arm Neon dotprod and i8mm convolution files
Move the Arm Neon dotprod and i8mm 2D convolution functions into the
appropriate vpx_convolve8_neon_[dotprod|i8mm].c file. Only the
Armv7/Armv8.0 Neon files needed to be split in this way to allow
linking against a handwritten assembly implementation of the kernels
for Armv7 builds.

Change-Id: Ifc363556c3961aa78b9e53761537d4816c5b9964
2024-01-29 15:43:50 +00:00
James Zern 433577ae31 Update third_party/libwebm to commit affd7f4
This is one commit after the libwebm-1.0.0.31 tag:
affd7f4 In MakeUID(), call rand() under #ifdef _WIN32

Change-Id: I5979a8cd3b064d4f4f0dbeca9f84f6791e593b47
2024-01-27 03:02:10 +00:00
Marco Paniconi 2edd69749f Pass the aligned width/height in lookahead_push
Also use the crop_width/height for the setting
of larger_dimensions.

Change-Id: I6b3a3e49944b17f7b51f0705d7a95c2a43255f8c
2024-01-26 20:19:40 +00:00
Jonathan Wright fed0dfe965 Allow SVE variance functions to be called from Neon subpel var
Call indirect RTCD high bitdepth variance functions (instead of the
Neon functions) in the high bitdepth Neon subpel variance paths so
that faster SVE variance functions can be used on CPUs where SVE is
supported.

Change-Id: I04bdef235afac06f2100df0cbaccfb8caef41ac7
2024-01-26 00:16:50 +00:00
Gerda Zsejke More 33ef1caf2f Add SVE implementation of HBD get<w>x<h>var functions
Add SVE implementation of get<w>x<h>var functions for 8-, 10-, 12- bit
depth. Add the corresponding tests as well.

Change-Id: Id4feb8726a3eb0a963e3dd8932ee52374a67da48
2024-01-25 20:58:29 +00:00
Gerda Zsejke More 0e23256348 Enable HBDGetVariance test for different implementations
Enable HBDGetVariance test for Neon and SSE2 implementations.

Change-Id: I77dcf0243784e79b21d956f3899903e2e13a545a
2024-01-25 20:58:29 +00:00
Gerda Zsejke More c43ec846f3 Enable GetVariance test for different implementations
Enable GetVariance test for Neon, Neon Dotprod, SSE2, MSA, VSX
implementations.

Change-Id: Ia6f42af5ef99ad1bc6319cdb46e6bd164f7eea94
2024-01-25 20:58:29 +00:00
Gerda Zsejke More 95d0fcae01 Add unit tests for vpx_get<w>x<h>var functions
Add standard and high bitdepth unit tests for vpx_get<w>x<h>var
functions. Enable these unit tests for the C implementation.

Change-Id: I8716fd6a9718dab3eef218a8a60a1efd4c0e316c
2024-01-25 20:58:29 +00:00
Wan-Teh Chang 989c393b2b Initialize members in VP8/VP9RateControlRTC ctors
Fix Coverity defects CID 1568604 and CID 1568615 (Uninitialized pointer
field). Since the constructors are private and the Create() factory
methods set the cpi_ pointer field, these two Coverity defects are
harmless.

Define the constructors with "= default" instead of "{}".

Change-Id: Ie6b45fce66c23941a9a5c38ee0bccbc4b7d3a2a2
2024-01-24 14:04:24 -08:00
Gerda Zsejke More d84436d533 Add SVE implementation of HBD variance functions
Add SVE implementation of variance functions for 8-, 10-, 12- bit
depth. Add the corresponding tests as well.

Change-Id: I785d85760ad4346cbfbf0f842784b4945870afee
2024-01-22 22:57:55 +00:00
James Zern 0f6a274964 vp9_ratectrl.c: add missing include for INTER_LAYER_PRED_ON
clears a clang-tidy warning

Change-Id: Ie5b8825ef645658304252b0d0554cdefa3de26c2
2024-01-22 20:21:06 +00:00
James Zern 43e1c8bf10 encode_api_test,RandomPixelsVp8: fix stack overflow
Observed when built using Visual Studio 2019.

Move 720P image allocation to the heap.

Bug: webm:1831
Change-Id: I4e343af08d2f282618ad1b328a39d7dba5e79654
2024-01-22 18:04:03 +00:00
Wan-Teh Chang eeb1be7f23 Support VPX_IMG_FMT_NV12 in vpx_img_read/write
read_yuv_frame() supports VPX_IMG_FMT_NV12. Port its code to
vpx_img_read() and vpx_img_write().

The code in vp9/simple_encode.cc, including img_read(), doesn't support
VPX_IMG_FMT_NV12. Check before the vpx_img_alloc() calls and abort the
process if the image format is VPX_IMG_FMT_NV12.

Bug: chromium:1510090
Change-Id: Ie77e29c2c9ee7a01e6a59c8ad3cbcc769d9f2d4c
2024-01-19 11:48:46 -08:00
Wan-Teh Chang d3a946de8c Make img_alloc_helper() fail on VPX_IMG_FMT_NONE
If fmt is VPX_IMG_FMT_NONE, currently img_alloc_helper() allocates a
single plane because VPX_IMG_FMT_NONE (0) is not a planar format (the
VPX_IMG_FMT_PLANAR bit is not set in VPX_IMG_FMT_NONE).

Although this seems correct, the problem is that most of the code in
libvpx assumes planar formats and is likely to dereference a null
pointer when it uses img->planes[1]. Also, VPX_IMG_FMT_NONE isn't really
a valid image format. So it is safer to make img_alloc_helper() fail if
fmt is VPX_IMG_FMT_NONE.

Change-Id: I05b47f4b5eceb631a02384b2cce1c2f6fdca8673
2024-01-19 19:43:26 +00:00
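The planar-bit reasoning in the commit above can be sketched with stand-in constants. Everything prefixed `EX_` or `ex_` is hypothetical and defined locally for illustration; the values are modelled on the commit's description (a planar flag bit that is not set in the zero-valued "none" format) and are not copied from vpx_image.h.

```c
#include <assert.h>

/* Illustrative flag bit marking planar formats. */
#define EX_FMT_PLANAR 0x100

enum ex_img_fmt {
  EX_FMT_NONE = 0,                 /* no flag bits set */
  EX_FMT_I420 = EX_FMT_PLANAR | 2  /* an example planar format */
};

/* A format is planar only if the planar bit is set. */
static int ex_is_planar(enum ex_img_fmt fmt) {
  return (fmt & EX_FMT_PLANAR) != 0;
}

/* Validation in the spirit of the commit: reject the "none" format up
 * front instead of treating it as a single-plane image.
 * Returns 0 if the format is acceptable, -1 otherwise. */
static int ex_validate_fmt(enum ex_img_fmt fmt) {
  if (fmt == EX_FMT_NONE) return -1;
  return 0;
}
```

Rejecting the format early means later code that reads a second plane never sees an image whose extra plane pointers were left null.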
James Zern db6a5c09ce README: remove library version
This often falls out of sync with the release and the version is already
contained in CHANGELOG.

Bug: webm:1833
Change-Id: Ieee6ca40249bf6e77037fbec30d87b109ca8fe21
2024-01-18 15:44:19 -08:00
Jerome Jiang 2dead7118a Merge tag 'v1.14.0'
Release v1.14.0 Venetian Duck

2024-01-18 v1.14.0 "Venetian Duck"

  This release drops support for old C compilers, such as Visual Studio 2012
  and older, that disallow mixing variable declarations and statements (a C99
  feature). It adds support for run-time CPU feature detection for Arm
  platforms, as well as support for darwin23 (macOS 14).

  - Upgrading:
    This release is ABI incompatible with the previous release.

    Various new features for rate control library for real-time: SVC parallel
    encoding, loopfilter level, support for frame dropping, and screen content.

    New callback function send_tpl_gop_stats for vp9 external rate control
    library, which can be used to transmit TPL stats for a group of pictures. A
    public header vpx_tpl.h is added for the definition of TPL stats used in
    this callback.

    libwebm is upgraded to libwebm-1.0.0.29-9-g1930e3c.

  - Enhancement:
    Improvements on Neon optimizations: VoD: 12-35% speed up for bitdepth 8,
    68%-151% speed up for high bitdepth.
    Improvements on AVX2 and SSE optimizations.
    Improvements on LSX optimizations for LoongArch.
    42-49% speedup on speed 0 VoD encoding.
    Android API level predicates.

  - Bug fixes:
    Fix to missing prototypes from the rtcd header.
    Fix to segfault when total size is enlarged but width is smaller.
    Fix to the build for arm64ec using MSVC.
    Fix to copy BLOCK_8X8's mi to PICK_MODE_CONTEXT::mic.
    Fix to -Wshadow warnings.
    Fix to heap overflow in vpx_get4x4sse_cs_neon.
    Fix to buffer overrun in highbd Neon subpel variance filters.
    Added bitexact encode test script.
    Fix to -Wl,-z,defs with Clang's sanitizers.
    Fix to decoder stability after error & continued decoding.
    Fix to mismatch of VP9 encode with NEON intrinsics with C only version.
    Fix to Arm64 MSVC compile vpx_highbd_fdct4x4_neon.
    Fix to fragments count before use.
    Fix to a case where target bandwidth is 0 for SVC.
    Fix mask in vp9_quantize_avx2,highbd_get_max_lane_eob.
    Fix to int overflow in vp9_calc_pframe_target_size_one_pass_cbr.
    Fix to integer overflow in vp8,ratectrl.c.
    Fix to integer overflow in vp9 svc.
    Fix to avg_frame_bandwidth overflow.
    Fix to per frame qp for temporal layers.
    Fix to unsigned integer overflow in sse computation.
    Fix to uninitialized mesh feature for BEST mode.
    Fix to overflow in highbd temporal_filter.
    Fix to unaligned loads w/w==4 in vpx_convolve_copy_neon.
    Skip arm64_neon.h workaround w/VS >= 2019.
    Fix to c vs avx mismatch of diamond_search_sad().
    Fix to c vs intrinsic mismatch of vpx_hadamard_32x32() function.
    Fix to a bug in vpx_hadamard_32x32_neon().
    Fix to Clang -Wunreachable-code-aggressive warnings.
    Fix to a bug in vpx_highbd_hadamard_32x32_neon().
    Fix to -Wunreachable-code in mfqe_partition.
    Force mode search on 64x64 if no mode is selected.
    Fix to ubsan failure caused by left shift of negative.
    Fix to integer overflow in calc_pframe_target_size.
    Fix to float-cast-overflow in vp8_change_config().
    Fix to a null ptr before use.
    Conditionally skip using inter frames in speed features.
    Remove invalid reference frames.
    Disable intra mode search speed features conditionally.
    Set nonrd keyframe under dynamic change of deadline for rtc.
    Fix to scaled reference offsets.
    Set skip_recode=0 in nonrd_pick_sb_modes.
    Fix to an edge case when downsizing to one.
    Fix to a bug in frame scaling.
    Fix to pred buffer stride.
    Fix to a bug in simple motion search.
    Update frame size in actual encoding.

Change-Id: I9c27fb2b917f9b80ed4bcc5cb3b4f87c56b62c2f
2024-01-18 12:33:56 -05:00
Gerda Zsejke More 71a5cb6e8a Add SVE implementation of HBD MSE functions
Add SVE implementation of MSE functions for 10-, 12- bit depth. Add
the corresponding tests as well.

An implementation was not added for 8 bit depth as the Neon DotProd
version is faster than the SVE implementation.

Change-Id: I0c5712ba2735a2879a0aa3a9a52980032fddc7a6
2024-01-17 21:38:54 +00:00
Marco Paniconi b95d175726 vp9-rtc: Fix to reset on scene change for temporal layers
Revert the part of the fix regarding temporal layers in:
https://chromium-review.googlesource.com/c/webm/libvpx/+/5191480

Keep it as it was for now until further testing.

Change-Id: If747aabf907ba93cc20bcc067d2ca8f7758a91dd
2024-01-17 18:33:32 +00:00
Gerda Zsejke More e001eeb5bc Enable Neon Dotprod impl for HBD MSE
Enable Neon Dotprod 8-bit high bitdepth implementation for MSE
function as it is now not called with bit depth 10 or 12.

Bug: webm:1819
Change-Id: I9d1d506401aa0523fba2d8ea4978dc00fdacbb95
2024-01-17 18:15:17 +00:00
Gerda Zsejke More 41e0655e5e Fix highbd_get_block_variance_fn input parameter
Instead of always calling highbd_get_block_variance_fn with bit depth
8 use the macroblock's bit depth.

Bug: webm:1819
Change-Id: Ib4b19703384e897ee9ffeef73a11a8af2d262558
2024-01-17 18:15:17 +00:00
Marco Paniconi 25f03e456f vp9-svc: Fix to sample encoder for mismatch check
Don't check for mismatch for superframes whose
top spatial layer resolution was dropped.

Change-Id: I0abef43a710f0fb52ba2490fc784e57cda9952a0
2024-01-16 19:49:06 +00:00
Marco Paniconi cc306fac74 vp9-svc: Fix to max-q on scene change for svc
For svc with no inter-layer prediction: reset
the RC and force max_qp on all spatial layers
on scene/slide changes. In the current code it was only
reset on current spatial layer because it was assumed
we can predict off lower spatial layer to avoid
prediction across scene change. But this does not apply
when inter-layer prediction is off on delta frames.

Also reset only up to current temporal layer.
Because of the hierarchical prediction structure
only the lower temporal layers need the RC to be reset.

This helps to reduce excessive frame drops for the
full_superframe_drop mode.

Change-Id: I76925681850b82aa7fff7f9b1c1a0a605cf3cf3b
2024-01-16 15:49:10 +00:00
Jerome Jiang 8aeb5848a5 Do not use TPL QP from RC for final encoding
Bug: b/316610379
Change-Id: Ie7c6f8be0132602155102a72a16b2ee94c1c3dbd
2024-01-12 21:26:42 +00:00
James Zern 0eecce72b2 vp9_quantize.c: add missing include for get_msb()
clears a clang-tidy warning

Change-Id: Iaf58775084e758246a8fe0a4828ae954ea95f5b1
2024-01-11 16:33:53 -08:00
James Zern 0a91e18eca vp8_datarate_test.cc: add missing include
for VPX_CODEC_USE_PSNR. This clears a clang-tidy warning. vpx_encoder.h
exports vpx_codec.h so it shouldn't be necessary.

Change-Id: I863b6f8689eeef59cd9eadf3cdc177247a0653f8
2024-01-11 16:30:21 -08:00
Marco Paniconi 43bd567950 vp8: Fix to integer division by zero and overflow
This can happen in the setting of the frame
target size for delta frames, for non-CBR mode
(end_usage != USAGE_STREAM_FROM_SERVER) and with
temporal layers.

In calc_pframe_target_size(): the percent_high
(factor to adjust the target_size) may end up dividing
bits_off_target by total_byte_count. The total_byte_count
is defined per layer for temporal layers, so it will be zero
for delta frames if the enhancement layer has never been
encoded before.

Since percent_high is capped to over_shoot_pct, the proposed
fix is to apply this cap if total_byte_count is zero.
Also this CL fixes a few integer overflow issues in setting
the layer target_bandwidth, the rescale function, and in
setting target_bits_per_mb.

Unittest is added by Wan-Teh which triggers this issue.

Bug: chromium:1514684

Change-Id: I091158e720ece75d7ab9b7c4d18d30a5783102ab
2024-01-11 23:05:47 +00:00
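The guard described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the libvpx code: the function name percent_high_sketch, its parameters, and the exact ratio are assumptions made for the example. Only the idea of applying the over_shoot_pct cap when total_byte_count is zero comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the libvpx implementation. It returns a
 * percentage in [0, over_shoot_pct] and never divides by zero. */
static int percent_high_sketch(int64_t bits_off_target,
                               int64_t total_byte_count,
                               int over_shoot_pct) {
  int64_t pct;
  assert(over_shoot_pct >= 0);
  assert(total_byte_count >= 0);
  /* Nothing has been encoded for this layer yet: skip the division and
   * apply the cap directly. */
  if (total_byte_count == 0) return over_shoot_pct;
  /* Illustrative ratio only. Assumes the inputs are small enough that the
   * 64-bit products cannot overflow. */
  pct = (100 * bits_off_target) / (total_byte_count * 8);
  if (pct < 0) pct = 0;
  if (pct > over_shoot_pct) pct = over_shoot_pct;
  return (int)pct;
}
```

Doing the arithmetic in 64 bits before narrowing to int also reflects the overflow concern mentioned in the same commit, although the real fix touches other functions.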
Gerda Zsejke More aeb4928c68 Add SVE 16-bit dot-product helper functions
Add header file containing helper functions to make use of SVE
dot-product intrinsics via the Neon-SVE bridge.

Change-Id: I6cd198f8202559672817cbc19f890db35c03d3ff
2024-01-11 19:26:52 +00:00
Salome Thirot 0801bfca3f Add -flax-vector-conversions=none to Clang builds
GCC already does not allow implicit vector type conversions by default,
add -flax-vector-conversions=none to Clang builds to have the same
behavior.

Change-Id: I9d1adb836377077cf48818c80fe71025e2d2bdc7
2024-01-11 18:26:28 +00:00
Zoltan Kuscsik e03c9d2a62 Update of get_files.py to support Python3.x
Change-Id: I92aeb2a060338bdfc0083602b837b99181a8421c
2024-01-11 18:25:39 +00:00
Gerda Zsejke More 6ea3b51ec2 Require Arm Neon-SVE bridge header for enabling SVE
Disable SVE feature if arm_neon_sve_bridge header is not supported
by the compiler.

Change-Id: I3f78be2dd95b37b8d51b9f1fceca1f9701535eca
2024-01-11 18:23:46 +00:00
Marco Paniconi 756b29a776 vp8: Fix to race issue for multi-thread with pnsr_calc
Added unittest which triggers the data race in the
bug below, when only C code is forced.

The data race is between the loopfilter and variance
computation from generate_psnr_packet calculation.
Proposed fix is to move the wait for loopfilter thread to
finish up before entering generate_psnr_packet().

Bug: b/266833179.

Change-Id: Id2871c53274be0f404e65601c9a5c98aaead0c72
2024-01-11 00:07:08 +00:00
Wan-Teh Chang aef73b22cb Make encoder know frame size increase from config
Equivalent to the change to av1_change_config() in the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/182413.

Because we call alloc_compressor_data() only if
cm->mi_alloc_size < new_mi_size, this change won't cause
alloc_compressor_data() to be called unnecessarily, unlike the libaom
bug https://crbug.com/aomedia/3526.

Bug: b:317105128
Change-Id: I8a772a1d5c4766846641a6d541a6d861bf76c60f
2024-01-10 21:42:45 +00:00
Wan-Teh Chang c5f8086709 Move VPX_TPL_ABI_VERSION to the ext RC ABI version
The VpxTpl* structs defined in vpx_tpl.h are only used by the external
rate control library. Add a VPX_TPL_ABI_VERSION component to
VPX_EXT_RATECTRL_ABI_VERSION and remove the VPX_TPL_ABI_VERSION
component from VPX_ENCODER_ABI_VERSION.

The current value of VPX_TPL_ABI_VERSION is 2. It is subtracted from
VPX_EXT_RATECTRL_ABI_VERSION and added to VPX_ENCODER_ABI_VERSION so
that the values of those two macros stay the same.

Add a note to explain why VPX_ENCODER_ABI_VERSION has a
VPX_EXT_RATECTRL_ABI_VERSION component.

Change-Id: I680b8522dc04328cd51df6de590fdec75ca88ae8
2024-01-09 21:18:26 +00:00
Jerome Jiang 602e2e8979 Fix a typo in changelog for v1.14.0
Bug: webm:1833
Change-Id: I8133f678cf4231f2d048c61e42622b883897712a
2024-01-09 16:12:22 -05:00
Hari Limaye 469150a922 configure: add -arch flag when targeting darwin23
Commit db83435 introduced support for configuring for *-darwin23-gcc.
However configuring for *-darwin23-gcc does not currently add the
`-arch` flag to CFLAGS/LDFLAGS, so correct this here.

Change-Id: Ieeda1a5039ad40590dfcdcc6ba615a1d1697d54d
2024-01-09 20:38:58 +00:00
Jerome Jiang 2b1f6859f6 Update version
Before release:
c-a=8, a=0, r=1 -> c=8, a=0, r=1

After release:
 - If the library source code has changed at all since the last
   update, then increment revision:
   c=8, a=0, r=r+1=2

 - If any interfaces have been added, removed, or changed since
   the last update, increment current, and set revision to 0:
   c=c+1=9, a=0, r=0

 - If any interfaces have been added since the last public release,
   then increment age:
   c=9, a=a+1=1, r=0

 - If any interfaces have been removed or changed since the last
   public release, then set age to 0:
   c=9, a=0, r=0 (VpxTpl* structure changes)

MAJOR=c-a=9
MINOR=a=0
PATCH=r=0

Bug: webm:1833
Change-Id: Id24c9a0ff415a6f625d17b6098cdd0baf27432e3
2024-01-09 09:18:35 -05:00
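The version arithmetic in the commit above follows the libtool current/age/revision rules. A small sketch of those four steps, using hypothetical type and function names that do not exist in libvpx, might look like this:

```c
#include <assert.h>

/* Hypothetical types for illustration only. */
typedef struct { int current, age, revision; } LibtoolVersionSketch;
typedef struct { int major, minor, patch; } ReleaseVersionSketch;

/* Applies the update rules in the order listed in the commit message. */
static LibtoolVersionSketch bump_version_sketch(
    LibtoolVersionSketch v, int source_changed, int interfaces_added,
    int interfaces_removed_or_changed) {
  /* Source changed at all: increment revision. */
  if (source_changed) v.revision += 1;
  /* Any interface added, removed, or changed: increment current and
   * reset revision. */
  if (interfaces_added || interfaces_removed_or_changed) {
    v.current += 1;
    v.revision = 0;
  }
  /* Interfaces added: increment age. */
  if (interfaces_added) v.age += 1;
  /* Interfaces removed or changed: reset age. */
  if (interfaces_removed_or_changed) v.age = 0;
  return v;
}

/* MAJOR = c - a, MINOR = a, PATCH = r. */
static ReleaseVersionSketch to_release_sketch(LibtoolVersionSketch v) {
  ReleaseVersionSketch out;
  assert(v.current >= v.age);
  out.major = v.current - v.age;
  out.minor = v.age;
  out.patch = v.revision;
  return out;
}
```

Starting from c=8, a=0, r=1 with all three flags set reproduces the c=9, a=0, r=0 result quoted in the commit.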
Jerome Jiang e32df98af5 Update changelog with vp9 ext rc
Bug: webm:1833
Change-Id: I7e6e1da7965f098c8b62c55a09619d0bf703b516
2024-01-04 15:23:38 -05:00
Jerome Jiang 3b3e8b5f29 vp9 ext rc: if->assert, more comment for TPL ctrl
Change if to assertion in vp9_extrc_get_encodeframe_decision

Clarify comment for VP9E_ENABLE_EXTERNAL_RC_TPL that
rc_type | VPX_RC_QP must be non zero for this control to work.

Change-Id: I2c54cf7eda1f0f12f4ff7ac929e8e6a1fdd2215d
2024-01-04 13:49:03 -05:00
Jerome Jiang 1474e3c729 Return error if TPL related interface isn't set
Bug: b/316610379
Change-Id: I391d9ef308a1c2d763b124e451ebb22a05060102
2024-01-03 11:11:41 -05:00
Jerome Jiang 22b17dc3fb Update changelog
Bug: webm:1833
Change-Id: I90ffd457cafe705a040f9a63b870da66c076076e
2024-01-02 16:05:38 -05:00
Jerome Jiang f6b7166a2b Clear some clang-tidy complaints
Change-Id: I749c0b2b97f26923fc5e1c1e46a1c017cf25823f
2024-01-02 15:08:06 -05:00
Jerome Jiang bd78034078 Add codec ctrl to control TPL by external RC
Bug: b/316610379
Change-Id: Ic18aad8da35436b3de81b9ddf9359407da522701
2024-01-02 15:24:00 +00:00
Matt Oliver 33ecc6cc5f project: Update for 1.13.1 merge. 2023-12-24 00:07:29 +11:00
Matt Oliver a5f55b5c6b Merge commit '10b9492dcf05b652e2e4b370e205bd605d421972' 2023-12-24 00:01:10 +11:00
Zoltan Kuscsik 655da33b89 Use get_lsb in vp9 and vp8 invert_quant functions
Performance optimization. get_msb utilizes
the compiler/platform specific most significant bit
operator.

Note: 32 bit unsigned assumed, like all get_msb implementations do.
Change-Id: Ib013ad24aa0ea845efeb52aacd448b067edf91da
2023-12-21 14:23:04 +01:00
Jerome Jiang b8b6b4d4cc Remove VP9E_GET_TPL_STATS
This is never used.
A callback in external rc func was added and used instead.

Change-Id: Iade6f361072f0c28af98904baf457d2f0e9ca904
(cherry picked from commit 41ced868a6)
2023-12-18 12:23:32 -05:00
Jerome Jiang 41ced868a6 Remove VP9E_GET_TPL_STATS
This is never used.
A callback in external rc func was added and used instead.

Change-Id: Iade6f361072f0c28af98904baf457d2f0e9ca904
2023-12-16 20:12:34 +00:00
Jerome Jiang d0e2c30aa4 Update AUTHORS and .mailmap
Bug: webm:1833
Change-Id: I4569b9dc1ec1c70458120bebc45b2c963796ed87
2023-12-14 23:52:30 +00:00
Hari Limaye 0c2314d82e configure: add -arch flag when targeting darwin23
Commit db83435 introduced support for configuring for *-darwin23-gcc.
However configuring for *-darwin23-gcc does not currently add the
`-arch` flag to CFLAGS/LDFLAGS, so correct this here.

Change-Id: Ieeda1a5039ad40590dfcdcc6ba615a1d1697d54d
2023-12-14 20:23:26 +00:00
Wan-Teh Chang df655cf4fb Clarify the comment for update_error_state()
Explain why the encoder init functions cannot call update_error_state().

In vp8/vp8_cx_iface.c, this comment should have been added in
https://chromium-review.googlesource.com/c/webm/libvpx/+/4506609.

Rewrite update_error_state() in vp8/vp8_cx_iface.c to look like the
versions in vp9/vp9_cx_iface.c and av1/av1_cx_iface.c (in libaom).

Change-Id: I3f153d67b8c549ca5ac8ea0cfbcaad4ae705c8e6
2023-12-13 20:46:29 +00:00
Wan-Teh Chang 4fe07a0c41 Return correct error after longjmp in vp8e_encode
After a longjmp() call in vp8e_encode(), call update_error_state() so
that we return the error code and error detail set by the
vpx_internal_error() call.

Change-Id: I1f2428eb1b1f61e46c02604e16a5d44dcf162479
2023-12-13 10:56:28 -08:00
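The pattern in the commit above, reporting the recorded error after a longjmp() instead of a generic one, can be illustrated with a self-contained sketch. All names here (ErrorInfoSketch, internal_error_sketch, encode_sketch, SKETCH_ERROR) are hypothetical stand-ins and are not the libvpx structures or functions.

```c
#include <assert.h>
#include <setjmp.h>
#include <string.h>

enum { SKETCH_OK = 0, SKETCH_ERROR = 2 }; /* hypothetical codes */

typedef struct {
  int error_code;
  int has_detail;
  char detail[80];
  int setjmp_active;
  jmp_buf jmp;
} ErrorInfoSketch;

/* Records the code and detail, then jumps back if a handler is armed. */
static void internal_error_sketch(ErrorInfoSketch *info, int code,
                                  const char *detail) {
  info->error_code = code;
  info->has_detail = 1;
  strncpy(info->detail, detail, sizeof(info->detail) - 1);
  info->detail[sizeof(info->detail) - 1] = '\0';
  if (info->setjmp_active) longjmp(info->jmp, 1);
}

static int encode_sketch(ErrorInfoSketch *info, int should_fail) {
  assert(info != NULL);
  info->error_code = SKETCH_OK;
  info->has_detail = 0;
  if (setjmp(info->jmp)) {
    /* Reached via longjmp(): return what the failing call recorded
     * rather than a generic error. */
    info->setjmp_active = 0;
    return info->error_code;
  }
  info->setjmp_active = 1;
  if (should_fail) internal_error_sketch(info, SKETCH_ERROR, "simulated failure");
  info->setjmp_active = 0;
  return SKETCH_OK;
}
```

The caller then sees both the specific code and the detail string that were set at the point of failure.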
Marco Paniconi 193b151195 Fix to integer overflow in vp8 encodeframe.c
Unit test added.

Bug:webm:1831

Change-Id: Ib85f4f0fbdbebc0b49555f206a36376cea687df6
2023-12-12 22:49:46 +00:00
Hari Limaye a75859c439 Remove redundant comment in convolve8_4_usdot
The function convolve8_4_usdot contains a comment relating to the
SDOT implementation of convolve8, which requires addition of a
correction constant to account for range clamp of the input values.

This is not performed in the i8mm USDOT implementation - so remove the
comment.

Also add some const qualifiers to function arguments.

Change-Id: I10aff560d20403897f708ee293bf873be9c35761
2023-12-12 20:21:38 +00:00
James Zern 7be2dadc76 README: update target list
Change-Id: I001179ce34b2bf2350dce5f0197b6be175ab1c37
(cherry picked from commit f9b7c85768)
2023-12-11 21:24:10 +00:00
Wan-Teh Chang 7e735cdf43 IWYU: include vp9_scale.h and vpx_codec.h
Fix the following clang-tidy misc-include-cleaner warnings:
vp9/encoder/vp9_encoder.c:
  no header providing "vp9_is_valid_scale" is directly included
  no header providing "VPX_CODEC_CORRUPT_FRAME" is directly included
vp9/vp9_cx_iface.c:
  no header providing "valid_ref_frame_size" is directly included

Change-Id: I20e846f5b14c42c72aaefec0718b4ae9c7eea44a
2023-12-09 00:15:40 +00:00
Wan-Teh Chang 3a88c0c204 Avoid dangling pointers in vp9_encode_free_mt_data
Set cpi->tile_thr_data and cpi->workers to NULL after freeing them.

Change-Id: I46fec5f08a6dd034c8d76828f4d546630442f216
2023-12-08 14:39:18 -08:00
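As a rough illustration of the change above, the sketch below frees two buffers and clears the pointers so that a second call is harmless. The struct and function names are hypothetical; only the field names tile_thr_data and workers are taken from the commit message, and their real types in libvpx differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the encoder state. */
typedef struct {
  void *tile_thr_data;
  void *workers;
  int num_workers;
} EncoderSketch;

static void free_mt_data_sketch(EncoderSketch *cpi) {
  assert(cpi != NULL);
  free(cpi->tile_thr_data);
  cpi->tile_thr_data = NULL; /* no dangling pointer left behind */
  free(cpi->workers);
  cpi->workers = NULL;
  cpi->num_workers = 0;
}
```

Because free(NULL) is a no-op in standard C, calling the function twice does not double free.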
Cheng Chen 6bb806b177 Update frame size in actual encoding
Issue explanation:
The unit test calls set_config function twice after encoding the
first frame.
The first call of set_config reduces frame width, but is still within
half of the first frame.
The second call reduces frame width even more, making it less than
half of the first frame. According to the encoder logic, there are
then no valid ref frames, and this frame should be set as a
forced keyframe. This leads to null pointer access in scale_factors
later.

Solution:
To ensure correct detection of a forced key frame,
we need to update the frame width and height only when the actual
encoding is performed.

Bug: b/311985118

Change-Id: Ie2cd3b760d4a4b399845693d7421c4eb11a12775
(cherry picked from commit 1ed56a46b3)
2023-12-08 21:34:57 +00:00
Yunqing Wang 75d7727f58 Fix a bug in simple motion search
This change fixed a bug revealed by b/311294795.
In simple motion search, the reference buffer pointer needs to be
restored after the search. Otherwise, it causes problems while the
reference frame scaling happens. This CL fixes the bug.

Bug: b/311294795
Change-Id: I093722d5888de3cc6a6542de82a6ec9d601f897d
(cherry picked from commit 50ed636e49)
2023-12-08 19:29:46 +00:00
Jerome Jiang 36b2dec5ee Set pred buffer stride correctly
Bug: b/312875957
Change-Id: I2eb5ab86d5fe30079b3ed1cbdb8b45bb2dc72a1d
(cherry picked from commit 585798f756)
2023-12-08 19:29:46 +00:00
Bohan Li eba5ceb9d1 Improve test comments.
Change-Id: I42dddb946193e30cf07e39b43eaad051c5da479a
(cherry picked from commit 9ad598f249)
2023-12-08 02:12:17 +00:00
Marco Paniconi c884b2e60e Add unittest for issue b/314857577
Bug: b/314857577

Change-Id: I591036c1ad3362023686d395adb4783c51baa62d
(cherry picked from commit 12e928cb34)
2023-12-08 01:45:50 +00:00
Wan-Teh Chang 6fca4de48e Remove SSE code for 128x* blocks
The maximum block size is 64x64 in VP9.

Bug: webm:1819
Change-Id: If9802be9f81b51dbcdbc8a68d5afe48ca6d3d0e7
(cherry picked from commit c4c9208054)
2023-12-08 00:34:23 +00:00
Wan-Teh Chang 0d5811e4ef Use vpx_sse instead of vpx_mse to compute SSE
Use vpx_sse and vpx_highbd_sse instead of vpx_mse16x16 and
vpx_highbd_8_mse16x16 respectively to compute SSE for PSNR
calculations. This solves an issue whereby vpx_highbd_8_mse16x16
was being used to calculate SSE for 10- and 12-bit input.

This is a port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/175063
by Jonathan Wright <jonathan.wright@arm.com>.

Bug: webm:1819
Change-Id: I37e3ac72835e67ccb44ac89a4ed16df62c2169a7
(cherry picked from commit 7dfe343199)
2023-12-08 00:34:23 +00:00
James Zern c64a85d25a vp9_frame_scale.c,cosmetics: funnction -> function
Change-Id: I8ecbd52037ff096f5c84c834b193b0a34c55a8b7
(cherry picked from commit 2f258fdee1)
2023-12-07 20:45:55 +00:00
Jerome Jiang fa60c7d9c1 IWYU: include yv12config.h for YV12_BUFFER_CONFIG
Fix clang-tidy warning

Change-Id: Ic4d6739cb933a37168176f6b481afdfd2562acfc
2023-12-07 12:35:30 -05:00
Cheng Chen 1ed56a46b3 Update frame size in actual encoding
Issue explanation:
The unit test calls set_config function twice after encoding the
first frame.
The first call of set_config reduces frame width, but is still within
half of the first frame.
The second call reduces frame width even more, making it less than
half of the first frame. According to the encoder logic, there are
then no valid ref frames, and this frame should be set as a
forced keyframe. This leads to null pointer access in scale_factors
later.

Solution:
To ensure correct detection of a forced key frame,
we need to update the frame width and height only when the actual
encoding is performed.

Bug: b/311985118

Change-Id: Ie2cd3b760d4a4b399845693d7421c4eb11a12775
2023-12-06 21:46:54 -08:00
Yunqing Wang 50ed636e49 Fix a bug in simple motion search
This change fixed a bug revealed by b/311294795.
In simple motion search, the reference buffer pointer needs to be
restored after the search. Otherwise, it causes problems while the
reference frame scaling happens. This CL fixes the bug.

Bug: b/311294795
Change-Id: I093722d5888de3cc6a6542de82a6ec9d601f897d
2023-12-06 16:50:33 -08:00
Wan-Teh Chang c3b821fd4a Add the needed Android API level predicates.
fseeko and ftello are available on Android only from API level 24. Add
the needed guards for these functions.

Suggested by Yifan Yang.

Change-Id: I3a6721d31e1d961ab10b434ea6e92959bd5a70ab
(cherry picked from commit bf07554183)
2023-12-06 18:57:20 -05:00
Jerome Jiang 585798f756 Set pred buffer stride correctly
Bug: b/312875957
Change-Id: I2eb5ab86d5fe30079b3ed1cbdb8b45bb2dc72a1d
2023-12-06 23:54:31 +00:00
Yunqing Wang ebca0ab6fa Fix a bug in frame scaling
This change fixed a corner case bug revealed by b/311394513.
During the frame scaling, vpx_highbd_convolve8() and vpx_scaled_2d()
require that both x_step_q4 and y_step_q4 be less than or equal to a
defined value. Otherwise, it needs to call
vp9_scale_and_extend_frame_nonnormative(), which supports arbitrary
scaling.

The fix was done in LBD and HBD functions.

Bug: b/311394513
Change-Id: Id0d34e7910ec98859030ef968ac19331488046d4
(cherry picked from commit 8bf3649d41)
2023-12-06 16:22:35 -05:00
Bohan Li ffd533161a Fix edge case when downsizing to one.
BUG: b/310329177
Change-Id: I2ebf4165adbc7351d6cc73554827812dedc4d362
(cherry picked from commit a9f1bfdb8e)
2023-12-06 16:22:24 -05:00
Angie Chiang 5d49fa1f01 Set skip_recode=0 in nonrd_pick_sb_modes
Need to set skip_recode properly so that
vp9_encode_block_intra() can work properly when it is
called by block_rd_txfm(). We can not skip "recode" because
it is still at the rd search stage.

Bug: b/310340241
Change-Id: I7d7600ef72addd341636549c2dad1868ad90e1cb
(cherry picked from commit f10481dc0a)
2023-12-06 16:22:10 -05:00
James Zern 2f258fdee1 vp9_frame_scale.c,cosmetics: funnction -> function
Change-Id: I8ecbd52037ff096f5c84c834b193b0a34c55a8b7
2023-12-06 20:45:12 +00:00
Wan-Teh Chang 476d02a2d2 Fix two clang-tidy misc-include-cleaner warnings
no header providing "CONFIG_VP9_HIGHBITDEPTH" is directly included
no header providing "VPX_BITS_8" is directly included

Change-Id: Ie6d78c79ab462501417f2b451bbe808a1fdce931
2023-12-06 19:17:08 +00:00
James Zern f9b7c85768 README: update target list
Change-Id: I001179ce34b2bf2350dce5f0197b6be175ab1c37
2023-12-06 19:11:27 +00:00
Bohan Li 6f1001a894 Fix scaled reference offsets.
Since the reference frame is already scaled, do not scale the offsets.

BUG: b/311489136, b/312656387
Change-Id: Ib346242e7ec8c4d3ed26668fa4094271218278ed
(cherry picked from commit 845a817c05)
2023-12-06 14:01:33 -05:00
Gerda Zsejke More a05cfd672d configure: Add darwin23 support
Add target arm64-darwin23-gcc, x86_64-darwin23-gcc for MacOS 14.

Change-Id: I6b68a6a61d51aaa78ec11a5055bb95ce77a81d9c
(cherry picked from commit db83435afb)
2023-12-06 13:57:50 -05:00
Wan-Teh Chang c4c9208054 Remove SSE code for 128x* blocks
The maximum block size is 64x64 in VP9.

Bug: webm:1819
Change-Id: If9802be9f81b51dbcdbc8a68d5afe48ca6d3d0e7
2023-12-05 14:29:37 -08:00
Wan-Teh Chang 7dfe343199 Use vpx_sse instead of vpx_mse to compute SSE
Use vpx_sse and vpx_highbd_sse instead of vpx_mse16x16 and
vpx_highbd_8_mse16x16 respectively to compute SSE for PSNR
calculations. This solves an issue whereby vpx_highbd_8_mse16x16
was being used to calculate SSE for 10- and 12-bit input.

This is a port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/175063
by Jonathan Wright <jonathan.wright@arm.com>.

Bug: webm:1819
Change-Id: I37e3ac72835e67ccb44ac89a4ed16df62c2169a7
2023-12-05 22:13:43 +00:00
Jerome Jiang 4c2435c33e Fix several clang-tidy complaints
Change-Id: I78721d6b7ed692ad9363b5cac4e3324a3136d5b6
2023-12-05 22:08:56 +00:00
Marco Paniconi 12e928cb34 Add unittest for issue b/314857577
Bug: b/314857577

Change-Id: I591036c1ad3362023686d395adb4783c51baa62d
2023-12-05 22:02:45 +00:00
Wan-Teh Chang 97184161d5 Add "IWYU pragma: export" to some public headers
vpx/vpx_integer.h is clearly intended as the facade header for the
Standard C Library headers <stddef.h>, <inttypes.h>, and <stdint.h>.

It is reasonable to expect that vpx/vpx_decoder.h and vpx/vpx_encoder.h
should provide the symbols from vpx/vpx_codec.h.

Change-Id: I220797e63b2efc3dd9e2ac197fe2f918bf80d247
2023-12-05 21:18:45 +00:00
Jerome Jiang 7cfc58de48 RTC RC: add screen content support for vp8
Bug: b/281463780
Change-Id: I446c00bf8d794aa9134e4fe37960dd8a465448a4
2023-12-05 19:53:38 +00:00
Yunqing Wang 8bf3649d41 Fix a bug in frame scaling
This change fixed a corner case bug revealed by b/311394513.
During the frame scaling, vpx_highbd_convolve8() and vpx_scaled_2d()
require that both x_step_q4 and y_step_q4 be less than or equal to a
defined value. Otherwise, it needs to call
vp9_scale_and_extend_frame_nonnormative(), which supports arbitrary
scaling.

The fix was done in LBD and HBD functions.

Bug: b/311394513
Change-Id: Id0d34e7910ec98859030ef968ac19331488046d4
2023-12-04 19:39:08 -08:00
Bohan Li 9ad598f249 Improve test comments.
Change-Id: I42dddb946193e30cf07e39b43eaad051c5da479a
2023-12-04 23:32:08 +00:00
Gerda Zsejke More db83435afb configure: Add darwin23 support
Add target arm64-darwin23-gcc, x86_64-darwin23-gcc for MacOS 14.

Change-Id: I6b68a6a61d51aaa78ec11a5055bb95ce77a81d9c
2023-12-04 21:22:35 +00:00
Angie Chiang f10481dc0a Set skip_recode=0 in nonrd_pick_sb_modes
Need to set skip_recode properly so that
vp9_encode_block_intra() can work properly when it is
called by block_rd_txfm(). We can not skip "recode" because
it is still at the rd search stage.

Bug: b/310340241
Change-Id: I7d7600ef72addd341636549c2dad1868ad90e1cb
2023-12-02 07:18:11 +00:00
Wan-Teh Chang 5dcb4c1740 Define VPX_DL_* macros as unsigned long constants
Define the VPX_DL_REALTIME, VPX_DL_GOOD_QUALITY, and VPX_DL_BEST_QUALITY
macros as unsigned long, because the deadline parameter of
vpx_codec_encode() is of the unsigned long type. This enables C++
templates to deduce the unsigned long type from these macros.

Change-Id: I2173e3bbf5e15c84c11843790df93a497a35ed7d
2023-12-02 02:28:57 +00:00
Wan-Teh Chang bf07554183 Add the needed Android API level predicates.
fseeko and ftello are available on Android only from API level 24. Add
the needed guards for these functions.

Suggested by Yifan Yang.

Change-Id: I3a6721d31e1d961ab10b434ea6e92959bd5a70ab
2023-12-02 02:13:32 +00:00
Wan-Teh Chang 378af160ff Merge "Document vpx_codec_decode() ignores deadline param" into main 2023-12-01 23:09:06 +00:00
Wan-Teh Chang 478df94cd2 Merge "Define vpx_enc_deadline_t type for encode deadline" into main 2023-12-01 22:57:58 +00:00
Bohan Li a9f1bfdb8e Fix edge case when downsizing to one.
BUG: b/310329177
Change-Id: I2ebf4165adbc7351d6cc73554827812dedc4d362
2023-12-01 13:44:56 -08:00
Wan-Teh Chang 967c59b190 Merge "Fix scaled reference offsets." into main 2023-12-01 21:40:42 +00:00
James Zern d5ec829489 Merge "CHANGELOG: add CVE for issue #1642" into main 2023-12-01 21:39:31 +00:00
James Zern a88e7d869f Merge changes Ic2c3cb30,I027eaf2d,I455e5a94,I8f7633d9,I5116d10d, ... into main
* changes:
  Specialise Armv8.0 Neon horiz convolution for 4-tap filters
  Specialise Armv8.0 Neon vert convolution for 4-tap filters
  Specialise Armv8.6 Neon 2D horiz convolution for 4-tap filters
  Specialise Armv8.6 Neon horiz convolution for 4-tap filters
  Specialise Armv8.4 Neon 2D horiz convolution for 4-tap filters
  Specialise Armv8.4 Neon horiz convolution for 4-tap filters
  Specialise Armv8.6 Neon vert convolution for 4-tap filters
  Specialise Armv8.4 Neon vert convolution for 4-tap filters
  Make reporting of filter sizes more granular
  Delete redundant code in Neon SDOT/USDOT vertical convolutions
2023-12-01 21:38:51 +00:00
James Zern 5cad6fdc92 CHANGELOG: add CVE for issue #1642
CVE-2023-6349 was reserved for this issue. It's not yet published.

Bug: webm:1642, b:302710624
Change-Id: Iaab2a0bcae449a45e35678f5c049413fe0a4d2a4
2023-12-01 13:10:58 -08:00
Wan-Teh Chang 070d7e5cf3 Document vpx_codec_decode() ignores deadline param
The changes in this CL show that both the VP8 and VP9 implementations of
the decode function eventually discard the deadline parameter. Change
the code to ignore the deadline parameter in vpx_codec_decode() without
passing it to the decode function, and document that the deadline
parameter is ignored and 0 should be passed.

Change-Id: Ia977e16cdbdf97901207aa2d749887980137c4c0
2023-12-01 09:55:23 -08:00
Bohan Li 845a817c05 Fix scaled reference offsets.
Since the reference frame is already scaled, do not scale the offsets.

BUG: b/311489136, b/312656387
Change-Id: Ib346242e7ec8c4d3ed26668fa4094271218278ed
2023-12-01 09:52:59 -08:00
Wan-Teh Chang 4e29b9638d Merge "Add a test for b/312517065" into main 2023-12-01 02:18:21 +00:00
Jonathan Wright d144e6e95d Specialise Armv8.0 Neon horiz convolution for 4-tap filters
Add an Armv8.0 MLA Neon implementation of horizontal convolution
specialised for executing with 4-tap filters (the most common filter
size for settings --good --cpu-used=1.) This new path is also used
when executing with bilinear (2-tap) filters.

Change-Id: Ic2c3cb307b95964cd0ba86f1c42eece3a8ab7cf4
2023-12-01 00:20:00 +00:00
Wan-Teh Chang 15c2a9a02f Add a test for b/312517065
Bug: b/312517065
Change-Id: I6b5529a8e034fb0468f110e420fafb4944a19d0f
2023-11-30 16:10:04 -08:00
Jonathan Wright a33ac12dc0 Specialise Armv8.0 Neon vert convolution for 4-tap filters
Add an Armv8.0 MLA Neon implementation of vertical convolution
specialised for executing with 4-tap filters (the most common filter
size for settings --good --cpu-used=1.) This new path is also used
when executing with bilinear (2-tap) filters.

Change-Id: I027eaf2d1bb9711c2217cc8aa6b1e379d3e66b26
2023-11-30 12:00:00 +00:00
Wan-Teh Chang b027590c30 Define vpx_enc_deadline_t type for encode deadline
The deadline parameter of vpx_codec_encode() is of the unsigned long
type. The cpplint runtime/int check and the clang-tidy
google-runtime-int warn about the use of the unsigned long type. Adding
a type alias works around this issue.

Note: vpx_codec_decode() also has a deadline parameter, but it is of the
long type. So unfortunately this type alias cannot be simply named
vpx_codec_deadline_t and the name must suggest it is encoder-specific.

Change-Id: I27b6b25730b620b328422ec3f91e63fdc55b377a
2023-11-29 17:36:24 -08:00
James Zern 97a0d139ce psnr.h,cosmetics: fix a typo (PNSR -> PSNR)
Change-Id: I2adea9f150852c106acc57e5aeeac571d6bd15fb
2023-11-29 16:47:34 -08:00
Jerome Jiang 2f3d3726b2 Merge "vp9 rc: support screen content" into main 2023-11-29 23:52:13 +00:00
Jerome Jiang fe5dc9f7fc vp9 rc: support screen content
Bug: b/281463780
Change-Id: I23668550257b28031bdca0537459f93ec63f1e2e
2023-11-29 17:38:38 -05:00
Wan-Teh Chang 57b72fe807 Add VP9Encoder class to simplify fuzz test cases
Bug: b:306422625
Change-Id: I8344cb7fb4e1aee87d46f683746517cdcddf5c5d
2023-11-29 14:22:10 -08:00
Wan-Teh Chang 73e38df5d5 Merge "Adding "const" to vpx_codec_iface_t is redundant" into main 2023-11-29 20:04:24 +00:00
Hirokazu Honda 2faf9c3e0e Merge "ratectrl_rtc: Remove duplicated DropFrameReason enum class" into main 2023-11-29 01:32:21 +00:00
Wan-Teh Chang 59a3b791fb Merge "Add vpx_sse and vpx_highbd_sse" into main 2023-11-28 18:39:49 +00:00
Marco Paniconi 77b4614b2f Merge "rtc: Set nonrd keyframe under dynamic change of deadline" into main 2023-11-28 17:24:48 +00:00
Marco Paniconi adebf364cb rtc: Set nonrd keyframe under dynamic change of deadline
For realtime mode: if the deadline mode (good/best/realtime)
is changed on the fly (via codec_encode() call), force a
key frame and set the speed feature nonrd_keyframe = 1 to
avoid entering the rd pickmode.

nonrd_pickmode=0/off is the only feature in realtime mode that
involves rd pickmode, so by forcing it on/1 we can cleanly
separate nonrd (realtime) from rd (good/best), so we can
avoid possible issues on this dynamic mode switch, such as in
bug listed below.

Dynamic change of deadline, in particular for realtime mode,
involves a lot of coding/speed feature changes, so best to
also force reset with keyframe.

Added unittest that triggers the issue in the bug.
Bug: b/310663186

Change-Id: Idf8fd7c9ee54b301968184be5481ee9faa06468d
2023-11-27 20:08:57 -08:00
Wan-Teh Chang 72a57ebe48 Merge "Tests kf_max_dist in one-pass zero-lag encoding" into main 2023-11-27 23:54:04 +00:00
Marco Paniconi d7358ed53a Unitest for issue: b/310477034
Fix is made here:
https://chromium-review.googlesource.com/c/webm/libvpx/+/5055827

Bug: b/310477034
Change-Id: Id1cc7a6a95e1ea5d1a022f36d7971915c9918339
2023-11-27 09:35:10 -08:00
Jonathan Wright cc89450a48 Specialise Armv8.6 Neon 2D horiz convolution for 4-tap filters
Add an Armv8.6 USDOT Neon path for the horizontal portion of 2D
convolution, specialised for executing with 4-tap filters (the most
common filter size for settings --good --cpu-used=1.) This new path
is also used when executing with bilinear (2-tap) filters.

Change-Id: I455e5a94bdcea1358025bd8e4d4c8c62e373aa5d
2023-11-27 16:44:02 +00:00
Jonathan Wright 0dc67ecf54 Specialise Armv8.6 Neon horiz convolution for 4-tap filters
Add an Armv8.6 USDOT Neon implementation of horizontal convolution
specialised for executing with 4-tap filters (the most common filter
size for settings --good --cpu-used=1.) This new path is also used
when executing with bilinear (2-tap) filters.

Change-Id: I8f7633d9852ebfe8feb9b4a055715f849cccf297
2023-11-27 16:43:56 +00:00
Jonathan Wright 9cdb688919 Specialise Armv8.4 Neon 2D horiz convolution for 4-tap filters
Add an Armv8.4 SDOT Neon path for the horizontal portion of 2D
convolution, specialised for executing with 4-tap filters (the most
common filter size for settings --good --cpu-used=1.) This new path
is also used when executing with bilinear (2-tap) filters.

Change-Id: I5116d10ddb371ac2cf302ef905d06f2140dc7600
2023-11-27 16:43:21 +00:00
Jonathan Wright 68ef57f997 Specialise Armv8.4 Neon horiz convolution for 4-tap filters
Add an Armv8.4 SDOT Neon implementation of horizontal convolution
specialised for executing with 4-tap filters (the most common filter
size for settings --good --cpu-used=1.) This new path is also used
when executing with bilinear (2-tap) filters.

Change-Id: Ib396681b3f7b8b0eeba94381fbe33a06cf7b4a13
2023-11-27 16:43:15 +00:00
Jonathan Wright 2f8e94715d Specialise Armv8.6 Neon vert convolution for 4-tap filters
Add an Armv8.6 USDOT Neon implementation of vertical convolution
specialised for executing with 4-tap filters (the most common filter
size for settings --good --cpu-used=1.) This new path is also used
when executing with bilinear (2-tap) filters.

Change-Id: Ic893b25541e3317c5d5c270c338f868f080aed7c
2023-11-27 16:38:20 +00:00
Jonathan Wright bdc9e1c9d4 Specialise Armv8.4 Neon vert convolution for 4-tap filters
Add an Armv8.4 SDOT Neon implementation of vertical convolution
specialised for executing with 4-tap filters (the most common filter
size for settings --good --cpu-used=1.) This new path is also used
when executing with bilinear (2-tap) filters.

Change-Id: I3eb00b5a34f5676b68bda60a2a29be56e3d7d0cd
2023-11-27 16:37:56 +00:00
Jonathan Wright 1b3ec0676c Make reporting of filter sizes more granular
vpx_get_filter_taps() currently reports either 8-tap or 2-tap.
However, many 8-tap filters are actually 0-padded, resulting in a
lot of redundant work (multiplying by, and adding, 0) when processing
using an 8-tap convolution function. In preparation for adding 2- and
4-tap SIMD implementations for the convolution paths, make the filter
size reporting more granular, stripping any 0 padding. Filter sizes
can now be reported as 2-, 4-, 6- or 8-tap.

Change-Id: I100133aac7173134af34b918c9ad3007d98d6060
2023-11-23 15:11:11 +00:00
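One way to picture the more granular reporting described above is the sketch below. It assumes 8-entry kernels whose zero padding is symmetric about the center; the function name filter_taps_sketch and the example coefficients are made up for illustration and are not copied from libvpx.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reports 2, 4, 6 or 8 taps for an 8-entry kernel by
 * checking the outermost coefficient pairs for non-zero values. */
static int filter_taps_sketch(const int16_t *filter) {
  assert(filter != NULL);
  if (filter[0] | filter[7]) return 8;
  if (filter[1] | filter[6]) return 6;
  if (filter[2] | filter[5]) return 4;
  return 2;
}

/* Illustrative kernels only. */
const int16_t kExampleBilinear[8] = { 0, 0, 0, 64, 64, 0, 0, 0 };
const int16_t kExampleFourTap[8] = { 0, 0, -8, 72, 72, -8, 0, 0 };
const int16_t kExampleSixTap[8] = { 0, 2, -10, 72, 72, -10, 2, 0 };
const int16_t kExampleEightTap[8] = { -1, 3, -10, 72, 72, -10, 3, -1 };
```

A caller could then dispatch on the returned size to a 2-, 4- or 8-tap convolution path and avoid multiplying by the padded zeros.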
Jonathan Wright 9b729500d5 Delete redundant code in Neon SDOT/USDOT vertical convolutions
Delete redundant transpose/permute code in the Neon dot-product
vertical convolution paths. Variable values were assigned but never
used before subsequent assignment.

Change-Id: I15b29d0c993f56599e0d18ac1d5787e6385d2a3a
2023-11-23 13:58:42 +00:00
Wan-Teh Chang 366425079b Tests kf_max_dist in one-pass zero-lag encoding
The test shows that the comment for kf_max_dist in vpx/vpx_encoder.h
differs from its behavior by one. We should modify the comment to match
the encoding behavior.

Bug: webm:1829
Change-Id: Icdc58b8f6b25353f10ce8ecc481c862bd3fe86df
2023-11-22 17:50:42 -08:00
Jingning Han 3bd54a37d0 Disable intra mode search speed features conditionally
When all the inter reference frames are invalid, disable the speed
features that bypass intra mode search.

BUG=b/312517065

Change-Id: I246c953fad3be61b9d307da11c752a21a36b90ff
2023-11-22 15:38:27 -08:00
Wan-Teh Chang 635eba3319 Adding "const" to vpx_codec_iface_t is redundant
vpx_codec_iface_t is defined as follows:

  typedef const struct vpx_codec_iface vpx_codec_iface_t;

Since vpx_codec_iface_t is already a const struct, it is redundant to
add "const" to vpx_codec_iface_t.

Note: I think vpx_codec_iface_t should not have been defined as a const
struct, but it is too late to change that now.

Change-Id: Ifbd3f8a63c1d48e9169ff77fa0b505ea1e65519d
2023-11-22 15:38:04 -08:00
Jingning Han b562fdd4e6 Remove invalid reference frames
Remove the reference frames whose scaling factor is not in the
supported range.

BUG=b/312517065

Change-Id: Iaf8610ff7a95cd4a433bf529f741459d820d4f8b
2023-11-22 15:07:04 -08:00
Jingning Han 79257fd459 Conditionally skip using inter frames in speed features
When the reference frame's scaling factor is not in the supported
range, skip using it for motion compensation prediction in the
partition speed features.

BUG=b/312517065

Change-Id: Ie3687186521ad2616be258e80d3e5b16e5f2d5e9
2023-11-22 15:06:29 -08:00
Jerome Jiang 741b8f6228 Check null ptr before use
prev_mi is a pointer to pointer

Bug: b/310401647
Bug: b/310590556
Change-Id: Ic3c39a7eec14693357bd2485a5451d4b7f031b5e
2023-11-21 16:33:50 -05:00
Hirokazu Honda 56c78a68b0 ratectrl_rtc: Remove duplicated DropFrameReason enum class
DropFrameReason is declared in two places. This moves it
to the common place.

Change-Id: I04c16db4a49135588edff7e1746dcf9172750bb9
2023-11-21 17:10:48 +09:00
James Zern b067e73e4b Merge "vp8_dx_iface.c: add include for MAX_PARTITIONS" into main 2023-11-21 02:21:20 +00:00
Wan-Teh Chang a8db542b24 Add vpx_sse and vpx_highbd_sse
The code is ported from libaom's aom_sse and aom_highbd_sse at
commit 1e20d2da96515524864b21010dbe23809cff2e9b.

The vpx_sse and vpx_highbd_sse functions will be used by vpx_dsp/psnr.c.

Bug: webm:1819
Change-Id: I4fbffa9000ab92755de5387b1ddd4370cb7020f7
2023-11-20 14:59:27 -08:00
James Zern 1231fce45e vp8_dx_iface.c: add include for MAX_PARTITIONS
fixes clang-tidy warning:
no header providing "MAX_PARTITIONS" is directly included

Change-Id: Iba7a9d95df7f5bdee76e7975df764cd71461fc93
2023-11-20 11:55:49 -08:00
James Zern 9f8776ff4a vp8_ratectrl_rtc.h: fix a few typos
is -> if
returns -> computes

in the documentation for ComputeQP().

This is the same as:
9142314c2 ratectrl_rtc.h: fix a few typos

+ remove a duplicate, commented out, version of GetLoopfilterLevel()

Change-Id: I8832e628b63b0b7dac6236631072f36ad55d90e8
2023-11-20 11:49:14 -08:00
James Zern 9142314c2c ratectrl_rtc.h: fix a few typos
is -> if
returns -> computes

in the documentation for ComputeQP().

Change-Id: If70706736b0dc2ae56e45e2489dc208c61fd557a
2023-11-15 18:55:04 -08:00
Marco Paniconi 81aaa7f04b rtc: Add frame dropper to VP8 external RC
Move some internal drop_frame code to a separate
function so the external RC can use it.
Also add a new flag setting under VP8E_SET_RTC_EXTERNAL_RATECTRL
to disable vp8_drop_encodedframe_overshoot() for
testing the external RC.

Unittest added for single layer and 3 temporal layers.

Bug: b/280363228

Change-Id: Ibea2f627cc54e7156ff35259a64dd111d42d146c
2023-11-14 12:06:17 -08:00
Wan-Teh Chang b7d847d0e7 Merge "Delete -Wdeclaration-after-statement" into main 2023-11-11 02:29:46 +00:00
Wan-Teh Chang e4127f591d Document how VP9 treats a negative speed value
Change-Id: I12948b08a7bb5beb5024b8676de9dafc239f8e89
2023-11-10 13:14:18 -08:00
Wan-Teh Chang d15a1970c1 Delete -Wdeclaration-after-statement
Older versions of MSVC do not allow declarations after statements in C
files. We don't need to support those versions of MSVC now.

Use -std=gnu99 instead of -std=gnu89.

Change-Id: I76ba962f5a2bca30d6a5b2b05c5786507398ad32
2023-11-09 19:23:24 -08:00
Wan-Teh Chang f05122d35c Fix ClangTidy warnings
Most are related to include-what-you-use. One is to avoid using the
unsigned long type explicitly (by passing VPX_DL_REALTIME directly to
vpx_codec_encode).

Change-Id: Ieaf3418382ad8516cb4b172f7678893286fcb8cf
2023-11-09 18:35:51 -08:00
Wan-Teh Chang 97833b61ee Merge "Declare oxcf arg of vp8_create_compressor as const" into main 2023-11-10 01:40:17 +00:00
Wan-Teh Chang ef3eb01269 Merge "Fix float-cast-overflow in vp8_change_config()" into main 2023-11-09 23:10:35 +00:00
Wan-Teh Chang 296784c83a Declare oxcf arg of vp8_create_compressor as const
Declare the oxcf parameters of vp8_create_compressor() and init_config()
as const. This helps code analysis.

Change-Id: I344ef3e6afc3adced2b2865b7e0057c6d4b1d3c0
2023-11-09 12:42:58 -08:00
Wan-Teh Chang 4e05c38c85 Document the units of VP8 target_bandwidth/bitrate
Change-Id: I6298a0acb4ef546ae198bb1f16dea50ed34b2dae
2023-11-09 12:38:05 -08:00
Wan-Teh Chang 7ab673a9f6 Fix float-cast-overflow in vp8_change_config()
Bug: b:309716574
Change-Id: I9c523d5e9211f895c7497a9e3674b55f6be6c742
2023-11-09 12:01:13 -08:00
Wan-Teh Chang 8a35c7e585 Merge "Use symbolic constant VPX_CBR instead of 1" into main 2023-11-08 02:40:28 +00:00
Wan-Teh Chang c732fa7070 Use symbolic constant VPX_CBR instead of 1
Change-Id: Idae94cfc6d7a882691deeb4fa3ce0015f80ed937
2023-11-07 15:47:43 -08:00
Jerome Jiang 11b98025c3 Merge "Check fragments count before use" into main 2023-11-07 15:26:52 +00:00
James Zern 5b8d24f678 configure: detect PIE and enable PIC
Fixes the creation of DT_TEXTREL entries in binaries built with PIE
enabled:
  /usr/bin/ld: warning: creating DT_TEXTREL in a PIE

This matches the changes made in libaom:
1df26009da aom_configure: only override CONFIG_PIC if not set on cmd line
7235e65746 aom_configure.cmake: detect PIE and set CONFIG_PIC

Change-Id: I0a43e964af2d8eb8c5e7811ce14ad39285eec3a8
2023-11-06 15:17:19 -08:00
Jerome Jiang 879c9bd906 Check fragments count before use
Bug: webm:1827
Change-Id: I8d603d5db92476222cbee1c61fece957ad03a49f
2023-11-03 16:24:46 -04:00
Yunqing Wang f08d238867 Merge "Modify C vs SIMD test script" into main 2023-11-03 16:11:24 +00:00
Anupam Pandey 1464d7738a Modify C vs SIMD test script
- Enable C vs SIMD test for x86 32-bit platform
- Correct a print message in run_tests()

BUG=webm:1800

Change-Id: Ib1ccd3a87a64b5ec6cde524a14d5d1b7e200abfb
2023-11-03 18:26:33 +05:30
Yunqing Wang 54fc6a7558 Merge "Add an emms instruction to vpx_subtract_block" into main 2023-11-02 03:39:59 +00:00
Marco Paniconi 0d3ef6ffd2 vp9-RC: Add drop_frame support to external RC
Supports single layer and svc. For svc only the
framedrop_mode = FULL_SUPERFRAME_DROP is allowed
for now.

Dropping frames due to overshoot is enabled by the
oxcf->drop_frames_water_mark, which is zero as default.
Note that this CL also allows for drop/skip encoding of
enhancement layers if that layer bitrate is zero.

max_consec_drop is also added, set to INT_MAX as default.
Note that max_consec_drop is only used for svc mode.
It has not been added yet for single layer in libvpx encoder.

Tests added for single layer and svc case.

Change-Id: Ic12f6a0eb3fbf07d8eb8456c46cec27b2e1930d3
2023-10-31 09:34:42 -07:00
James Zern cab1f4b9b2 Merge "calc_pframe_target_size: fix integer overflow" into main 2023-10-30 19:20:46 +00:00
James Zern 7b51f50205 Merge "Fix 'unused variable' warning when neon_i8mm is disabled" into main 2023-10-30 19:20:18 +00:00
Jonathan Wright 3f3576098f Fix 'unused variable' warning when neon_i8mm is disabled
Guard hwcap2 feature interrogation on HAVE_NEON_I8MM so that it gets
disabled if neon_i8mm is disabled when configuring the build.

Bug: webm:1825
Change-Id: Ic6ff71f17387b96219591928a583d43560bb7c7a
2023-10-30 15:53:18 +00:00
Xiahong Bao 61c927a4ed calc_pframe_target_size: fix integer overflow
The intermediate value in the target bandwidth
calculation may exceed integer bounds.

Bug: 308007926

Change-Id: I8288c5820db06a550d88bf91fccc86106996deaa
Signed-off-by: Xiahong Bao <xiahong.bao@nxp.com>
2023-10-30 11:17:17 +09:00
Marco Paniconi b759032a0e Clear some clang-tidy complaints on header includes
Change-Id: Id6f54dc4643172f6a5576dc4846c47c8eda31c0f
2023-10-27 10:02:46 -07:00
Jonathan Wright 6457f06529 Add Arm SVE build flags and run-time CPU feature detection
Add 'sve' arch options to the configure, build and unit test files -
adding appropriate conditional options where necessary. Arm SIMD
extensions are treated as supersets in libvpx, so disable SVE if
either Neon DotProd or I8MM are unavailable.

Change-Id: I39dd24f2b209251084d1e28d7ac68099460309bb
2023-10-24 10:42:06 +01:00
Jerome Jiang 974c14578c Merge "Reduce memory usage of test with large frame size" into main 2023-10-20 21:14:18 +00:00
Jerome Jiang 352f9f64df Reduce memory usage of test with large frame size
- Use smaller frame size that still triggers the overflow
- Do not run encoder as the encoder init also triggers the overflow

Bug: chromium:1492864
Change-Id: I392549abf69f1cfb3754cc847a214513ec9bedc5
2023-10-20 15:40:43 -04:00
Wan-Teh Chang 9004ace978 Also test VPX_ARCH_AARCH64 for 64-bit platforms
Change-Id: Ic11ccd791ff78801e0aba1d12ad2d99b9941ce9d
2023-10-19 18:28:48 -07:00
Jerome Jiang 424723dc02 Run bitrate overflow test only on 64bit systems
Frame size caps the target bitrate internally, so the frame size needs
to be large enough to reproduce the target bitrate overflow in the
fuzzing test.

However, the frame size needed exceeds the max buffer allowed on 32-bit
systems, as defined by VPX_MAX_ALLOCABLE_MEMORY.

Bug: chromium:1492864

Change-Id: Ia3a9a78cd35516373897039a7769b492e29e8450
2023-10-19 11:36:47 -04:00
Jerome Jiang e4db6c3aac Cap avg_frame_bandwidth at INT_MAX
avg_frame_bandwidth = target_bandwidth / framerate

If target_bandwidth is too big and/or framerate is too small (< 1),
avg_frame_bandwidth could overflow.

Bug: chromium:1492864
Change-Id: I32314da1414b472ae4bf2acdcd81b8a948286146
2023-10-17 17:06:28 -04:00
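The cap described in e4db6c3aac can be sketched as follows. This is not the libvpx code: the helper name, the parameter types, and the choice to compare in floating point before converting are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: per-frame bit budget, saturated at INT_MAX. */
static int avg_frame_bandwidth_capped(int64_t target_bandwidth,
                                      double framerate) {
  /* Assumptions of this sketch: the caller validated both inputs. */
  assert(target_bandwidth >= 0);
  assert(framerate > 0.0);

  const double avg = (double)target_bandwidth / framerate;

  /* Compare as double first so the cast never sees an out-of-range value. */
  return avg >= (double)INT_MAX ? INT_MAX : (int)avg;
}
```

With a frame rate below 1, the quotient is larger than target_bandwidth itself, which is why a value that fits in an int before the division may not fit after it.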
Jerome Jiang 0129e64a65 Fix ubsan failure caused by left shift of negative
Bug: b/305642441
Change-Id: Iddb1572c284161140da48f61b04cf600e5b57ecc
2023-10-16 11:22:48 -04:00
Jerome Jiang 2ab7ba8251 Force mode search on 64x64 if no mode is selected
A speed feature disable_split_mask (set to 63) could cause no mode and
partition to be selected in rd_pick_partition because:

-> thresh_mult_sub8x8 all INT_MAX
-> All modes skipped for sub8x8 blocks
-> found_best_rd is 0 -> break from the loop of 4 sub blocks
-> sum_rdc is INT_MAX -> No rd update -> should_encode_sb is 0
-> Propagating to top of the tree
-> No partition / mode is selected

Bug: b/290499385
Change-Id: Ia655e262f3b32445347ae0aaf1a2d868cea997f3
2023-10-13 20:28:21 -04:00
Wan-Teh Chang 9c377eafbe Handle Arm/AArch64 runtime feature detection
Port the following libaom CLs to libvpx:
https://aomedia-review.googlesource.com/c/aom/+/178361
https://aomedia-review.googlesource.com/c/aom/+/180701
https://aomedia-review.googlesource.com/c/aom/+/181821

The tests themselves are not feature-gated in the same way that they are
used in the rest of the codebase since they are not controlled by
rtcd.pl. This means that tests that assume the existence of features not
present on the target can cause SIGILL to be thrown.

This commit extends init_vpx_test.cc to match the behaviour for other
targets and automatically disable testing for features that are not
available on the machine running the tests.

Call arm_cpu_caps() and x86_simd_caps() inside #if !CONFIG_SHARED.
All the SIMD-specialized functions (arm or x86) are internal functions,
so they are not exported from the libvpx shared library. If
CONFIG_SHARED is 1, it is not necessary to call arm_cpu_caps(),
x86_simd_caps(), and append_negative_gtest_filter() either.

Change-Id: I330631816bdb52842020c5aa2a1ad802865cc285
2023-10-10 09:27:20 -07:00
Wan-Teh Chang 7c31749387 Declare some "VP8_CONFIG *oxcf" params as const
Change-Id: Ia5e8445098e18da5978aacf17281f16252413f17
2023-10-07 07:23:11 -07:00
Wan-Teh Chang 8cb4544c21 VP8: allow thread count changes
Fix the TODO(https://crbug.com/1486441) comment in vp8/vp8_cx_iface.c.

Make vp8cx_create_encoder_threads() work after it has been called
before. If there are already the exact number of threads it needs to
create, return immediately. Otherwise, shut down the existing threads
(by calling vp8cx_remove_encoder_threads()) and create the required
number of threads.

Call vp8cx_create_encoder_threads() in vp8e_set_config() to respond to
changes in g_threads or g_w (which also affects the number of threads
through cm->mb_cols and cpi->mt_sync_range).

Change-Id: I552eeca5b1f1f5313f59559eb1da396f270a2429
2023-10-06 10:40:14 -07:00
Wan-Teh Chang c23da380a3 VP8: Allocate cpi->mt_current_mb_col array lazily
Add the mt_current_mb_col_size field to VP8_COMP to record the size of
the mt_current_mb_col array.

Move the allocation of the mt_current_mb_col array from
vp8_alloc_compressor_data() to vp8_encode_frame(), where the use of
mt_current_mb_col starts. Allocate mt_current_mb_col right before use
if mt_current_mb_col hasn't been allocated or if the current size is
incorrect.

Move the deallocation of the mt_current_mb_col array from
dealloc_compressor_data() to vp8cx_remove_encoder_threads().

Move the TODO(https://crbug.com/1486441) comment from
vp8/encoder/onyx_if.c to vp8/vp8_cx_iface.c.

Change-Id: Ic5a0793278c2cc94876669aaa0dd732412876673
2023-10-04 13:05:18 -07:00
Wan-Teh Chang 874bcaa164 Merge "Clean up vp8cx_create/remove_encoder_threads()" into main 2023-10-04 19:50:20 +00:00
Jerome Jiang 25a9a8e35a Merge "Use correct include guards for init_vpx_test.h" into main 2023-10-04 13:42:09 +00:00
Anupam Pandey 41caf8fef5 Add an emms instruction to vpx_subtract_block
This CL adds an `emms` instruction at the end of the MMX assembly
for the vpx_subtract_block function, to properly clear the register
state. This resolves a mismatch between x86 build and C only build.

BUG=webm:1816

Change-Id: I79d2947da7f587f3558a2ae17df214d2faf59e74
2023-10-04 11:13:05 +05:30
James Zern 448c5e8684 Merge "vp9_encoder: normalize sizeof() calls" into main 2023-10-04 04:42:16 +00:00
Wan-Teh Chang a1d4b56208 Merge "Declare cur_row inside #if CONFIG_MULTITHREAD" into main 2023-10-04 03:50:39 +00:00
Wan-Teh Chang d7c7383298 Merge "Have vp9_enc_build and vp9_enc_test restore cwd" into main 2023-10-04 03:21:50 +00:00
Wan-Teh Chang 450dfa908b Merge "Fix a potential resource leak and add alloc checks" into main 2023-10-04 03:20:27 +00:00
Wan-Teh Chang ea67878f8c Clean up vp8cx_create/remove_encoder_threads()
Make vp8cx_create_encoder_threads() undo everything cleanly before
returning an error.

Make vp8cx_remove_encoder_threads() reset pointer fields to NULL after
freeing them, reset encoding_thread_count to 0, and reset b_lpf_running
to 0 (false). This makes it safe to call vp8cx_create_encoder_threads()
after calling vp8cx_remove_encoder_threads().

Change-Id: I586f06ce3d5b1c88ca46884bb4d6667ffc97e440
2023-10-03 20:08:18 -07:00
Wan-Teh Chang f67f9ce346 Declare cur_row inside #if CONFIG_MULTITHREAD
Fix the following compiler warning when libvpx is configured with
the --disable-multithread option:

  vp9/common/vp9_thread_common.c:391:7: warning:
  variable 'cur_row' set but not used [-Wunused-but-set-variable]
    int cur_row;
        ^

Change-Id: I53aa279152715083df40990eb7fdcaeb77a66777
2023-10-03 19:16:36 -07:00
Jerome Jiang f73026c2cc Use correct include guards for init_vpx_test.h
Bug: b/303112617
Change-Id: Ie18df33b2bcab91c18e920825f4ed3a29e18373b
2023-10-03 22:00:02 -04:00
Jerome Jiang 5b6ceba996 Include vpx_config.h for macros
Clear some clang-tidy complaints

Change-Id: I6690428d336c81709befd19a33e11c1367275df3
2023-10-03 14:26:50 -04:00
Jerome Jiang 0a3e2b4ca1 Factor out common code used in test binaries
Bug: b/303112617
Change-Id: Icbe16e95ff334a9578a692cc51b4773393ad0005
2023-10-03 14:26:44 -04:00
Wan-Teh Chang b729684b05 Use big cfg.g_w in ConfigResizeChangeThreadCount
vp8cx_create_encoder_threads() caps the thread count at
(cm->mb_cols / cpi->mt_sync_range) - 1. If cfg.g_w is 16, cm->mb_cols is
only 1 (see vp8_alloc_frame_buffers: mb_cols = width >> 4), so we won't
be using multiple threads. To reproduce bug chromium:1486441, the test
just needs to increase cfg.g_h sufficiently.

Bug: chromium:1486441
Change-Id: Ie6b2da2e31cfa1717a481f55eebc8c875db94d87
2023-10-02 13:55:16 -07:00
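The arithmetic in b729684b05 can be checked with a small sketch. The formula and the `width >> 4` step are taken from the commit message; the function name, the clamp to zero, and the sample mt_sync_range values are assumptions.

```c
#include <assert.h>

/* Hypothetical helper mirroring the cap quoted in the commit message. */
static int demo_max_encoder_threads(int width, int mt_sync_range) {
  assert(width > 0);
  assert(mt_sync_range > 0);

  /* Macroblock columns, as in the message: mb_cols = width >> 4. */
  const int mb_cols = width >> 4;

  /* Cap from the message: (mb_cols / mt_sync_range) - 1. */
  const int cap = mb_cols / mt_sync_range - 1;

  /* Assumption of this sketch: a non-positive cap means no extra threads. */
  return cap > 0 ? cap : 0;
}
```

For a width of 16 the function returns 0 whatever the sync range, which matches the observation that such a small frame never exercises the multithreaded path.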
James Zern 95cb5eae70 Merge "Merge tag 'v1.13.1'" into main 2023-10-02 19:11:51 +00:00
Wan-Teh Chang b863d8bb47 Have vp9_enc_build and vp9_enc_test restore cwd
Use $PWD to get the current directory.

Quote directory pathnames.

Suggested by James Zern.

Bug: webm:1800
Change-Id: I51e922b24da0e89d936370f858eab55d193ebdcb
2023-09-30 10:32:49 -07:00
Wan-Teh Chang 6512f994da Disable vpx_highbd_8_mse16x16_neon_dotprod, etc.
These functions assume the uint16_t samples are <= 255 (bit depth 8),
but vpx_highbd_8_mse16x16() is called for any bit depth, not just 8.

A better fix is to port the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/175063 to libvpx, but
that requires porting aom_sse() and aom_highbd_sse() to libvpx, which is
quite involved. So disable vpx_highbd_8_mse16x16_neon_dotprod, etc.
first.

Bug: webm:1819
Change-Id: If495a5dedc58d9981317b9993c9fbb81ff3ab50c
2023-09-29 16:58:26 -07:00
James Zern 1672f4db71 Merge tag 'v1.13.1'
libvpx 1.13.1

2023-09-29 v1.13.1 "Ugly Duckling"
  This release contains two security related fixes. One each for VP8 and VP9.

  - Upgrading:
    This release is ABI compatible with the previous release.

  - Bug fixes:
    https://crbug.com/1486441 (CVE-2023-5217)
    Fix to a crash related to VP9 encoding (#1642)

* tag 'v1.13.1':
  update CHANGELOG
  update version to 1.13.1
  Fix bug with smaller width bigger size
  vp9_alloccommon: clear allocation sizes on free
  VP8: disallow thread count changes
  encode_api_test: add ConfigResizeChangeThreadCount
  README: update release version to 1.13.0

Bug: webm:1818
Change-Id: I732e2423f635d4115890f00fd63f9886e31f39a6
2023-09-29 16:36:46 -07:00
James Zern ec9e1ed41f vp9_encoder: normalize sizeof() calls
use sizeof(var) instead of sizeof(type) and sizeof(*var) instead of
sizeof(var[0]) for consistency in some places.

Change-Id: Ibd9a783cfef5ce1d06131df3831a4093821a502f
2023-09-29 15:17:37 -07:00
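The convention from ec9e1ed41f, shown on a made-up type. `demo_mv` and `demo_alloc_mvs` are hypothetical names and do not appear in vp9_encoder.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical element type for the demo. */
typedef struct {
  int row;
  int col;
} demo_mv;

/* Allocate a zeroed array, sizing each element from the variable itself. */
static demo_mv *demo_alloc_mvs(size_t count) {
  assert(count > 0); /* calloc(0, ...) is implementation-defined */

  /* sizeof(*mvs) rather than sizeof(demo_mv) or sizeof(mvs[0]). */
  demo_mv *mvs = calloc(count, sizeof(*mvs));
  return mvs; /* may be NULL; the caller checks */
}
```

Sizing from the variable keeps the allocation correct if the pointer's type changes later, since no second mention of the type needs updating.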
Wan-Teh Chang 7f568f9876 Fix a potential resource leak and add alloc checks
Backport fixes from libaom:
https://aomedia-review.googlesource.com/c/aom/+/109061
https://aomedia-review.googlesource.com/c/aom/+/158102

Change-Id: Ia9d42d474be2898f9ae2fdc28606273377da3e90
2023-09-29 15:08:28 -07:00
James Zern 10b9492dcf update CHANGELOG
Bug: webm:1818
Change-Id: Ic0a943b5d1c69a3621ad3f91012fb5308a0c11f1
2023-09-29 15:06:14 -07:00
James Zern 490a7067e8 update version to 1.13.1
SO_VERSION_MAJOR = 8
SO_VERSION_MINOR = 0
SO_VERSION_PATCH = 1

The increase of the patch number corresponds to the revision number in
the libtool text.

3. If the library source code has changed at all since the last update,
then increment revision (‘c:r:a’ becomes ‘c:r+1:a’).

Bug: webm:1818
Change-Id: Ia114368e9fd7a908e7fcf6e4d3142f142770e3f4
2023-09-29 13:13:47 -07:00
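The relationship between the libtool triple and the SO_VERSION values in 490a7067e8 can be sketched as below, assuming the mapping major = c - a, minor = a, patch = r that a later version-bump commit in this log spells out. The type and function names are hypothetical, and the triple 8:1:0 is inferred from the SO_VERSION values rather than stated by the commit.

```c
#include <assert.h>

/* Hypothetical container for libtool's current:revision:age triple. */
typedef struct {
  int current;
  int revision;
  int age;
} demo_libtool_version;

/* Hypothetical container for the shared-object version numbers. */
typedef struct {
  int so_major;
  int so_minor;
  int so_patch;
} demo_so_version;

/* Assumed mapping: major = c - a, minor = a, patch = r. */
static demo_so_version demo_so_from_libtool(demo_libtool_version v) {
  assert(v.current >= v.age);
  demo_so_version so = { v.current - v.age, v.age, v.revision };
  return so;
}
```

Under that mapping, applying rule 3 (increment revision) to 8:0:0 gives 8:1:0, which corresponds to 8.0.1.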
Jerome Jiang df9fd9d5b7 Fix bug with smaller width bigger size
Fixed previous patch that clusterfuzz failed on.

Local fuzzing passing overnight.

Bug: webm:1642
Change-Id: If0e08e72abd2e042efe4dcfac21e4cc51afdfdb9
(cherry picked from commit 263682c9a2)
2023-09-29 13:13:47 -07:00
James Zern a53700e4a3 vp9_alloccommon: clear allocation sizes on free
This fixes reallocations (and avoids potential crashes) if any
allocation fails and the application continues to call
vpx_codec_decode().

Found with vpx_dec_fuzzer_vp9 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: If5dc96b73c02efc94ec84c25eb50d10ad6b645a6
(cherry picked from commit 02ab555e99)
2023-09-29 13:13:47 -07:00
Wan-Teh Chang 4697b110ac Update 32-bit version of horizontal_add_uint32x2
The code was originally added in
https://aomedia-review.googlesource.com/c/aom/+/162267
by Jonathan Wright.

Change-Id: Iafd9e31d82abe22387e0d68f02c7ab81e85367ed
2023-09-29 10:35:20 -07:00
Cheng Chen fd2052d4c9 Properly determine end of sequence
When the next frame is null and the current frame is an overlay
frame (that is, there is an active alt ref frame),
we treat this as the end of the sequence.

Change-Id: I49c2cf7a001df98aff8b62ba034317e408274bd4
2023-09-28 18:36:21 -07:00
Wan-Teh Chang a9998b53c2 Merge "vp9_c_vs_simd_encode: Restore cwd at end of test" into main 2023-09-28 17:52:22 +00:00
Jerome Jiang 18bc7ffea7 Merge "Fix bug with smaller width bigger size" into main 2023-09-28 17:39:23 +00:00
Wan-Teh Chang e462a0de03 vp9_c_vs_simd_encode: Restore cwd at end of test
Restore the original current directory at the end of
vp9_c_vs_simd_enc_test().

Bug: webm:1800
Change-Id: Iad64848a231e3c900149cc2b248055b02dda80a6
2023-09-28 09:03:40 -07:00
Wan-Teh Chang 58a854fb27 Skip the y4m_360p_10bit_input clip for armv8
It is a known mismatch.

Bug: webm:1819
Change-Id: Ieb707a6f53ddf6c7b0d1202c6520599d3e45da76
2023-09-27 16:12:12 -07:00
Yunqing Wang 8e63202fb3 Merge "Modify vp9_c_vs_simd_enc_test script" into main 2023-09-27 20:33:39 +00:00
Jerome Jiang 263682c9a2 Fix bug with smaller width bigger size
Fixed previous patch that clusterfuzz failed on.

Bug: webm:1642
Change-Id: If0e08e72abd2e042efe4dcfac21e4cc51afdfdb9
2023-09-27 11:21:11 -04:00
James Zern baed121877 VP8: disallow thread count changes
Currently allocations are done at encoder creation time. Going from
threaded to non-threaded would cause a crash.

Bug: chromium:1486441
Change-Id: Ie301c2a70847dff2f0daae408fbef1e4d42e73d4
(cherry picked from commit 3fbd1dca6a)
2023-09-26 18:42:50 -07:00
James Zern 452199ca85 encode_api_test: add ConfigResizeChangeThreadCount
Update thread counts and resolution to ensure allocations are updated
correctly. VP8 is disabled to avoid a crash.

Bug: chromium:1486441
Change-Id: Ie89776d9818d27dc351eff298a44c699e850761b
(cherry picked from commit af6dedd715)
2023-09-26 18:41:50 -07:00
Yunqing Wang 61f868bcf7 Modify vp9_c_vs_simd_enc_test script
Applied James' change to the script. Enabled the test:
vp9_c_vs_simd_enc_test

BUG=webm:1800

Change-Id: If1e33e5ccb6ca9315004f3e3f5b910f8a8255317
2023-09-26 14:12:14 -07:00
James Zern 3fbd1dca6a VP8: disallow thread count changes
Currently allocations are done at encoder creation time. Going from
threaded to non-threaded would cause a crash.

Bug: chromium:1486441
Change-Id: Ie301c2a70847dff2f0daae408fbef1e4d42e73d4
2023-09-25 19:33:07 -07:00
James Zern af6dedd715 encode_api_test: add ConfigResizeChangeThreadCount
Update thread counts and resolution to ensure allocations are updated
correctly. VP8 is disabled to avoid a crash.

Bug: chromium:1486441
Change-Id: Ie89776d9818d27dc351eff298a44c699e850761b
2023-09-25 19:28:08 -07:00
Jerome Jiang 67bfb41ed8 Do not call WebM RC for new GOP at the end of seq
define_gf_group is called at the last frame of each GOP to get the GOP
size for the next one, which means it is also called at the last GOP of
the sequence. Calling WebM RC there returns an error, since WebM RC
does not have any more GOPs to return.

When gop_coding_frames from the encoder is 1, the encoder is running out
of firstpass stats, which means end of sequence.

Bug: b/299610956
Change-Id: I30e077a28fe41593ebabbc1dc0c2915a4bcbece3
2023-09-21 18:10:27 -04:00
Martin Storsjo ad3301e6a3 aarch64: Generalize Windows cpu detection to any Windows variant
This cpu detection implementation doesn't do anything MSVC specific,
it just calls the IsProcessorFeaturePresent function. This can be
compiled with mingw compilers just as well.

Change-Id: I55e607a47c8f5b70d9f707ef96b2fa7553f2f79f
2023-09-18 11:56:51 -07:00
Jerome Jiang 8f8e741468 Add max/min_gf_interval to vpx_rc_config_t
Bug: b/300499738
Change-Id: Id32cb5e3ce667539c0d1efe1ff5fcc7a49e35329
2023-09-14 16:04:19 -04:00
Jerome Jiang 8e61b3cd1b Fix ref frame buffer in TPL stats for RC
The original ref frame index was the index in the GF group; RC expects
the index to be the one for ref frame buffer.

Change-Id: I9a2b0e72b6332023fb2e8da131b557f82db02e39
2023-09-14 13:27:39 -04:00
Jerome Jiang 9c2e33ff74 Set frame width height for 1st TPL GOP frame
Change-Id: Ic92dfd232bf90e8cbe6c6233af523ed40d12097a
2023-09-14 11:59:57 -04:00
yuanhecai eb232b662a loongarch: Fix bugs from vp8_sixtap_predict4x4/16x16_lsx
Bug: webm:1755

Change-Id: I7295e0f9a1551b8a418d5b65a2b7351df1fdc063
2023-09-13 10:06:23 +08:00
yuanhecai 391bb5604b loongarch: simplify vpx_quantize_b/b_32x32_lsx args
Bug: webm:1755

Change-Id: I42fdb1c34f959dd1204b343b8192e3d9b49821b4
2023-09-13 10:05:34 +08:00
Jerome Jiang d549cb74b9 Add missing headers for clang-tidy warnings
Change-Id: I97edec8ecffdcc79b8f3528deb60a3a0332ea0cc
2023-09-12 15:28:06 -04:00
Jonathan Wright 1a1f50a89d Use run-time feature detection for Neon DotProd HBD MSE
Arm Neon DotProd implementations of vpx_highbd_8_mse<w>x<h> currently
need to be enabled at compile time since they're guarded by #ifdef
feature macros. Now that run-time feature detection has been enabled
for Arm platforms, expose these implementations with distinct
*neon_dotprod names in a separate file and wire them up to the build
system and rtcd.pl. Also add new test cases for the new functions.

Change-Id: I26be6fb587258c8fa9fbf03509b7602358a001a8
2023-09-03 23:04:50 +01:00
Jonathan Wright 15d6215716 Use run-time feature detection for Neon DotProd specialty var.
Enable Arm Neon DotProd implementations of vpx_get_var_sse_sum*
specialty variance functions via run-time feature detection, wiring
up the new *neon_dotprod names to rtcd.pl. Also add new test cases.

Change-Id: I04ac3db87d32ee7f94702b6c0360254e5688f713
2023-09-03 23:04:49 +01:00
Jonathan Wright ad4f28abaa Use run-time feature detection for Neon DotProd variance
Arm Neon DotProd implementations of vpx_variance<w>x<h> currently
need to be enabled at compile time since they're guarded by #ifdef
feature macros. Now that run-time feature detection has been enabled
for Arm platforms, expose these implementations with distinct
*neon_dotprod names in a separate file and wire them up to the build
system and rtcd.pl. Also add new test cases for the new functions.

Remove the _neon suffix in functions making reference to
vpx_variance<w>x<h>_neon() (e.g. sub-pixel variance) - enabling use
of the appropriate *neon or *neon_dotprod version at run time.

Similar changes for the specialty variance and MSE functions will be
made in a subsequent commit.

Change-Id: I69a0ef0d622ecb2d15bd90b4ace53273a32ed22d
2023-09-03 23:04:49 +01:00
Jonathan Wright 7009fe55a9 Use run-time CPU feature detection for Neon DotProd SAD4D
Arm Neon DotProd implementations of vpx_sad*4d currently need to be
enabled at compile time since they're guarded by ifdef feature
macros. Now that run-time feature detection has been enabled for Arm
platforms, expose these implementations with distinct *neon_dotprod
names in separate files and wire them up to the build system and
rtcd.pl. Also add new test cases for the new DotProd functions.

Change-Id: Ie99ee0b03ec488626f52c3f13e4111fe26cc5619
2023-09-03 23:04:49 +01:00
Jonathan Wright 02dc617f8c Use run-time CPU feature detection for Neon DotProd SAD
Arm Neon DotProd implementations of vpx_sad* currently need to be
enabled at compile time since they're guarded by ifdef feature
macros. Now that run-time feature detection has been enabled for Arm
platforms, expose these implementations with distinct *neon_dotprod
names in separate files and wire them up to the build system and
rtcd.pl. Also add new test cases for the new DotProd functions.

Change-Id: Ic6906c28240276ba89787eadbc9393a232374f95
2023-09-03 23:04:41 +01:00
Jonathan Wright 91158c99f7 Use run-time CPU feature detection for vpx_convolve8_neon
Arm Neon DotProd and I8MM implementations of vpx_convolve8* currently
need to be enabled at compile time since they're guarded by ifdef
feature macros. Now that run-time feature detection has been enabled
for Arm platforms, expose these implementations with distinct
*neon_dotprod/*neon_i8mm names in separate files and wire them up to
the build system and rtcd.pl. Also add new test cases for the new
DotProd and I8MM functions.

Change-Id: I3db3cd62e8596099d9fec7805ca3ee86b2a01c74
2023-09-03 20:43:06 +01:00
Jonathan Wright 148d1085f7 Refactor and extend run-time CPU feature detection on Arm
1) Overhaul the Arm CPU feature detection code, taking inspiration
   from similar recent changes in libaom.
2) Add neon_dotprod and neon_i8mm arch options in the configure,
   build and unit test files, adding appropriate conditional options
   where necessary.
3) Soft-enable run-time CPU feature detection by default for both 32-
   bit and 64-bit Arm platforms.

Change-Id: I3f13317d88324acc5753394351188baa8d18a261
2023-09-01 20:25:17 +01:00
Jonathan Wright 7ee16bc178 Simplify Neon MSE helper function params/return values
Simplify the parameters and return values of the Neon MSE helper
functions for both standard and high bitdepth - avoiding unused
return values.

Change-Id: I6f9208f9ce890fbe58346d9c7d9d701f28f2f90f
2023-08-31 14:21:01 +01:00
Marco Paniconi 6da1bd01d6 vp9 svc: fix integer overflow
Overflow was happening in two places:
one in set_encoder_config(), where the input
layer_target_bitrates are converted from kbps to bps,
the other in vp9_calc_pframe_target_size_one_pass_vbr(),
where target is scaled by kf_ratio.

vp9_ratectrl.c:2039: runtime error: signed integer overflow:
-137438983 * 25 cannot be represented in type 'int'

Bug: chromium:1475943

Change-Id: I1ab0980862548c8827fae461df9a7a74425209ff
2023-08-29 11:35:12 -07:00
Jerome Jiang e052ada780 Do not call ext rc functions when they're null
Change-Id: Ie78afadd4ad5845e42bd4d5412703369f8d5e0f5
2023-08-25 10:56:23 -04:00
James Zern 24c0dcc851 Merge "vp9_calc_pframe_target_size_one_pass_cbr: fix int overflow" into main 2023-08-21 22:46:42 +00:00
James Zern 99a4462b8d Merge "vp8,ratectrl.c: fix integer overflow" into main 2023-08-21 22:46:25 +00:00
James Zern b124b05dcb Merge "fdct4x4_neon: fix compile w/cl" into main 2023-08-21 19:05:49 +00:00
Jerome Jiang ade6905e39 vp9 ext rc: copy under/overshoot% for all RC modes
Bug: b/295507002
Change-Id: Ie4b302b82fa2d83e0be450cea60c59907b37f954
2023-08-21 10:05:15 -04:00
James Zern c7aa75ac55 vp9_calc_pframe_target_size_one_pass_cbr: fix int overflow
vp9/encoder/vp9_ratectrl.c:2171:23: runtime error: signed integer
overflow: 103079280 * -22 cannot be represented in type 'int'

Bug: chromium:1473268
Change-Id: Ic1de7d48e74d94c2a992e53ec4382b5b44dba7af
2023-08-18 14:04:33 -07:00
James Zern 80b1b5a7e9 vp8,ratectrl.c: fix integer overflow
in calc_iframe_target_size():
vp8/encoder/ratectrl.c:349:31: runtime error: signed integer overflow:
38 * 343597280 cannot be represented in type 'int'

Bug: chromium:1473473
Change-Id: Ie8f7b147efb27c92314df09837b66f7d97046883
2023-08-18 12:36:45 -07:00
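The two UBSan reports above (c7aa75ac55 and 80b1b5a7e9) share one pattern: two int operands whose product does not fit in an int. One common remedy, which is not necessarily what either commit does, is to multiply in 64 bits and saturate. The function names below are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: multiply in 64 bits, then saturate to int range. */
static int demo_mul_saturate(int a, int b) {
  const int64_t product = (int64_t)a * b; /* cannot overflow int64_t */
  if (product > INT_MAX) return INT_MAX;
  if (product < INT_MIN) return INT_MIN;
  return (int)product;
}

/* The operand pairs quoted in the two runtime-error messages. */
static void demo_check_logged_products(void) {
  assert(demo_mul_saturate(38, 343597280) == INT_MAX);
  assert(demo_mul_saturate(103079280, -22) == INT_MIN);
}
```

Whether saturating is the right policy depends on the caller; the point of the sketch is only that the widening cast must happen before the multiplication.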
James Zern 401d8f36be vp9_cx_iface: fix code compatibility
Remove '= {}' (C23 [1]) and use memset to clear a vpx_rc_config_t
instance.

after:
6e2c3b9b3 Add RC mode to vpx external RC interface

Fixes compile with -pedantic and Microsoft's cl compiler.

[1] https://en.cppreference.com/w/c/language/initialization

Change-Id: I2019cdf0c42103cfc80b1e58c68b7596e497007f
2023-08-18 09:10:00 -07:00
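The portability fix in 401d8f36be, sketched on a made-up struct. `demo_rc_config` and its fields are hypothetical and are not the layout of vpx_rc_config_t.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical config struct for the demo. */
typedef struct {
  int frame_width;
  int frame_height;
  int target_bitrate_kbps;
} demo_rc_config;

/* Clear the struct with memset instead of the C23-only "= {}" initializer. */
static void demo_rc_config_clear(demo_rc_config *cfg) {
  assert(cfg != NULL);
  memset(cfg, 0, sizeof(*cfg));
}
```

The memset form is accepted by pre-C23 compilers and by -pedantic builds, which is the compatibility the commit is after.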
Jerome Jiang d3188cab65 Merge "vp9 ext rc: Assign over/undershoot % for CQ mode" into main 2023-08-17 17:47:50 +00:00
Jerome Jiang 87a467f356 vp9 ext rc: Assign over/undershoot % for CQ mode
Bug: b/295507002
Change-Id: Ie5b4dabc620f6d17c4039f186e0709d8e9479b47
2023-08-17 12:34:48 -04:00
Jerome Jiang e7bfd8b6c2 Merge "Extend ext RC mode to have CQ mode" into main 2023-08-17 15:11:48 +00:00
Jerome Jiang 4b1ac3c23f Extend ext RC mode to have CQ mode
Also do not return error if it's not specified.

Bug: b/295507002
Change-Id: Ib1f83551272bdde1bceff03554abc4c02d95ca09
2023-08-16 16:05:01 -04:00
James Zern bbf36c8839 Merge "tools_common,die_codec(): output to stderr" into main 2023-08-16 19:41:23 +00:00
James Zern 58eed626d8 tools_common,die_codec(): output to stderr
This function is used to report a failure, messages of this type should
go to stderr.

Change-Id: I0dee246dddc886a3278b247a770a356446658864
2023-08-16 11:15:55 -07:00
Jerome Jiang 6e2c3b9b3c Add RC mode to vpx external RC interface
Bug: b/295507002
Change-Id: Id2dd21482828ec64eef9abdf6a1cca83100d21ba
2023-08-15 16:11:43 -04:00
James Zern 8d2c357eab fdct4x4_neon: fix compile w/cl
Use an array for constant initialization rather than array syntax which
assumes the underlying type is a vector. Fixes compile error with
cl targeting Windows Arm64:

vpx_dsp\arm\fdct4x4_neon.c(55,52): error C2078: too many initializers

No change in assembly with gcc 12.2.0 & clang 14.

Bug: b/277255390
Bug: webm:1810
Fixed: webm:1810

Change-Id: Ia30edcdbb45067dfe865b9958a5eecf1fd9ddfc8
2023-08-11 15:52:26 -07:00
James Zern 335728c987 *quantize*.c: fix visual studio warnings
after:
22818907d normalize *const in rtcd

fixes warnings of the form:
vpx_dsp\x86\quantize_avx.c(145): warning C4028: formal parameter 2
different from declaration

Change-Id: I4dc423f11ec4a9171e18bdb6be2fa8dfb65ee61a
2023-08-11 13:42:49 -07:00
Jonathan Wright c8610c266c Fix bug and re-enable vpx_int_pro_row/col_neon
Fix a bug in vpx_int_pro_row_neon (increment pointer after peeled
first loop iteration) and re-enable both vpx_int_pro_row/col_neon
paths.

Also fix IntProRowTest to use width_ (instead of 0) as the src_stride
for the input data block. The test's use of 0 for src_stride is the
reason the tests passed with the buggy Neon implementation noted in
the listed bugs. (The old buggy Neon implementation fails the
adjusted unit tests.)

BUG=webm:1800
BUG=webm:1809

Change-Id: I1f4572ee155653a7596fe2c10b5938ea7a3f63ae
2023-08-11 00:08:56 +01:00
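Why a src_stride of 0 let the buggy code pass can be sketched in C. This is a simplified illustration of c8610c266c, not the libvpx code: row_sums_ref and row_sums_buggy are hypothetical names, the real functions also normalize the sums, and the buggy variant only mimics a peeled first iteration that forgets to advance the source pointer.

```c
#include <stdint.h>

/* Reference: per-column sum over `height` rows with row pitch `stride`. */
static void row_sums_ref(int16_t *out, const uint8_t *src, int stride,
                         int width, int height) {
  for (int x = 0; x < width; ++x) {
    int sum = 0;
    for (int y = 0; y < height; ++y) sum += src[y * stride + x];
    out[x] = (int16_t)sum;
  }
}

/* Buggy variant: the first row is peeled, but the pointer is not advanced
 * afterwards, so row 0 is counted twice and the last row is never read. */
static void row_sums_buggy(int16_t *out, const uint8_t *src, int stride,
                           int width, int height) {
  for (int x = 0; x < width; ++x) out[x] = src[x]; /* peeled row 0 */
  /* missing here: src += stride; */
  for (int y = 1; y < height; ++y) {
    for (int x = 0; x < width; ++x) out[x] = (int16_t)(out[x] + src[x]);
    src += stride;
  }
}
```

With a stride of 0 every row aliases row 0, so both variants return the same sums; with a real stride they differ.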
Yunqing Wang 67961dc5f7 Merge "Enable arm test in c vs SIMD bit-exactness test" into main 2023-08-08 20:32:52 +00:00
Yunqing Wang 32a4ecc3cf Merge "Disable vpx_int_pro_row/col neon SIMD functions" into main 2023-08-08 20:32:41 +00:00
Yunqing Wang 685715b698 Enable arm test in c vs SIMD bit-exactness test
Arm SIMD testing was enabled in c vs SIMD bit-exactness test after
arm SIMD mismatch was resolved.

BUG=webm:1800

Change-Id: Id60127313a0955f4a5c8468281fd5a441668fddb
2023-08-07 22:06:58 +00:00
Yunqing Wang 6e5fc00001 Disable vpx_int_pro_row/col neon SIMD functions
The vpx_int_pro_row/col neon SIMD version caused a mismatch between
neon encoding vs c encoding. Disabled them for now to ensure the
correctness of VP9 encoding on the arm platform. Since these 2
functions are not used much, disabling them shouldn't affect the overall
encoder speed much.

BUG=webm:1800
BUG=webm:1809

Change-Id: Id1a7d542fc03d4cf9fa1039a49832abf35fb722f
2023-08-07 15:04:43 -07:00
Jerome Jiang 242c743170 VP9 RC: Add pixel row/col of a TPL block
Bug: b/294049605
Change-Id: I383a88a037a2a48a5fc1b9def6f991278c3665a8
2023-08-07 16:42:43 -04:00
Jerome Jiang d4b6132d2b Fix more clang-tidy warnings
- Include vpx/vpx_ext_ratectrl.h in vp9_ext_ratectrl.c
 - Include vpx/internal/vpx_codec_internal.h
 - Include <stddef.h> for NULL

Bug: b/294049605
Change-Id: Iedd8b3864da27fde1678bfa6606e6fc5630a7a09
2023-08-07 10:33:18 -04:00
Jerome Jiang fc29b8533e Fix some clang-tidy warnings
- Use zero initializer instead of memset to avoid including <cstring>
 - Include vpx_codec.h for vpx_codec_err_t and error codes
 - Include vpx_tpl.h for VpxTplGopStats

Change-Id: Iac5837ce2173bd945bfe8eeb401ff4dfd04fd2e1
2023-08-04 16:29:10 -04:00
Jerome Jiang f6aaad370d Fix include path for vpx_tpl.h,vpx_ext_ratectrl.h
Bug: b/294049605
Change-Id: I6422fc4250c2192f985cce2e296a19a05934969b
2023-08-04 16:29:06 -04:00
Jerome Jiang 5556ebd894 Merge "vp9_quantize_fp_neon: Same params name as in decl" into main 2023-08-03 19:28:44 +00:00
Jerome Jiang 7dfeaffacc Merge "vp9 ext rc: Add callback for tpl stats" into main 2023-08-03 18:33:32 +00:00
Jerome Jiang 44f2819298 vp9_quantize_fp_neon: Same params name as in decl
Clear some clang-tidy warnings

Change-Id: Iea4c4e77b3d515ec6384bd34875a0002ab13c14c
2023-08-03 14:07:55 -04:00
Jerome Jiang 2f2761c261 vp9 ext rc: Add callback for tpl stats
Added test

Bug: b/294049605
Change-Id: I3967a0f915e1a6e7a0d34d04732c33e1ca6f35e7
2023-08-03 13:16:42 -04:00
Anupam Pandey 6075b1a36f Add test to check bit exactness of C and SIMD in VP9 encoder
This CL adds a shell script to test bit exactness of C and SIMD
VP9 encoder for x86 platform.

As C Vs NEON encoding outputs are not bit-exact (BUG=webm:1809),
ARM tests are currently disabled.

BUG=webm:1800

Change-Id: Iffcc70863e8cf83ccb5bc5be73e8866165697358
2023-08-03 15:03:38 +05:30
Yunqing Wang 2b82efa769 Add a 10-bit test file
Added a 10-bit test file for VP9 end-to-end c vs SIMD bit-
exactness test.

BUG=webm:1800

Change-Id: I4a864f1a740abee27049d68231adf2ec308f9a96
2023-08-01 21:04:09 -07:00
Johann 22818907d2 normalize *const in rtcd
Change-Id: Iece50143b43263c0c8f90299bedd7d2a5b9aa56b
2023-07-29 05:44:56 +09:00
Johann e7a4730fcc remove incorrect (void)
n_coeffs is used in this function

Change-Id: I5f5d2933304bb636a33e0fa294b4526edb65a08d
2023-07-28 20:21:34 +09:00
Johann 7c7ab9165a quantize_fp: reduce parameters
apply similar steps as to the other quantize functions to switch to
macroblock_plane and ScanOrder

Change-Id: I486d653326aaf52ffd3beafd2e891ba6a5d57ef3
2023-07-28 20:13:24 +09:00
Johann 70fc756383 quantize: reduce parameters
Pass macroblock_plane and ScanOrder instead of looking up the values
beforehand. Avoids pushing arguments to the stack.

Change-Id: I22df6f645eb1a1d89ba5a4d9bc58acb77af51aa9
2023-07-28 17:34:43 +09:00
James Zern 4f19de3826 resize_test: prefer 'override' to 'virtual'
Update functions in WRITE_COMPRESSED_STREAM blocks, which are disabled
by default. This caused them to be missed in:
84e6b7ab0 test/*.cc: prefer 'override' to 'virtual'

Change-Id: I0e462263f19c15eb0a30d0c0f4e145062f789489
2023-07-27 17:52:49 -07:00
James Zern 626e37e777 test/*.h: prefer 'override' to 'virtual'
created with clang-tidy --fix --checks=-*,modernize-use-override

Change-Id: I53412f35590799574edb573ae417a4a004cccd1e
2023-07-27 17:52:49 -07:00
James Zern d899b97945 encode_test_driver.h: use bool literal
Change-Id: If47be9ca0daa18d92cb849484f9e139e65e3560e
2023-07-27 17:52:49 -07:00
James Zern d62edaf41f test/**.cc: use bool literals
created with clang-tidy --fix --checks=-*,modernize-use-bool-literals

Change-Id: Ifaed8ca824676555acaf1053b2a5a52c51a70638
2023-07-25 12:18:03 -07:00
James Zern 5740cb3929 test/decode_perf_test.cc: use nullptr
created with clang-tidy --fix --checks=-*,modernize-use-nullptr

Change-Id: Ibf4a80fa00e9b59d471c92788ec4c7c72e4662e5
2023-07-25 12:10:21 -07:00
James Zern 1c2297b2bc test/*.cc: use '= default'
created with clang-tidy --fix --checks=-*,modernize-use-equals-default

Change-Id: Ie373fb5501491fce53479d20f3a6d908c4b7c535
2023-07-25 12:04:57 -07:00
James Zern f9e577cecb Merge changes I71e1b442,Ibbfb949b into main
* changes:
  test/*.cc: prefer 'override' to 'virtual'
  test,AbstractBench: fix -Wnon-virtual-dtor
2023-07-25 18:27:34 +00:00
James Zern 84e6b7ab02 test/*.cc: prefer 'override' to 'virtual'
created with clang-tidy --fix --checks=-*,modernize-use-override

Change-Id: I71e1b4423c143b3e47fe90929ee110b307cdb565
2023-07-24 18:09:31 -07:00
James Zern 5fb280ebb9 test,AbstractBench: fix -Wnon-virtual-dtor
In file included from ../test/bench.cc:14:
../test/bench.h:17:7: warning: 'AbstractBench' has virtual functions but
non-virtual destructor [-Wnon-virtual-dtor]
class AbstractBench {

Change-Id: Ibbfb949b63c8dff936c7ed4f2d056dea0343377b
2023-07-24 17:07:51 -07:00
Jerome Jiang e1c124f896 Add new_mv_count to ext rate control interface
Bug: b/290385227
Change-Id: Ia87c4bf1e9315bf1134c998f88e9d5548c497777
2023-07-24 18:04:58 -04:00
Jerome Jiang 37200b6abb cleanup: _pt -> _ptr in vp9 external RC interface
Change-Id: Ic483488f8f6273e8977cfc324466bda41f1e47a7
2023-07-24 13:08:05 -04:00
James Zern 1d1ee888d3 vp9_rdopt,handle_inter_mode: fix -Wmaybe-uninitialized warning
With gcc 13.1.1

In function ‘handle_inter_mode’,
inlined from ‘vp9_rd_pick_inter_mode_sb’ at
    ../vp9/encoder/vp9_rdopt.c:3872:17:
../vp9/encoder/vp9_rdopt.c:3142:8: warning: ‘tmp_rd’ may be used
    uninitialized [-Wmaybe-uninitialized]
 3142 |     rd = tmp_rd + RDCOST(x->rdmult, x->rddiv, rs, 0);
../vp9/encoder/vp9_rdopt.c: In function ‘vp9_rd_pick_inter_mode_sb’:
../vp9/encoder/vp9_rdopt.c:2846:15: note: ‘tmp_rd’ was declared here
 2846 |   int64_t rd, tmp_rd, best_rd = INT64_MAX;

Change-Id: I8608957cc8bbeb1ae525f3c3dad6fe9785b2a9b4
2023-07-13 09:49:30 -07:00
James Zern 9ad950a9c4 Merge "vp8: remove missing prototypes from the rtcd header" into main 2023-07-11 00:55:30 +00:00
L. E. Segovia e9b9972ca4 vp8: remove missing prototypes from the rtcd header
These were removed in If7a49e920e12f7fca0541190b87e6dae510df05c but
the leftovers can cause a build to fail if the code isn't optimized out.
I just found this out in the Meson port of libvpx for GStreamer.

BUG=webm:1584

Change-Id: I1c953720a2cbec3796200d4ec4020dca0b672bfb
2023-07-10 22:38:24 +00:00
James Zern f30532a6d9 vpx_free_tpl_gop_stats: normalize param name
this fixes a clang-tidy warning

Change-Id: I13f4750c15b7d6a395494c8dbcb896bde125b3c4
2023-07-10 10:06:13 -07:00
James Zern b2c2955c82 Merge "delete some dead code" into main 2023-07-06 17:10:37 +00:00
James Zern dcb91aa3dd mfqe_partition: fix -Wunreachable-code
vp9/common/vp9_mfqe.c|240 col 16| warning: code will never be executed
[-Wunreachable-code]
 BLOCK_SIZE mfqe_bs, bs_tmp;
            ^~~~~~~

Change-Id: I566b20d8c294e19bc4b90b57b730f933048e71a5
2023-06-29 09:52:26 -07:00
Wan-Teh Chang 3ef9934789 Fix a bug in vpx_highbd_hadamard_32x32_neon().
This CL is the highbd version of
https://chromium-review.googlesource.com/c/webm/libvpx/+/4646573.

The bug is caused by the incorrect assumption that
(a / 2) + (b / 2) == (a + b) / 2 and (a / 2) - (b / 2) == (a - b) / 2.

Also fix the Rand() inputs to Hadamard functions in unit tests.

This CL ports the following libaom CLs to libvpx:
https://aomedia-review.googlesource.com/c/aom/+/177101
https://aomedia-review.googlesource.com/c/aom/+/177241

Change-Id: Ic20e7684eab5d6507417fa2b75e572064d37ad2c
2023-06-28 16:09:36 -07:00
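The incorrect assumption named in 3ef9934789 can be checked with plain integers. The sketch below is scalar C with hypothetical helper names, not the Neon code; it only shows that halving each operand first drops information that a single halving of the full-precision result keeps.

```c
#include <stdint.h>

/* Halve each operand first, then combine (the assumed-equivalent form). */
static int32_t halve_then_add(int32_t a, int32_t b) { return (a / 2) + (b / 2); }
static int32_t halve_then_sub(int32_t a, int32_t b) { return (a / 2) - (b / 2); }

/* Combine at full precision, then halve once. */
static int32_t add_then_halve(int32_t a, int32_t b) { return (a + b) / 2; }
static int32_t sub_then_halve(int32_t a, int32_t b) { return (a - b) / 2; }
```

For a = 1 and b = 1 the first form gives 0 and the second gives 1; the two agree whenever both operands are even.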
James Zern dc26707f80 delete some dead code
follow-up to:
3ecba3980 Fix Clang -Wunreachable-code-aggressive warnings

Change-Id: I364312987bc838c69c010cce024bd3d62a918417
2023-06-28 12:26:32 -07:00
James Zern 9598c384bc Merge "Fix Clang -Wunreachable-code-aggressive warnings" into main 2023-06-28 19:21:40 +00:00
James Zern 3ecba39802 Fix Clang -Wunreachable-code-aggressive warnings
Based on the change in libaom:
fe36011455 Fix Clang -Wunreachable-code-aggressive warnings

Clang's -Wunreachable-code-aggressive flag enables several warning flags
such as -Wunreachable-code-break and -Wunreachable-code-return. Chrome's
build system enables -Wunreachable-code-aggressive (in
build/config/compiler/BUILD.gn), so it would be good if libvpx could be
compiled without -Wunreachable-code-aggressive warnings.

This requires the VPX_NO_RETURN macro be defined correctly for all the
compilers we support, otherwise some compilers may warn about missing
return statements after a die() or fatal() call (which does not return).

Change-Id: I0c069133af45a7a61759538b6d74c681ea087dcd
2023-06-28 11:06:50 -07:00
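A minimal sketch of the kind of no-return macro 3ecba39802 relies on, assuming GCC/Clang and MSVC attribute spellings. MY_NO_RETURN, fatal_exit() and parse_digit() are hypothetical names for illustration; the actual VPX_NO_RETURN definition may differ.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a no-return annotation macro. */
#if defined(__GNUC__) || defined(__clang__)
#define MY_NO_RETURN __attribute__((noreturn))
#elif defined(_MSC_VER)
#define MY_NO_RETURN __declspec(noreturn)
#else
#define MY_NO_RETURN /* unknown compiler: no annotation */
#endif

/* Reports a failure on stderr and terminates; never returns. */
static MY_NO_RETURN void fatal_exit(const char *msg) {
  fprintf(stderr, "fatal: %s\n", msg);
  exit(EXIT_FAILURE);
}

static int parse_digit(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  fatal_exit("not a digit");
  /* No return statement here: with the annotation the compiler knows this
   * point is unreachable; without it, it may warn about a missing return. */
}
```

If the macro expands to nothing on a supported compiler, removing "unreachable" return statements trades one warning for another, which is why the definition has to be right everywhere.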
Jerome Jiang 3bd65ac776 vp9 firstpass stats in a separate header
Change-Id: If91c5c74c71affc48eb858beb314a6c194b14023
2023-06-28 11:17:15 -04:00
James Zern 44d6cacec6 Merge changes I1c17302f,Ic084894b,I9867f5fc,Ie3faf7b3,If5dc96b7, ... into main
* changes:
  vp8_decode: fix keyframe resync after decode error
  vp8_decode: only remove threads on thread create failure
  vp8_decode: clear stream info on decoder create failure
  vp9_decodeframe,init_mt: free tile_workers on alloc failure
  vp9_alloccommon: clear allocation sizes on free
  vp9_dx_iface: fix leaks on init_decoder() failure
2023-06-28 00:02:47 +00:00
James Zern 44a5eaa3ba vp8_decode: fix keyframe resync after decode error
This fixes a crash if the application continues to call
vpx_codec_decode(). Previously a non-keyframe could cause a crash if the
decoder failed before fully initializing due to an allocation failure.
The stream info and frame resolution would be 0, skipping an allocation.

Found with vpx_dec_fuzzer_vp8 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: I1c17302f4d3a488ba3b4eefe0bf53853dc558bc1
2023-06-27 12:58:28 -07:00
James Zern a166c52d3a vp8_decode: only remove threads on thread create failure
This fixes a crash if the application continues to call
vpx_codec_decode(). Previously the decoder instance would be freed,
causing a crash when attempting to access it with restart_threads=1.

Found with vpx_dec_fuzzer_vp8 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: Ic084894b776729bb1572f747082cef002f0832a8
2023-06-26 19:23:58 -07:00
James Zern 263ddc9e38 vp8_decode: clear stream info on decoder create failure
This fixes a crash if the application continues to call
vpx_codec_decode().

Found with vpx_dec_fuzzer_vp8 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: I9867f5fc3d1163026f521a9609d3cbbc00568d1d
2023-06-26 19:23:41 -07:00
James Zern a31e818ef8 vp9_decodeframe,init_mt: free tile_workers on alloc failure
This avoids a crash if any of the thread allocations fail and the
application continues to call vpx_codec_decode(). Previously
num_tile_workers would be non-zero, but not equal to num_threads, which
would cause a crash during later thread management.

Found with vpx_dec_fuzzer_vp9 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: Ie3faf7b36764aebedac0924acb6e4cb7545aec7d
2023-06-26 19:20:41 -07:00
James Zern 02ab555e99 vp9_alloccommon: clear allocation sizes on free
This fixes reallocations (and avoids potential crashes) if any
allocation fails and the application continues to call
vpx_codec_decode().

Found with vpx_dec_fuzzer_vp9 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: If5dc96b73c02efc94ec84c25eb50d10ad6b645a6
2023-06-26 19:15:30 -07:00
James Zern 885ecc7c66 vp9_dx_iface: fix leaks on init_decoder() failure
If any allocations fail in init_decoder() and the application continues
to call vpx_codec_decode() some of the allocations would be orphaned or
the decoder would be left in a partially initialized state.

Found with vpx_dec_fuzzer_vp9 & Nallocfuzz
(https://github.com/catenacyber/nallocfuzz).

Bug: webm:1807
Change-Id: I44f662526d715ecaeac6180070af40672cd42611
2023-06-26 19:14:14 -07:00
Wan-Teh Chang 19f3a754d6 Fix a bug in vpx_hadamard_32x32_neon()
A right shift by 2 is equivalent to two halving operations if there is
no addition or subtraction between the two halving operations.

Note: Since vhaddq_s16() and vhsubq_s16() have 17-bit intermediate
precision, the Neon code doesn't need to go to int32_t as was done in
https://chromium-review.googlesource.com/c/webm/libvpx/+/4604169.

Change-Id: Ibe0691cde0fd3b94ee7c497845ba459d30d503b0
2023-06-26 15:48:03 -07:00
James Zern 14e52008ed Merge "configure.sh: Improve a comment." into main 2023-06-20 20:06:32 +00:00
Yunqing Wang 74e8c77425 Merge "Remove vp9_diamond_search_sad_avx function" into main 2023-06-20 16:34:58 +00:00
Anupam Pandey 80d4172f07 Remove vp9_diamond_search_sad_avx function
This CL removes the AVX version of the vp9_diamond_search_sad function,
as it shows no speedup over the C version.

Change-Id: Ife6005d8e444ea2c8d07ac0f686c840344b9e0ea
2023-06-19 16:05:12 +05:30
Chen Wang af40910197 configure.sh: Improve a comment.
The corresponding case block is not only for ARM.
The original comment text could confuse readers.

Test: N/A, just comment text changes.

Change-Id: I3154d18d3b3d237c1eecfe07dc7ec237c98194cf
Signed-off-by: Chen Wang <wangchen20@iscas.ac.cn>
2023-06-17 00:42:17 +00:00
Jerome Jiang 8cee267d3d Add new_mv_count to firstpass stats
Mostly follows the logic of how it's calculated in libaom.

Bug: b/287283080
Change-Id: I9ee67d844ef9db7cca63339b5304459eaa28d324
2023-06-16 14:13:29 -04:00
Yunqing Wang 8789421bf3 Merge "Fix c vs intrinsic mismatch of vpx_hadamard_32x32() function" into main 2023-06-12 16:44:21 +00:00
Jerome Jiang bdb8ccc0af RTC RC: clean up unnecessary headers
Change-Id: I77c407be59f4eb0c70a89a5fffd88c648e634123
2023-06-09 15:33:39 -04:00
Anupam Pandey 8c308aefea Fix c vs intrinsic mismatch of vpx_hadamard_32x32() function
This CL resolves the mismatch between C and intrinsic implementation
of vpx_hadamard_32x32 function. The mismatch was due to integer
overflow during the addition operation in the intrinsic functions.
Specifically, the addition in the intrinsic function was performed
at the 16-bit level, while the calculation of a0 + a1 resulted in
a 17-bit value.

This code change addresses the problem by performing
the addition at the 32-bit level (with sign extension) in both SSE2
and AVX2, and then converting the results back to the 16-bit level
after a right shift.

STATS_CHANGED

Change-Id: I576ca64e3b9ebb31d143fcd2da64322790bc5853
2023-06-09 17:15:37 +05:30
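The overflow described in 8c308aefea can be reproduced in scalar C. This is a sketch with hypothetical helper names, not the SSE2/AVX2 code: wrap_to_int16() imitates a 16-bit lane that wraps on overflow, and the right shift of a negative value assumes an arithmetic shift, as mainstream compilers provide.

```c
#include <stdint.h>

/* Reduce a value to a signed 16-bit lane, wrapping like a 16-bit SIMD add. */
static int32_t wrap_to_int16(int32_t v) {
  uint32_t u = (uint32_t)v & 0xFFFFu;
  return u >= 0x8000u ? (int32_t)u - 0x10000 : (int32_t)u;
}

/* Add in a 16-bit lane, then shift: a 17-bit sum wraps before the shift. */
static int32_t add_shift_narrow(int16_t a0, int16_t a1, int shift) {
  return wrap_to_int16((int32_t)a0 + a1) >> shift;
}

/* Sign-extend to 32 bits, add, shift; the result fits in 16 bits again. */
static int32_t add_shift_wide(int16_t a0, int16_t a1, int shift) {
  return ((int32_t)a0 + a1) >> shift;
}
```

With a0 = a1 = 20000 and a shift of 2, the wide path gives 10000 while the narrow path gives -6384; for small inputs such as 100 and 200 the two agree.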
Jerome Jiang 2245df50a6 Replace NONE with NO_REF_FRAME
NONE is a common name and it has conflicts with symbols defined in
Chromium.

Bug: b/286163500
Change-Id: I3d935a786f771a4d90b258fabc6fd6c2ecbf1c59
2023-06-08 11:08:23 -04:00
Jerome Jiang da02ccde30 Merge "Fix more typos (n/n)" into main 2023-06-08 14:11:24 +00:00
Jerome Jiang fb6aebcbbf Merge "Fix more typos (3/n)" into main 2023-06-07 21:10:04 +00:00
Jerome Jiang d42b7fd661 Fix more typos (n/n)
impace -> impact
taget -> target
prediciton -> prediction
addtion -> addition
the the -> the

Bug: webm:1803
Change-Id: I759c9d930a037ca69662164fcd6be160ed707d77
2023-06-07 16:41:18 -04:00
Jerome Jiang 6a8eb04fec Fix more typos (3/n)
Propogation -> Propagation
propogate -> propagate
cant -> can't
upto -> up to
canddiates -> candidates
refernce -> reference
USEAGE -> USAGE

Change-Id: Iadaf2dffd86b54e04411910f667e8c2dfc6c4c77
2023-06-07 15:46:33 -04:00
Jerome Jiang da9735a37e Merge "Fix more typos (2/n)" into main 2023-06-07 19:10:43 +00:00
Jerome Jiang aafa55cc3f Merge "Fix more typos (1/n)" into main 2023-06-07 19:10:36 +00:00
Jerome Jiang 1c95f5d17b Merge "Fix a few typos" into main 2023-06-07 18:19:08 +00:00
Jerome Jiang ffb9345109 Fix more typos (2/n)
kernal -> kernel
e.g -> e.g.
paritioning -> partitioning
partioning -> partitioning
coefficents -> coefficients
i.e, -> i.e.,
equivalend -> equivalent
recive -> receive
resoultions -> resolutions

Bug: webm:1803
Change-Id: I1d6176202ee5daee7a64bf59114e8b304aeb4db7
2023-06-07 13:09:58 -04:00
Jerome Jiang ad14a32b33 Fix more typos (1/n)
Dont -> Don't
setings -> settings
thresold -> thresh
thresold -> threshold
becasue -> because
itterations -> iterations
its a -> it's a
an constant -> a constant

Bug: webm:1803
Change-Id: I1e019393939ed25c59c898c88d4941ec360b026d
2023-06-07 13:09:39 -04:00
Jerome Jiang bcd491a6be Fix a few typos
segement -> segment
dont -> don't
useage -> usage
devide -> divide

Bug: webm:1803
Change-Id: I0153380b0003825c4b62cf323d4f2bc837c8a264
2023-06-07 12:39:07 -04:00
Deepa K G e510716d7e Add comments in vp9_diamond_search_sad_avx()
Added comments related to re-arranging the
elements of the SAD vector to find the
minimum.

Change-Id: I58b702d304a6cdd32f04775fba603e39c19a8947
2023-06-06 14:35:14 +05:30
Deepa K G 7b66c730a2 Fix c vs avx mismatch of diamond_search_sad()
In the function vp9_diamond_search_sad_avx(), arranged
the cost vector in a specific order. This ensures that
the motion vector with the least index is selected,
when there exists more than one candidate motion
vector with the minimum cost, thus resolving the
c vs avx mismatch.

STATS_CHANGED

Change-Id: I4f8864f464f9ea2aae6250db3d8ad91cb08b26e2
2023-06-05 12:48:24 +05:30
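The tie-breaking rule in 7b66c730a2 (pick the least index among equal minimum costs) is what a plain scalar loop does when it compares strictly. The sketch below uses hypothetical names and does not reproduce the AVX reordering; it only contrasts the two comparison choices that lead to a mismatch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index of the minimum cost; on ties the earliest index wins (strict <). */
static size_t argmin_first(const uint32_t *cost, size_t n) {
  assert(cost != NULL && n > 0);
  size_t best = 0;
  for (size_t i = 1; i < n; ++i) {
    if (cost[i] < cost[best]) best = i;
  }
  return best;
}

/* Same search, but <= lets a later equal cost replace the earlier one. */
static size_t argmin_last(const uint32_t *cost, size_t n) {
  assert(cost != NULL && n > 0);
  size_t best = 0;
  for (size_t i = 1; i < n; ++i) {
    if (cost[i] <= cost[best]) best = i;
  }
  return best;
}
```

For costs {5, 3, 7, 3} the first function returns index 1 and the second index 3, so two implementations that disagree on this detail select different motion vectors.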
Jerome Jiang 575bd73f61 Merge "Trim tpl stats by 2 extra frames" into main 2023-05-31 19:31:04 +00:00
Jerome Jiang 1aff4a5655 Trim tpl stats by 2 extra frames
Not applicable to the last GOP.

Bug: b/284162396
Change-Id: I55b7e04e9fc4b68a08ce3e00b10743823c828954
2023-05-31 14:26:49 -04:00
James Zern 60ee1b149b Merge changes I6a906803,I0307a3b6 into main
* changes:
  Optimize Neon implementation of vpx_int_pro_row
  Optimize Neon implementation of vpx_int_pro_col
2023-05-31 17:44:00 +00:00
Jonathan Wright c36aa2e9c4 Optimize Neon implementation of vpx_int_pro_row
Double the number of accumulator registers to remove the bottleneck.
Also peel the first loop iteration.

Change-Id: I6a90680369f9c33cdfe14ea547ac1569ec3f50de
2023-05-31 14:34:43 +01:00
Jonathan Wright c738e87f27 Optimize Neon implementation of vpx_int_pro_col
Use widening pairwise addition instructions to halve the number of
additions required.

Change-Id: I0307a3b65e50d2b1ae582938bc5df9c2b21df734
2023-05-31 14:30:02 +01:00
James Zern ad5677eafc Merge changes Ia3647698,I55caf34e,Id2c60f39 into main
* changes:
  vpx_dsp_common.h,clip_pixel: work around VS2022 Arm64 issue
  fdct_partial_neon.c: work around VS2022 Arm64 issue
  fdct8x8_test.cc: work around VS2022 Arm64 issue
2023-05-25 04:54:09 +00:00
James Zern 47fa9804b2 Merge "examples.mk,vpxdec: rm libwebm muxer dependency" into main 2023-05-24 17:43:20 +00:00
Jerome Jiang 31c07211ba Merge "Add IO for TPL stats" into main 2023-05-24 16:27:20 +00:00
James Zern 25f2e1ef25 vpx_dsp_common.h,clip_pixel: work around VS2022 Arm64 issue
cl.exe targeting AArch64 with optimizations enabled
produces invalid code for clip_pixel() when the return type is uint8_t.
See:
https://developercommunity.visualstudio.com/t/Misoptimization-for-ARM64-in-VS-2022-17/10363361

Bug: b/277255076
Bug: webm:1788
Change-Id: Ia3647698effd34f1cf196cd33fa4a8cab9fa53d6
2023-05-23 15:52:09 -07:00
James Zern 95b56ab7df fdct_partial_neon.c: work around VS2022 Arm64 issue
cl.exe targeting AArch64 with optimizations enabled
will fail with an internal compiler error.
See:
https://developercommunity.visualstudio.com/t/Compiler-crash-C1001-when-building-a-for/10346110

Bug: b/277255076
Bug: webm:1788
Change-Id: I55caf34e910dab47a7775f07280677cdfe606f5b
2023-05-23 15:52:05 -07:00
James Zern 62d09a3e94 fdct8x8_test.cc: work around VS2022 Arm64 issue
cl.exe targeting AArch64 with optimizations enabled
produces invalid code in RunExtremalCheck() and RunInvAccuracyCheck().
See:
https://developercommunity.visualstudio.com/t/1770-preview-1:-Misoptimization-for-AR/10369786

Bug: b/277255076
Bug: webm:1788
Change-Id: Id2c60f3948d8f788c78602aea1b5232133415dea
2023-05-23 15:51:56 -07:00
Jerome Jiang d45cc8edda Add IO for TPL stats
Overload TempOutFile constructor to allow IO mode.

Bug: b/281563704

Change-Id: I1f4f5b29db0e331941b6795e478eeeab51f625ad
2023-05-23 17:58:01 -04:00
Jerome Jiang 7a5f328a66 Merge "Add new vpx_tpl.h API file" into main 2023-05-18 17:20:03 +00:00
Yunqing Wang 4bbdd6b046 Merge "Improve convolve AVX2 intrinsic for speed" into main 2023-05-18 15:48:49 +00:00
Jerome Jiang 7e7a1706e3 Add new vpx_tpl.h API file
New file (vpx_tpl.c) in the following CLs will add new APIs dealing with
TPL stats from VP9 encoder.

Change-Id: I5102ef64214cba1ca6ecea9582a19049666c6ca4
2023-05-17 20:43:35 -04:00
Anupam Pandey e6b9a8d667 Improve convolve AVX2 intrinsic for speed
This CL refactors the code related to convolve function.
Furthermore, improved the AVX2 intrinsic to compute
convolve vertical for w = 4 case, and convolve horiz for
w = 16 case.

Please note the module level scaling w.r.t C function
(timer based) for existing (AVX2) and new AVX2 intrinsics:

Block        Scaling
size    AVX2         AVX2
        (existing)   (new)
4x4      5.34x       5.91x
4x8      7.10x       7.79x
16x8    23.52x      25.63x
16x16   29.47x      30.22x
16x32   33.42x      33.44x

This is a bit exact change.

Change-Id: If130183bc12faab9ca2bcec0ceeaa8d0af05e413
2023-05-17 14:24:34 +05:30
James Zern 99522d307c Merge changes Ie77ad184,Idfcac43c into main
* changes:
  Add 2D-specific Neon horizontal convolution functions
  Refactor standard bitdepth Neon convolution functions
2023-05-16 00:05:05 +00:00
Jonathan Wright 3e1e38d117 Add 2D-specific Neon horizontal convolution functions
2D 8-tap convolution filtering is performed in two passes -
horizontal and vertical. The horizontal pass must produce enough
input data for the subsequent vertical pass - 3 rows above and 4 rows
below, in addition to the actual block height.

At present, all Neon horizontal convolution algorithms process 4 rows
at a time, but this means we end up doing at least 1 row too much
work in the 2D first pass case where we need h + 7, not h + 8 rows of
output.

This patch adds additional dot-product (SDOT and USDOT) Neon paths
that process h + 7 rows of data exactly, saving the work of the
unnecessary extra row. It is impractical to take a similar approach
for the Armv8.0 MLA paths since we have to transpose the data block
both before and after calling the convolution helper functions.

vpx_convolve_neon performance impact: we observe a speedup of ~9% for
smaller (and wider) blocks, and a speedup of 0-3% for larger blocks.
This is to be expected since the proportion of redundant work
decreases as the block height increases.

Change-Id: Ie77ad1848707d2d48bb8851345a469aae9d097e1
2023-05-13 20:43:20 +01:00
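The row counts in 3e1e38d117 follow from simple arithmetic, sketched here in C with hypothetical helper names: an 8-tap vertical filter needs taps - 1 = 7 extra rows, and a horizontal pass that works in groups of 4 rows rounds that total up.

```c
#include <assert.h>

/* Rows the horizontal pass must produce for an h-row block. */
static int rows_for_vertical_pass(int h, int taps) {
  assert(h > 0 && taps > 0);
  return h + taps - 1; /* 8 taps: 3 rows above + 4 rows below = 7 */
}

/* Rows actually computed when processing `group` rows at a time. */
static int rows_in_groups(int rows, int group) {
  assert(rows > 0 && group > 0);
  return (rows + group - 1) / group * group;
}
```

For h = 4 the vertical pass needs 11 rows but 12 are computed in groups of 4; for h = 64 it is 71 versus 72, so the share of redundant work shrinks as the block height grows.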
James Zern 8adf1be644 Merge "Don't use -Wl,-z,defs with Clang's sanitizers" into main 2023-05-12 19:23:47 +00:00
James Zern 2a9b810d3d Don't use -Wl,-z,defs with Clang's sanitizers
This avoids link errors related to the sanitizers:
https://clang.llvm.org/docs/AddressSanitizer.html#usage
"When linking shared libraries, the AddressSanitizer run-time is not
linked, so -Wl,-z,defs may cause link errors ..."

See also:
https://crbug.com/aomedia/3438

Bug: webm:1801
Fixed: webm:1801
Change-Id: Ie212318005a5f7222e5486775175534025306367
2023-05-12 10:20:54 -07:00
Jonathan Wright 8ecf584321 Refactor standard bitdepth Neon convolution functions
1) Use #define constant instead of magic numbers for right shifts.
2) Move saturating narrow into helper functions that return 4-element
   result vectors.
3) Use mem_neon.h helpers for load/store sequences in Armv8.0 paths.
4) Tidy up: assert conditions and some longer variable names.
5) Prefer != 0 to > 0 where possible for loop termination conditions.

Change-Id: Idfcac43ca38faf729dca07b8cc8f7f45ad264d24
2023-05-12 14:53:51 +01:00
James Zern 9e0fc37f6f configure: add -Wshadow
libraries under third_party/ are out of scope for this change.

Bug: webm:1793
Change-Id: I562065a3c0ea9fdfc9615d1a6b1ae47da79b8ce0
2023-05-09 14:04:19 -07:00
James Zern 894262fb8f Merge "vp8_macros_msa.h: clear -Wshadow warnings" into main 2023-05-09 21:03:31 +00:00
James Zern bf5facce39 Merge changes Iac020280,I8ca8660a into main
* changes:
  gen_msvs_vcxproj: add ARM64EC w/VS >= 2022
  configure: add clang-cl vs1[67] arm64 targets
2023-05-09 20:55:55 +00:00
Yunqing Wang cc1b3886f2 Merge "Add AVX2 intrinsic for vpx_comp_avg_pred() function" into main 2023-05-09 15:57:09 +00:00
Anupam Pandey 457b7f5986 Add AVX2 intrinsic for vpx_comp_avg_pred() function
The module level scaling w.r.t C function (timer based) for
existing (SSE2) and new AVX2 intrinsics:

If ref_padding = 0
Block     Scaling
size    SSE2    AVX2
8x4     3.24x   3.24x
8x8     4.22x   4.90x
8x16    5.91x   5.93x
16x8    1.63x   3.52x
16x16   1.53x   4.19x
16x32   1.38x   4.82x
32x16   1.28x   3.08x
32x32   1.45x   3.13x
32x64   1.38x   3.04x
64x32   1.39x   2.12x
64x64   1.46x   2.24x

If ref_padding = 8
Block     Scaling
size    SSE2    AVX2
8x4     3.20x   3.21x
8x8     4.61x   4.83x
8x16    5.50x   6.45x
16x8    1.56x   3.35x
16x16   1.53x   4.19x
16x32   1.37x   4.83x
32x16   1.28x   3.07x
32x32   1.46x   3.29x
32x64   1.38x   3.22x
64x32   1.38x   2.14x
64x64   1.38x   2.12x

This is a bit-exact change.

Change-Id: I72c5d155f64d0c630bc8c3aef21dc8bbd045d9e6
2023-05-09 16:33:59 +05:30
James Zern fbbe1d0115 vp8_macros_msa.h: clear -Wshadow warnings
Bug: webm:1793
Change-Id: Ia940b06bd23a915a050432e03bb630567e891d8d
2023-05-08 21:44:32 -07:00
James Zern 19ec57e149 Merge "README: update target list" into main 2023-05-08 20:52:52 +00:00
James Zern 2108b7a26f Merge changes Ie165d410,I6d9bb8da,I6858e574 into main
* changes:
  vp8_[cd]x_iface: clear setjmp flag on function exit
  vp9_decodeframe,tile_worker_hook: relocate setjmp=1
  vp9,encoder_set_config: set setjmp flag after setjmp()
2023-05-08 20:52:31 +00:00
Jerome Jiang f5f3a64862 Merge "Add VpxTplGopStats" into main 2023-05-08 19:47:30 +00:00
Jerome Jiang 7ab013f8a9 Merge "Unify implementation of CHECK_MEM_ERROR" into main 2023-05-08 19:47:21 +00:00
Jerome Jiang 62cd0c9c3e Merge "CHECK_MEM_ERROR to return in vp9_set_roi_map" into main 2023-05-08 19:46:44 +00:00
James Zern 3916e0e130 gen_msvs_vcxproj: add ARM64EC w/VS >= 2022
rather than define new targets, add a platform to the arm64 list as they
share the same configuration.

Bug: webm:1788
Change-Id: Iac020280b1103fb12b559f21439aeff26568fba4
2023-05-08 10:53:21 -07:00
James Zern 3fe1365884 configure: add clang-cl vs1[67] arm64 targets
x86 and armv7 are skipped for now as the intrinsics will need different
flags than cl.exe (/arch:... -> -m...).

Bug: webm:1788
Change-Id: I8ca8660a8644cdd84c51cb1f75005e371ba8207d
2023-05-08 10:53:21 -07:00
Jerome Jiang 745c6392f7 Add VpxTplGopStats
Contains the size of GOP - also the size of the list of TPL stats for
each frame in this GOP.

VpxTplGopStats will be the unit for VP9E_GET_TPL_STATS control to return
TPL stats from the encoder.

Bug: b/273736974
Change-Id: I1682242fc6db4aafcd6314af023aa0d704976585
2023-05-08 13:27:26 -04:00
Jerome Jiang 1710c9282a Unify implementation of CHECK_MEM_ERROR
There were multiple implementations of CHECK_MEM_ERROR across the
library that take different arguments and used in different places.

This CL will unify them and have only one implementation that takes
vpx_internal_error_info.

Change-Id: I2c568639473815bc00b1fc2b72be56e5ccba1a35
2023-05-08 13:27:24 -04:00
Jerome Jiang 75f9551efb CHECK_MEM_ERROR to return in vp9_set_roi_map
Also change the return type of vp9_set_roi_map to vpx_codec_err_t

Change-Id: I60d9ff45f2d3dfc44cd6e2aab2cb1ba389ff15f3
2023-05-08 13:25:36 -04:00
James Zern b14d20b470 examples.mk,vpxdec: rm libwebm muxer dependency
vpxdec only requires the parser.

Change-Id: I54ead453d4af400ca5c3412a3211d6d0b1383046
2023-05-06 15:48:58 -07:00
James Zern 4818f997fe Merge "vp9_encoder: clear -Wshadow warning" into main 2023-05-06 02:26:55 +00:00
James Zern 3d57fb69af README: update target list
Change-Id: If2d5811a55f6bb60eeba7d28b69c78157a17e87f
2023-05-05 19:12:27 -07:00
Jerome Jiang 905f991acd Merge "Set setjmp flag in VP9 RTC rate control library" into main 2023-05-05 23:02:14 +00:00
Jerome Jiang 5636f098b3 Set setjmp flag in VP9 RTC rate control library
Change-Id: Ic5ec8dc7d9637091d4137a47d793cf29e76fdc45
2023-05-05 15:41:33 -04:00
James Zern 497f246d29 sixtap_filter_msa.c: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I5f9c09f31b06fecc123c6a9d01f5fbed39142356
2023-05-05 12:26:13 -07:00
James Zern 662af59716 Merge "macros_msa.h: clear -Wshadow warnings" into main 2023-05-05 19:15:29 +00:00
James Zern 851a76ff65 vp8_[cd]x_iface: clear setjmp flag on function exit
in vp8e_encode, also move the setjmp() call closer to where the flag is
set.

Change-Id: Ie165d4100b84776f9c34eddcf64657bd78cce4f5
2023-05-05 11:18:08 -07:00
James Zern eb7014c80c vp9_decodeframe,tile_worker_hook: relocate setjmp=1
after the call to setjmp(); this is more correct and consistent with
other code.

Change-Id: I6d9bb8daad6a959bfe4f25484f9d6664b99da19e
2023-05-05 11:03:19 -07:00
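The ordering that eb7014c80c calls more correct can be sketched in C: the flag that permits longjmp() is set only after setjmp() has filled the jump buffer, and cleared again on every exit path. The struct and function names below are hypothetical and only loosely modeled on an error-info structure; this is not the libvpx code.

```c
#include <setjmp.h>

/* Hypothetical error context: `armed` says whether `jmp` is valid. */
struct error_ctx {
  int armed;
  int error_code;
  jmp_buf jmp;
};

/* Records the error and jumps back only if a handler has been armed. */
static void report_error(struct error_ctx *e, int code) {
  e->error_code = code;
  if (e->armed) longjmp(e->jmp, 1);
}

static int run_guarded(struct error_ctx *e, int should_fail) {
  if (setjmp(e->jmp)) {
    e->armed = 0; /* arrived here via longjmp(): disarm and fail */
    return -1;
  }
  e->armed = 1; /* set only after setjmp() has initialized e->jmp */
  if (should_fail) report_error(e, 42);
  e->armed = 0; /* clear the flag on normal exit as well */
  return 0;
}
```

Setting the flag before setjmp() would leave a window in which an error could jump through an uninitialized buffer; leaving it set after return would let a later error jump into a stack frame that no longer exists.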
James Zern b030d033b8 vp9,encoder_set_config: set setjmp flag after setjmp()
Change-Id: I6858e574d24aaff64f725404706f58e04e43717d
2023-05-05 11:01:50 -07:00
James Zern e2f217c075 Merge changes I8089e90a,I46890224,I1b0e090d into main
* changes:
  Overwrite cm->error->detail before freeing
  Have vpx_codec_error take const vpx_codec_ctx_t *
  Add comments about vpx_codec_enc_init_ver failure
2023-05-05 17:26:11 +00:00
James Zern b3920105c3 Merge "vpx_subpixel_8t_intrin_avx2,cosmetics: shorten long comment" into main 2023-05-05 16:47:28 +00:00
James Zern 28c5d70650 vp9_encoder: clear -Wshadow warning
with --enable-experimental --enable-rate-ctrl

Bug: webm:1793
Change-Id: I9ca664538bcf0c2aca8aea73283bbb0232eb86e9
2023-05-05 09:46:53 -07:00
James Zern c85b7331a5 macros_msa.h: clear -Wshadow warnings
Bug: webm:1793
Change-Id: Ib2e3bd3c52632cdd4410cb2c54d69750e64e5201
2023-05-05 09:25:52 -07:00
Yunqing Wang 17f1a23f55 Merge "Add AVX2 intrinsic for idct16x16 and idct32x32 functions" into main 2023-05-05 15:33:21 +00:00
Anupam Pandey 255ee18885 Add AVX2 intrinsic for idct16x16 and idct32x32 functions
Added AVX2 intrinsic optimization for the following functions
1. vpx_idct16x16_256_add
2. vpx_idct32x32_1024_add
3. vpx_idct32x32_135_add

The module level scaling w.r.t C function (timer based) for
existing (SSE2) and new AVX2 intrinsics:
                            Scaling
   Function Name         SSE2      AVX2
vpx_idct32x32_1024_add  3.62x     7.49x
vpx_idct32x32_135_add   4.85x     9.41x
vpx_idct16x16_256_add   4.82x     7.70x

This is a bit-exact change.

Change-Id: Id9dda933aa1f5093bb6b35ac3b8a41846afca9d2
2023-05-05 15:55:16 +05:30
Wan-Teh Chang 3d6b86e704 Overwrite cm->error->detail before freeing
Help detect use after free of the return value of
vpx_codec_error_detail(). If vpx_codec_error_detail() is called after
vpx_codec_encode() fails, the return value may be equal to
cm->error->detail, which is freed when vpx_codec_destroy() is called.

Document the lifetime of the string returned by
vpx_codec_error_detail().

Change-Id: I8089e90a4499b4f3cc5b9cfdbb25d72368faa319
2023-05-04 22:08:21 -07:00
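The lifetime rule documented in 3d6b86e704 suggests a caller-side pattern: copy the detail string while the context is still alive. The sketch below uses a mock context rather than the real libvpx API. `mock_ctx`, `mock_error_detail`, `mock_destroy` and `save_detail` are all hypothetical names, and the `'X'` fill value is an assumption, since the commit message does not say what the buffer is overwritten with.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a codec context that owns its error detail. */
struct mock_ctx {
  char *detail;
};

/* Getter: the returned pointer is only valid until mock_destroy(). */
static const char *mock_error_detail(const struct mock_ctx *ctx) {
  return ctx->detail;
}

/* Scribble over the string before freeing it, so a stale reader is more
 * likely to see obviously wrong data. An optimizing compiler may drop
 * this store; a real implementation would need to account for that. */
static void mock_destroy(struct mock_ctx *ctx) {
  if (ctx->detail != NULL) {
    memset(ctx->detail, 'X', strlen(ctx->detail));
    free(ctx->detail);
    ctx->detail = NULL;
  }
}

/* Copy the detail into a caller-owned buffer while ctx is still alive. */
static void save_detail(const struct mock_ctx *ctx, char *buf,
                        size_t buf_size) {
  const char *detail = mock_error_detail(ctx);
  assert(buf != NULL && buf_size > 0);
  if (detail == NULL) detail = "";
  strncpy(buf, detail, buf_size - 1);
  buf[buf_size - 1] = '\0';
}
```

After `save_detail()` the caller can destroy the context and still log the copied message safely.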
Wan-Teh Chang 8e47341b0e Have vpx_codec_error take const vpx_codec_ctx_t *
Also have vpx_codec_error_detail take const vpx_codec_ctx_t *. Both functions
are getter functions that don't modify the codec context.

Change-Id: I4689022425efbf7b1da5034255ac052fce5e5b4f
2023-05-04 22:08:21 -07:00
Wan-Teh Chang 601a98b154 Add comments about vpx_codec_enc_init_ver failure
Address the questions:
1. If vpx_codec_enc_init_ver() fails, should I still call
   vpx_codec_destroy() on the encoder context?
2. Is it safe to call vpx_codec_error_detail() when
   vpx_codec_enc_init_ver() failed?

Change-Id: I1b0e090d11dd9f853fe203f4cbb6080c3c7b0506
2023-05-04 22:08:21 -07:00
James Zern 4e23e7abfe vpx_subpixel_8t_intrin_avx2,cosmetics: shorten long comment
Change-Id: I8badedc2ad07d60896e45de28b707ad9f6c4d499
2023-05-04 17:17:10 -07:00
Jerome Jiang 3580bc559a Merge "Add num_blocks to VpxTplFrameStats" into main 2023-05-04 18:06:10 +00:00
Jerome Jiang bd3a5ae3ea Merge "Add Vpx* prefix to Tpl{Block,Frame}Stats" into main 2023-05-04 18:00:51 +00:00
Chi Yo Tsai 4379041094 Merge changes I226215a2,Ia4918eb0,If6219446,Ibf00a6e1,I900a0a48 into main
* changes:
  Fix mismatched param names in vpx_dsp/x86/sad4d_avx2.c
  Fix mismatched param names in vpx_dsp/arm/highbd_sad4d_neon.c
  Fix mismatched param names in vpx_dsp/arm/sad4d_neon.c
  Fix mismatched param names in vpx_dsp/arm/highbd_avg_neon.c
  Fix clang warning on const-qualification of parameters
2023-05-04 17:04:17 +00:00
Jerome Jiang 2e5261647f Add num_blocks to VpxTplFrameStats
I realized the calculation of the size of the list of VpxTplBlockStats
is non-trivial. So it's better to add the field for the size.

Bug: b/273736974
Change-Id: Ic1b50597c1f89a8f866b5669ca676407be6dc9d8
2023-05-04 10:59:46 -04:00
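The motivation in 2e5261647f is that consumers should not have to derive the length of the block list themselves. A rough sketch of a consumer that relies on an explicit count follows. The type names and layout here are hypothetical and do not reproduce the real `VpxTplFrameStats` definition; only the idea of a `num_blocks` field next to the list comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-block stats; field names are assumptions. */
typedef struct {
  int64_t recrf_dist;
  int64_t recrf_rate;
} SketchBlockStats;

/* Hypothetical per-frame stats carrying an explicit element count. */
typedef struct {
  int num_blocks;
  SketchBlockStats *block_stats_list;
} SketchFrameStats;

/* Sum the rate over all blocks using the stored count, with no need to
 * recompute the list length from frame and block dimensions. */
static int64_t total_recrf_rate(const SketchFrameStats *frame) {
  int64_t total = 0;
  int i;
  assert(frame != NULL && frame->num_blocks >= 0);
  for (i = 0; i < frame->num_blocks; ++i) {
    total += frame->block_stats_list[i].recrf_rate;
  }
  return total;
}
```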
Jerome Jiang f059f9ee2d Add Vpx* prefix to Tpl{Block,Frame}Stats
This is to avoid symbol redefinition when integrating with other
libraries.

Bug: b/273736974
Change-Id: I891af78b1907504d5bb9f735164aea18c2aba944
2023-05-04 10:33:07 -04:00
James Zern 4dd3afc00e Merge changes I4d26f5f8,I12e25710 into main
* changes:
  s/__aarch64__/VPX_ARCH_AARCH64/
  configure: add aarch64 to ARCH_LIST
2023-05-04 02:16:12 +00:00
Jerome Jiang 69d5d16552 Merge "Add codec control to export TPL stats" into main 2023-05-04 01:55:41 +00:00
Jerome Jiang de45e4b612 Add codec control to export TPL stats
new codec control: VP9E_GET_TPL_STATS with unit test

Bug: b/273736974
Change-Id: I27343bd3f6dffafc86925234537bcdb557bc4079
2023-05-03 19:16:24 -04:00
chiyotsai 2c03388231 Fix mismatched param names in vpx_dsp/x86/sad4d_avx2.c
Change-Id: I226215a2ff8798b72abe0c2caf3d18875595caa5
2023-05-03 14:45:13 -07:00
chiyotsai 174e782fe5 Fix mismatched param names in vpx_dsp/arm/highbd_sad4d_neon.c
Change-Id: Ia4918eb0bac3b28b27e1ef205b9171680b2eb9a4
2023-05-03 14:44:08 -07:00
chiyotsai 701392c1b0 Fix mismatched param names in vpx_dsp/arm/sad4d_neon.c
Change-Id: If621944684cf9bb9f353db5961ed8b4b4ae38f24
2023-05-03 14:12:41 -07:00
chiyotsai 8782fd070d Fix mismatched param names in vpx_dsp/arm/highbd_avg_neon.c
Change-Id: Ibf00a6e1029284e637b10ef01ac9b31ffadc74ca
2023-05-03 14:12:19 -07:00
chiyotsai 3dbadd1b83 Fix clang warning on const-qualification of parameters
Change-Id: I900a0a48dde5fcb262157b191ac536e18269feb3
2023-05-03 14:12:12 -07:00
James Zern a398b60d6c fdct8x8_test: EXPECT_* -> ASSERT_*
This avoids unnecessary logging when a block has multiple errors.

Change-Id: If0f3e6f8ff5bd284655f7cabfd23c253c93d44c5
2023-05-03 10:09:03 -07:00
James Zern 57b9afa58f s/__aarch64__/VPX_ARCH_AARCH64/
This allows AArch64 to be correctly detected when building with Visual
Studio (cl.exe) and fixes a crash in vp9_diamond_search_sad_neon.c.
There are still test failures, however.

Microsoft's compiler doesn't define __ARM_FEATURE_*. To use those paths
we may need to rely on _M_ARM64_EXTENSION.

Bug: webm:1788
Bug: b/277255076
Change-Id: I4d26f5f84dbd0cbcd1cdf0d7d932ebcf109febe5
2023-05-03 10:04:34 -07:00
James Zern 33aba6ecc1 configure: add aarch64 to ARCH_LIST
This will allow identifying Windows Visual Studio targets as aarch64;
the Microsoft compiler does not define __aarch64__.

An alternative would be to define this in the code, checking for
_M_ARM64 or _M_ARM64EC. For now we'll use the existing VPX_ARCH_*
system. For compatibility VPX_ARCH_ARM will continue to be defined to 1
in this case.

Bug: webm:1788
Bug: b/277255076
Change-Id: I12e25710891e86f0c7339ba96884c18ed90ba16f
2023-05-02 17:43:39 -07:00
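The alternative mentioned in 33aba6ecc1, detecting AArch64 in the code rather than through configure, could look roughly like the following. This is a sketch only: `SKETCH_ARCH_AARCH64` and `sketch_is_aarch64` are hypothetical names, and libvpx itself uses the configure-generated `VPX_ARCH_*` macros instead.

```c
/* Hypothetical in-code detection: GCC and Clang define __aarch64__,
 * while the Microsoft compiler defines _M_ARM64 or _M_ARM64EC. */
#if defined(__aarch64__) || defined(_M_ARM64) || defined(_M_ARM64EC)
#define SKETCH_ARCH_AARCH64 1
#else
#define SKETCH_ARCH_AARCH64 0
#endif

/* Runtime-visible view of the compile-time decision. */
static int sketch_is_aarch64(void) { return SKETCH_ARCH_AARCH64; }
```

Code paths would then test `#if SKETCH_ARCH_AARCH64` instead of `#ifdef __aarch64__`, which is the same substitution 57b9afa58f makes with `VPX_ARCH_AARCH64`.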
Jerome Jiang 84a180fe85 Move TplFrameStats to public header
Get ready for changes to follow:

- Custom reader/writer IO functions
- Codec control to get TPL stats from the encoder

Move the definition of TplFrameStats to public header so applications
can use them directly.

Bug: b/273736974
Change-Id: Ieb0db4560ddd966df1bc01f6a7e179cc97f9bac1
2023-05-01 13:39:01 -04:00
Jerome Jiang dbb1e8c7a6 Clean up a stale TODO in tpl
Change-Id: Ieccaff1cc94cbb2c5a294d83f3080f7407267016
2023-04-27 15:58:08 -04:00
James Zern 1a3e5567f2 Merge "register_state_check: clear -Wshadow warning" into main 2023-04-25 20:34:13 +00:00
James Zern 97d40abf9a Merge "highbd_vpx_convolve8_neon: clear -Wshadow warning" into main 2023-04-25 20:21:24 +00:00
James Zern 59d40c1415 Merge "vp9_highbd_iht16x16_add_neon: clear -Wshadow warning" into main 2023-04-25 20:12:30 +00:00
Yunqing Wang 52076a9c79 Merge "Reduce joint motion search iters based on bsize" into main 2023-04-24 21:15:14 +00:00
Neeraj Gadgil e7b58b69fd Reduce joint motion search iters based on bsize
Joint motion search during compound mode eval is optimized by
reducing the number of mv search iterations based on bsize.
The sf 'comp_inter_joint_search_thresh' is renamed as
'comp_inter_joint_search_iter_level' and used to add the logic.

cpu  Testset  Instr. Cnt     BD Rate loss (%)
               Red (%)   avg. psnr  ovr.psnr    ssim
 0   LOWRES2    5.373     0.0917     0.1088    0.0294
 0   MIDRES2    3.395     0.0239     0.0520    0.0783
 0    HDRES2    2.291     0.0223     0.0301    0.0053
 0   Average    3.686     0.0460     0.0636    0.0377

STATS_CHANGED

Change-Id: I7ee8873ebc8af967382324ae8f5c70c26665d5e6
2023-04-24 10:40:56 +05:30
Jerome Jiang 24802201ac Reland "Calculate recrf_dist and recrf_rate"
This is a reland of commit 3c59378e4e

Addressed issues from the previous CL:

- Both recon_error and rate_cost are scaled up
- recon_error and rate_cost are not accumulated across ref frames,
  instead they are calculated with the best ref frame picked.
- get_quantize_error() is put where it was, so there is no behavior
  change for vp9.

Bug: b/273736974

Original change's description:
> Calculate recrf_dist and recrf_rate
>
> Change-Id: I74e74807436b92d729e2ccaab96149780f1f52d9

Change-Id: I20e1f5543e83b576a074bd4e6b44d99da65f4b56
2023-04-21 19:18:37 -04:00
James Zern fed3de997c highbd_vpx_convolve8_neon: clear -Wshadow warning
Bug: webm:1793
Change-Id: If1a46fe183cd18e05b5538b1eba098e420b745ec
2023-04-21 13:07:04 -07:00
James Zern ec2a75ce9c vp9_highbd_iht16x16_add_neon: clear -Wshadow warning
Bug: webm:1793
Change-Id: I4e79a4d7d41b6abf88e3e60c54ab48a92b0346d2
2023-04-21 13:06:07 -07:00
Jerome Jiang a425371ccd Revert "Calculate recrf_dist and recrf_rate"
This reverts commit 3c59378e4e.

Reason for revert:

recon_error and recon_rate are summed by mistake across reference frames, as pointed out by Angie.

It could also cause vp9 behavior changes.

Original change's description:
> Calculate recrf_dist and recrf_rate
>
> Change-Id: I74e74807436b92d729e2ccaab96149780f1f52d9

Change-Id: I6106ce77cb0fe8c12b2bcf070d01513ffa8dc613
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2023-04-21 18:10:46 +00:00
Jerome Jiang 3c59378e4e Calculate recrf_dist and recrf_rate
Change-Id: I74e74807436b92d729e2ccaab96149780f1f52d9
2023-04-20 20:20:46 -04:00
Jerome Jiang 85fe280196 Merge "Store tpl stats before propagation" into main 2023-04-21 00:16:02 +00:00
James Zern b27cf67c30 register_state_check: clear -Wshadow warning
with --target=x86_64-win64-gcc

Bug: webm:1793
Change-Id: I265533af4e8d05adbe1d66a62b6dcb191ca48747
2023-04-20 14:18:26 -07:00
James Zern a7a9983314 Merge "configure: skip arm64_neon.h workaround w/VS >= 2019" into main 2023-04-20 21:11:09 +00:00
James Zern 885a533482 Merge "vp9_tpl_model: clear -Wshadow warning" into main 2023-04-20 19:51:56 +00:00
Jerome Jiang f49879a2a3 Store tpl stats before propagation
Add two new structs TplBlockStats and TplFrameStats to store tpl stats
before propagation

Change-Id: I903db99326b199ed8f2d8b19ccb973a8c8910501
2023-04-20 15:24:18 -04:00
James Zern f7d5c3eff8 configure: skip arm64_neon.h workaround w/VS >= 2019
Visual Studio 2019+ includes arm64_neon.h from arm_neon.h

Bug: b/277255076
Change-Id: I52f42b69a5efe8214a4c541b68e940ad07499584
2023-04-20 12:20:23 -07:00
James Zern 5248bf6be5 Merge "vp9_spatial_svc_encoder: quiet -Wunused-but-set-variable" into main 2023-04-20 16:56:29 +00:00
James Zern 76c8f5dbcf Merge "vp9_ratectrl,vp9_encodedframe_overshoot: rm unused var" into main 2023-04-20 16:56:13 +00:00
James Zern 57e48bc8f7 Merge "onyx_if,encode_frame_to_data_rate: rm unused var" into main 2023-04-20 16:55:59 +00:00
James Zern 4cf48ce099 Merge "libs.mk: quote $(LIBVPX_TEST_DATA_PATH)" into main 2023-04-20 16:55:38 +00:00
James Zern e8fa7a038b libs.mk: quote $(LIBVPX_TEST_DATA_PATH)
This allows the testdata target to work in environments like cygwin/msys
when a windows style path is used. It may also fix using paths with
spaces, though that's not generally recommended.

Change-Id: Id444c14468b05d589bce49c1f612aa712a3f0c8c
2023-04-19 18:58:59 -07:00
James Zern 4366ff7222 vp9_spatial_svc_encoder: quiet -Wunused-but-set-variable
with clang-17. Move frames_received under OUTPUT_FRAME_STATS; it's only
used in a printf.

Change-Id: Idfdd59ccd04e43df1855203db82bb4c8a1d059fb
2023-04-19 18:46:11 -07:00
James Zern 895317cdf1 vp9_ratectrl,vp9_encodedframe_overshoot: rm unused var
quiets -Wunused-but-set-variable with clang-17

Change-Id: I5212a20286d0252e45a8e8813d15cb780494b0ad
2023-04-19 18:46:05 -07:00
James Zern 84b4dfa5ba vp9_encodeframe: rm unused vars
in get_rdmult_delta() and compute_frame_aq_offset().

quiets -Wunused-but-set-variable with clang-17

Change-Id: I726852f3bc42afa80a18475de910040a9436b0bb
2023-04-19 18:46:00 -07:00
James Zern 933cf345dd onyx_if,encode_frame_to_data_rate: rm unused var
quiets -Wunused-but-set-variable with clang-17

Change-Id: Ia819beac84cbd57f4eeca6174c785fd320bc40c6
2023-04-19 18:45:53 -07:00
James Zern 860f245de9 Merge changes Ib0c2f852,Ieb77661e,I56ea656e,Ibda734c2 into main
* changes:
  Add Neon implementations of vpx_highbd_sad_skip_<w>x<h>x4d
  Add Neon implementation of vpx_sad_skip_<w>x<h>x4d functions
  Add Neon implementation of vpx_highbd_sad_skip_<w>x<h> functions
  Add Neon implementation of vpx_sad_skip_<w>x<h> functions
2023-04-19 23:17:10 +00:00
Jonathan Wright ab830fe6a1 Add Neon implementations of vpx_highbd_sad_skip_<w>x<h>x4d
Add Neon implementations of high bitdepth downsampling SAD4D
functions for all block sizes.

Also add corresponding unit tests.

Change-Id: Ib0c2f852e269cbd6cbb8f4dfb54349654abb0adb
2023-04-19 00:57:25 +01:00
Jonathan Wright 42c0cbb9cb Add Neon implementation of vpx_sad_skip_<w>x<h>x4d functions
Add Neon implementations of standard bitdepth downsampling SAD4D
functions for all block sizes.

Also add corresponding unit tests.

Change-Id: Ieb77661ea2bbe357529862a5fb54956e34e8d758
2023-04-19 00:57:18 +01:00
Jonathan Wright 05b244af52 Add Neon implementation of vpx_highbd_sad_skip_<w>x<h> functions
Add Neon implementations of high bitdepth downsampling SAD functions
for all block sizes.

Also add corresponding unit tests.

Change-Id: I56ea656e9bb5f8b2aedfdc4637c9ab4e1951b31b
2023-04-19 00:57:08 +01:00
Jonathan Wright 7b7f84fe14 Add Neon implementation of vpx_sad_skip_<w>x<h> functions
Add Neon implementations of standard bitdepth downsampling SAD
functions for all block sizes.

Also add corresponding unit tests.

Change-Id: Ibda734c270278d947673ffcc29ef17a2f4970b01
2023-04-19 00:56:43 +01:00
James Zern e85f9003be Merge "mr_dissim: clear -Wshadow warning" into main 2023-04-18 23:51:04 +00:00
James Zern 873fd58973 Merge "onyx_if: clear -Wshadow warning" into main 2023-04-18 19:24:35 +00:00
James Zern 1dda358cb0 Merge "vp9_rdcost: clear -Wshadow warnings" into main 2023-04-18 19:19:45 +00:00
Yunqing Wang 3d7358796d Merge "Downsample SAD computation in motion search" into main 2023-04-18 16:11:01 +00:00
James Zern d725bdd8a1 vp9_tpl_model: clear -Wshadow warning
with --enable-experimental --enable-non-greedy-mv

Bug: webm:1793
Change-Id: I19e38d7196291ae1ffbb5fb3daa70a4fefd54c55
2023-04-17 22:09:17 -07:00
James Zern eef765751a mr_dissim: clear -Wshadow warning
Bug: webm:1793
Change-Id: I73ced43aba45215264134f917fd69ab0b1f10d01
2023-04-17 22:09:09 -07:00
James Zern 7bdce0887b onyx_if: clear -Wshadow warning
with --enable-internal-stats

Bug: webm:1793
Change-Id: I9d375e4cb45f78b82afe455f2c7ad2b56e217f7d
2023-04-17 22:09:02 -07:00
Yunqing Wang 8f14f66490 Merge "Add AVX2 intrinsic for vpx_fdct16x16() function" into main 2023-04-17 20:21:21 +00:00
Anupam Pandey e15c2e3445 Add AVX2 intrinsic for vpx_fdct16x16() function
Introduced AVX2 intrinsic to compute FDCT for block size
16x16 case. This is a bit-exact change.

Please check the module level scaling w.r.t C function (timer based)
for existing (SSE2) and new AVX2 intrinsics:

   Scaling
SSE2      AVX2
3.88x     5.95x

Change-Id: I02299c3746fcb52d808e2a75d30aa62652c816dc
2023-04-17 15:23:51 +05:30
James Zern bdba4591a7 vp9_rdcost: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I6d48038d74e510ecb5773dfffbdc4c10b765c2aa
2023-04-14 10:25:58 -07:00
Jerome Jiang 27171320f5 Merge "Add VP8RateControlRTC::GetLoopfilterLevel" into main 2023-04-14 17:25:07 +00:00
James Zern 11d6f069ee Merge "libs.mk: Fix wrong scope end comments" into main 2023-04-14 17:22:47 +00:00
L. E. Segovia dca0a8b860 libs.mk: Fix wrong scope end comments
I believe the following comments are wrongly scoped, possibly left over
from previous changesets. This made me very confused when reading the
test suite Makefile, in order to port it to Meson.

Change-Id: Ice3c7ba50c6909a9c7dfd4001afa1e1ddfa4b5ce
2023-04-14 10:41:20 +00:00
Jerome Jiang 536c986764 Add VP8RateControlRTC::GetLoopfilterLevel
New linear model to calculate loopfilter level from frame qp.

Linear regression was done on qvga, vga, and hd clips.

Bug: b/275304642
Change-Id: I552b312212bb4de21b53b762d139aa9588c64ae2
2023-04-13 17:02:51 -04:00
James Zern dfe285a6b9 Merge "vp9_frame_scale_ssse3: clear -Wshadow warnings" into main 2023-04-13 20:59:43 +00:00
James Zern e3c50aa072 Merge changes I2a26c929,I0b7f0136,Ib65a2dff into main
* changes:
  vpxenc: clear -Wshadow warnings
  vpxdec: clear -Wshadow warnings
  svc_encodeframe: clear -Wshadow warnings
2023-04-13 18:35:49 +00:00
James Zern ec2993d549 Merge changes I571a9d64,I22db73cb into main
* changes:
  dct_test: clear -Wshadow warnings
  convolve_test: clear -Wshadow warning
2023-04-13 18:35:21 +00:00
James Zern 329fa7009a Merge "vp9_pickmode: clear -Wshadow warnings" into main 2023-04-13 18:35:10 +00:00
James Zern 6c65608253 vpxenc: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I2a26c9297016d3fa2c32e8974ef3d7dab1e524c4
2023-04-12 14:57:28 -07:00
James Zern 556e4f6cad vpxdec: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I0b7f013682229cde50df7c62db9dab6eab0fd341
2023-04-12 14:57:28 -07:00
James Zern a3eb39ab6f svc_encodeframe: clear -Wshadow warnings
Bug: webm:1793
Change-Id: Ib65a2dff124034d8e653572f8ada65984e55ed70
2023-04-12 14:57:28 -07:00
James Zern 968960c7b3 dct_test: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I571a9d641b2f7f4b9d7c473ca815d4ea10b9f9af
2023-04-12 14:57:13 -07:00
James Zern 698eb779f2 convolve_test: clear -Wshadow warning
Bug: webm:1793
Change-Id: I22db73cb756c6c680b73684caef1e08bb6e729d8
2023-04-12 14:57:13 -07:00
James Zern ff4123215d vp9_frame_scale_ssse3: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I85608ac7bb6d3a61649ba342c13c3bf6a39a5dea
2023-04-12 14:56:16 -07:00
James Zern 39a6b6c136 vp9_temporal_filter: clear -Wshadow warnings
Bug: webm:1793
Change-Id: Ia681ce636ae99f95b875ee1b0189bc6fa66a7608
2023-04-12 14:56:00 -07:00
James Zern 2513f6d5f4 vp9_svc_layercontext: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I63669de9835713ec70dafa88ca8f2c2459e59698
2023-04-12 14:56:00 -07:00
James Zern aaffc6e306 vp9_pickmode: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I26c063818144d11c4c91165c3fcbf6f258453cc7
2023-04-12 14:55:45 -07:00
James Zern f254e6da84 vp9_speed_features: clear -Wshadow warning
Bug: webm:1793
Change-Id: I9f509c4461631e358f80b98afbb745ce88e9d7a2
2023-04-11 21:47:10 -07:00
James Zern bde26b9961 vp9_ratectrl: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I2476a9d8e1d62414fdbe6feee87d5167058f499b
2023-04-11 21:47:10 -07:00
James Zern e3c458149c vp9_mbgraph: clear -Wshadow warnings
Bug: webm:1793
Change-Id: Ibffb62775f09922d37f7d0460aa2751e74c36738
2023-04-11 19:16:28 -07:00
James Zern cd2ec5c3df Merge "vp9_quantize_avx2,highbd_get_max_lane_eob: fix mask" into main 2023-04-11 18:40:00 +00:00
Yunqing Wang 5f615708e4 Merge "Add assert to ensure NEARESTMV or NEWMV modes are not skipped" into main 2023-04-11 18:35:10 +00:00
Yunqing Wang 23d37b3d04 Merge "Avoid redundant start MV SAD calculation" into main 2023-04-11 18:31:25 +00:00
Deepa K G 232f8659aa Downsample SAD computation in motion search
Added a speed feature to skip every other row
in SAD computation during motion search.

                 Instruction Count        BD-Rate Loss(%)
cpu   Resolution   Reduction(%)    avg.psnr   ovr.psnr    ssim
 0       LOWRES2      0.958         0.0204     0.0095    0.0275
 0       MIDRES2      1.891        -0.0636     0.0032    0.0247
 0        HDRES2      2.869         0.0434     0.0345    0.0686
 0       Average      1.905         0.0000     0.0157    0.0403

STATS_CHANGED

Change-Id: I1a8692757ed0cbcb2259729b3ecfb0436cdf49ce
2023-04-11 19:11:51 +05:30
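The row-skipping idea in 232f8659aa can be illustrated with a scalar sketch. The function names are hypothetical. Doubling the result, so that the skipped SAD stays on the same scale as a full SAD, is an assumption of this sketch and is not stated in the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Plain sum of absolute differences over a w x h block. */
static unsigned int sad_wxh(const uint8_t *src, int src_stride,
                            const uint8_t *ref, int ref_stride, int w,
                            int h) {
  unsigned int sum = 0;
  int r, c;
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      sum += (unsigned int)abs(src[r * src_stride + c] -
                               ref[r * ref_stride + c]);
    }
  }
  return sum;
}

/* Downsampled SAD: visit every other row by doubling both strides and
 * halving the row count, then scale by 2 (assumed) to approximate the
 * full-block value. */
static unsigned int sad_skip_wxh(const uint8_t *src, int src_stride,
                                 const uint8_t *ref, int ref_stride, int w,
                                 int h) {
  assert(h % 2 == 0);
  return 2 * sad_wxh(src, 2 * src_stride, ref, 2 * ref_stride, w, h / 2);
}
```

When the error is spread evenly over rows the two functions agree. When it is concentrated in the skipped rows the downsampled version underestimates it, which is consistent with the small BD-rate changes reported above.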
Cherma Rajan A 35c32b1d22 Add assert to ensure NEARESTMV or NEWMV modes are not skipped
Added an assert for prune_single_mode_based_on_mv_diff_mode_rate
speed feature. This ensures NEARMV or ZEROMV modes are pruned
only when NEARESTMV and NEWMV modes are not early terminated.

Change-Id: Id8b03eef6d1ef3f16714a9cbfde0c171c0c6fe0b
2023-04-11 17:15:08 +05:30
Deepa K G 987ed6937b Avoid redundant start MV SAD calculation
Avoided repeated calculation of start MV
SAD during full pixel motion search.

                 Instruction Count
cpu   Resolution   Reduction(%)
 0       LOWRES2      0.162
 0       MIDRES2      0.246
 0        HDRES2      0.325
 0       Average      0.245

Change-Id: I2b4786901f254ce32ee8ca8a3d56f1c9f112f1d4
2023-04-11 17:07:06 +05:30
James Zern 61709a177a vp9_quantize_avx2,highbd_get_max_lane_eob: fix mask
Pack nz_mask with zero. After the result is permuted this has the effect
of ignoring the upper half of the iscan register which is only loaded
with 128-bits. Depending on the optimization level and the load used the
upper half of the ymm register may contain undefined values which can
produce an incorrect eob. If this is large enough it can cause a crash.

Bug: chromium:1431729
Change-Id: I4ebae9fa39f228bdd29dcc19935f3f07759d75f5
2023-04-10 14:47:06 -07:00
Yunqing Wang 31b6d12892 Merge "Add AVX2 intrinsic for variance function for block width 8" into main 2023-04-10 18:50:09 +00:00
Yunqing Wang bd53ceef3a Merge "Prune single ref modes based on mv difference and mode rate" into main 2023-04-10 17:01:19 +00:00
James Zern 6fd360c684 Merge "Optimize Armv8.0 Neon SAD4D 16xh, 32xh, and 64xh functions" into main 2023-04-07 22:19:18 +00:00
James Zern 12ab4af3ae vp9_dx_iface: clear -Wshadow warnings
Bug: webm:1793
Change-Id: Ice6cd08f145e5813e24345d03e0913e5eda5289f
2023-04-06 15:37:40 -07:00
James Zern bebc860915 vp9_encoder: clear -Wshadow warning
Bug: webm:1793
Change-Id: Id390c61f82b9f15063d0310a2c252b02b479d9c5
2023-04-06 15:37:26 -07:00
James Zern 868674d330 vpx_subpixel_8t_intrin_avx2: clear -Wshadow warning
Bug: webm:1793
Change-Id: Icba4ad242dcd0cad736b9a203829361c5bd1ca3f
2023-04-06 12:57:23 -07:00
James Zern 3c0c01357f Merge "Optimize Neon paths of high bitdepth SAD and SAD4d for 8xh blocks" into main 2023-04-06 17:50:00 +00:00
Jonathan Wright ff8a965856 Optimize Armv8.0 Neon SAD4D 16xh, 32xh, and 64xh functions
Add a widening 4D reduction function operating on uint16x8_t vectors
and use it to optimize the final reduction in Armv8.0 Neon standard
bitdepth 16xh, 32xh and 64xh SAD4D computations.

Also simplify the Armv8.0 Neon version of the sad64xhx4d_neon helper
function since VP9 block sizes are not large enough to require
widening to 32-bit accumulators before the final reduction.

Change-Id: I32b0a283d7688d8cdf21791add9476ed24c66a28
2023-04-06 17:41:01 +01:00
Jonathan Wright a5801b00a8 Optimize 4D Neon reduction for 4xh and 8xh SAD4D blocks
Add a 4D reduction function operating on uint16x8_t vectors and use
it to optimize the final reduction in standard bitdepth 4xh and 8xh
SAD4D computations. Similar 4D reduction optimizations have already
been implemented for all other standard bitdepth block sizes, and all
high bitdepth block sizes.[1]

[1] https://chromium-review.googlesource.com/c/webm/libvpx/+/4224681

Change-Id: I0aa0b6e0f70449776f316879cafc4b830e86ea51
2023-04-04 14:52:52 +01:00
Anupam Pandey e2465dfc25 Add AVX2 intrinsic for variance function for block width 8
Added AVX2 intrinsic optimization for the following functions
1. vpx_variance8x4
2. vpx_variance8x8
3. vpx_variance8x16

This is a bit-exact change.

                 Instruction Count
cpu   Resolution   Reduction(%)
 0       LOWRES2      0.698
 0       MIDRES2      0.577
 0        HDRES2      0.469
 0       Average      0.582

Change-Id: Iae8fdf9344fd012cda4955ed140633141d60ba86
2023-04-04 15:36:22 +05:30
James Zern 0f42bd3fb8 Merge changes Idaf49de6,I6d7d96ff,I0d64c923 into main
* changes:
  svc_datarate_test: clear -Wshadow warning
  vp9_mcomp.c: clear -Wshadow warnings
  vp9_rc_get_second_pass_params: clear -Wshadow warning
2023-03-30 22:44:51 +00:00
Cherma Rajan A 1025d37b03 Prune single ref modes based on mv difference and mode rate
This patch introduces a speed feature to prune single reference
modes - NEARMV and ZEROMV based on motion vector difference and
mode rate w.r.t previously evaluated single reference modes
corresponding to the same reference frame.

                Instruction Count        BD-Rate Loss(%)
cpu   Resolution   Reduction(%)    avg.psnr   ovr.psnr    ssim
 0       LOWRES2      1.686        -0.0039    -0.0105   -0.0098
 0       MIDRES2      1.026        -0.0234     0.0029    0.0120
 0        HDRES2      0.000         0.0000     0.0000    0.0000
 0       Average      0.889        -0.0091    -0.0025    0.0007

STATS_CHANGED

Change-Id: I387acd3a73d8256904a7ce684b198d251cf3dd04
2023-03-30 14:37:06 +05:30
George Steed a257b4d6be Avoid vshr and vget_{low,high} in Neon d135 predictor impl
The shift instructions have marginally worse performance on some
micro-architectures, and the vget_{low,high} instructions are
unnecessary.

This commit improves performance of the d135 predictors by 1.5% geomean
averaged across a range of compilers and micro-architectures.

Change-Id: Ied4c3eecc12fc973841696459d868ce403ed4e6c
2023-03-30 09:00:26 +00:00
George Steed c1c7dd3138 Use sum_neon.h helpers in Neon DC predictors
Use sum_neon.h helpers for horizontal reductions in Neon DC predictors,
enabling use of dedicated Neon reduction instructions on AArch64. Some
of the surrounding code is also optimized to remove redundant broadcast
instructions in the dc_store helpers.

Performance is largely unchanged on both the standard as well as the
high bit-depth predictors. The main improvement appears to be the 16x16
standard-bitdepth dc predictor, which improves by 10-15% when
benchmarked on Neoverse N1.

Change-Id: Ibfcc6ecf4b1b2f87ce1e1f63c314d0cc35a0c76f
2023-03-30 09:00:19 +00:00
James Zern 01d282ac95 Merge changes Ie4ffa298,If5ec220a,I670dc379 into main
* changes:
  Avoid LD2/ST2 instructions in highbd v predictors in Neon
  Avoid interleaving loads/stores in Neon for highbd dc predictor
  Avoid LD2/ST2 instructions in vpx_dc_predictor_32x32_neon
2023-03-29 20:52:46 +00:00
Jerome Jiang e47676c11c Merge "svc: Fix a case where target bandwidth is 0" into main 2023-03-29 18:24:18 +00:00
Jerome Jiang 0f893ea0b6 svc: Fix a case where target bandwidth is 0
Bug: webrtc:15033
Change-Id: Iea2997c2ce8982f106a1eed3ec4f7dd1c6e83666
2023-03-29 13:06:19 -04:00
Salome Thirot cf1efecebf Optimize Neon paths of high bitdepth SAD and SAD4d for 8xh blocks
For these block sizes there is no need to widen to 32-bits until the
final reduction, so use a single vabaq instead of vabd + vpadalq.

Change-Id: I9c19d620f7bb8b3a6b0bedd37789c03bb628b563
2023-03-29 16:50:34 +01:00
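The claim in cf1efecebf that 8xh blocks need no widening until the final reduction can be checked with simple arithmetic. The scalar model below makes three assumptions: samples are at most 12 bits, 8-wide blocks are at most 16 rows tall, and each of the 8 lanes accumulates one absolute difference per row. Under those assumptions the worst-case lane sum is 16 * 4095 = 65520, which still fits in 16 bits. All names are hypothetical, and this is not Neon code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_LANES 8

/* Worst-case sum in one 16-bit lane after `rows` accumulations. */
static uint32_t max_lane_sum(int bit_depth, int rows) {
  return (uint32_t)rows * ((1u << bit_depth) - 1u);
}

/* Scalar model of an 8-lane, 16-bit absolute-difference-accumulate,
 * with widening to 32 bits only in the final reduction. */
static uint32_t sad8xh_u16(const uint16_t *src, int src_stride,
                           const uint16_t *ref, int ref_stride, int h) {
  uint16_t acc[SKETCH_LANES] = { 0 };
  uint32_t total = 0;
  int r, c;
  assert(h <= 16); /* keeps every lane within the 16-bit range */
  for (r = 0; r < h; ++r) {
    for (c = 0; c < SKETCH_LANES; ++c) {
      const uint16_t a = src[r * src_stride + c];
      const uint16_t b = ref[r * ref_stride + c];
      acc[c] = (uint16_t)(acc[c] + (a > b ? a - b : b - a));
    }
  }
  for (c = 0; c < SKETCH_LANES; ++c) total += acc[c];
  return total;
}
```

Wider blocks process more samples per lane per row, so the same bound does not carry over to them automatically.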
George Steed 9824167ad2 Avoid LD2/ST2 instructions in highbd v predictors in Neon
The interleaving load/store instructions (LD2/LD3/LD4 and ST2/ST3/ST4)
are useful if we are dealing with interleaved data (e.g. real/imag
components of complex numbers), but for simply loading or storing larger
quantities of data it is preferable to simply use the normal load/store
instructions.

This patch replaces such occurrences in the two larger block sizes:
vpx_highbd_v_predictor_16x16_neon and vpx_highbd_v_predictor_32x32_neon.

Change-Id: Ie4ffa298a2466ceaf893566fd0aefe3f66f439e4
2023-03-29 08:39:35 +00:00
George Steed 83def747ff Avoid interleaving loads/stores in Neon for highbd dc predictor
The interleaving load/store instructions (LD2/LD3/LD4 and ST2/ST3/ST4)
are useful if we are dealing with interleaved data (e.g. real/imag
components of complex numbers), but for simply loading or storing larger
quantities of data it is preferable to simply use two or more of the
normal load/store instructions.

This patch replaces such occurrences in the two larger block sizes:
vpx_highbd_dc_predictor_16x16_neon, vpx_highbd_dc_predictor_32x32_neon,
and related helper functions.

Speedups over the original Neon code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 | 16x16 |    1.25
Neoverse N1 |  LLVM 15 | 32x32 |    1.13
Neoverse N1 |   GCC 12 | 16x16 |    1.56
Neoverse N1 |   GCC 12 | 32x32 |    1.52
Neoverse V1 |  LLVM 15 | 16x16 |    1.63
Neoverse V1 |  LLVM 15 | 32x32 |    1.08
Neoverse V1 |   GCC 12 | 16x16 |    1.59
Neoverse V1 |   GCC 12 | 32x32 |    1.37

Change-Id: If5ec220aba9dd19785454eabb0f3d6affec0cc8b
2023-03-29 08:39:35 +00:00
George Steed 4cf9819282 Avoid LD2/ST2 instructions in vpx_dc_predictor_32x32_neon
The LD2 and ST2 instructions are useful if we are dealing with
interleaved data (e.g. real/imag components of complex numbers), but for
simply loading or storing larger quantities of data it is preferable to
simply use two of the normal load/store instructions.

This patch replaces such occurrences in vpx_dc_predictor_32x32_neon and
related functions.

With Clang-15 this speeds up this function by 10-30% depending on the
micro-architecture being benchmarked on. With GCC-12 this speeds up the
function by 40-60% depending on the micro-architecture being benchmarked
on.

Change-Id: I670dc37908aa238f360104efd74d6c2108ecf945
2023-03-29 08:39:35 +00:00
Yunqing Wang 6d0e5e56ae Merge "Add AVX2 for convolve vertical filter for block width 4" into main 2023-03-28 22:14:51 +00:00
James Zern aba570ac95 Merge changes If83ff1ad,I8fb00a15,Iaad58e77,Iac166d60 into main
* changes:
  Randomize second half of above_row_ in intrapred tests for Neon
  Allow non-uniform above array in d63 predictor Neon impl
  Allow non-uniform above array in d45 predictor Neon impl
  Allow non-uniform above array in highbd d45 predictor Neon impl
2023-03-28 20:14:12 +00:00
James Zern 8e58d504fa Merge "update libwebm to libwebm-1.0.0.29-9-g1930e3c" into main 2023-03-28 18:36:01 +00:00
Jerome Jiang 972149cafe svc: Fix a case where target bandwidth is 0
Bug: webrtc:15033
Change-Id: I28636de66842671b03284408186c4c18254109a5
2023-03-28 11:26:54 -04:00
George Steed 100ca0356d Randomize second half of above_row_ in intrapred tests for Neon
The existing tests duplicate `above_row_[block_size - 1]` after the
first `block_size` elements, which can lead to tests incorrectly passing
due to differing behaviour when calculating the average for the last
elements of the output.

This change adjusts the above array setup to be fully random instead,
allowing us to catch such issues here rather than in other larger tests
like the external MD5 tests.

It doesn't appear that other architectures are fully clean with this
change so restrict it to just Neon for now until they are fixed.

Bug: webm:1797
Change-Id: If83ff1adbf1e8d30f2a92474d7186c65840a5d0b
2023-03-28 13:46:11 +00:00
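Why padding the second half of the above row can hide a mismatch, as 100ca0356d describes, is easier to see on a toy example. The sketch below is a simplified scalar model of a single predictor row and is not taken from libvpx. `row0_reference` reads one element past `above[bs - 1]`, while `row0_shortcut` is a hypothetical implementation that assumes every element past that point equals `above[bs - 1]`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define AVG3(a, b, c) (((a) + 2 * (b) + (c) + 2) >> 2)

/* Reference-style row: a 3-tap average that reads above[0 .. bs]. */
static void row0_reference(uint8_t *dst, const uint8_t *above, int bs) {
  int x;
  for (x = 0; x < bs - 1; ++x) {
    dst[x] = (uint8_t)AVG3(above[x], above[x + 1], above[x + 2]);
  }
  dst[bs - 1] = above[bs - 1];
}

/* Hypothetical shortcut: substitutes above[bs - 1] for anything past it. */
static void row0_shortcut(uint8_t *dst, const uint8_t *above, int bs) {
  const uint8_t last = above[bs - 1];
  int x;
  for (x = 0; x < bs - 1; ++x) {
    const uint8_t c = (x + 2 < bs) ? above[x + 2] : last;
    dst[x] = (uint8_t)AVG3(above[x], above[x + 1], c);
  }
  dst[bs - 1] = last;
}

/* Returns 1 when both versions produce the same row for this input. */
static int rows_match(const uint8_t *above, int bs) {
  uint8_t a[32], b[32];
  assert(bs >= 2 && bs <= 32);
  row0_reference(a, above, bs);
  row0_shortcut(b, above, bs);
  return memcmp(a, b, (size_t)bs) == 0;
}
```

With `bs = 4`, an above row of `{10, 20, 30, 40, 40, 40, 40, 40}` makes the two versions agree, while `{10, 20, 30, 40, 200, ...}` makes them differ at index 2. A fully random second half therefore exposes the shortcut that a padded one would mask.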
George Steed 911d6e165e Allow non-uniform above array in d63 predictor Neon impl
The existing standard bitdepth implementation doesn't appear to manifest
as a failure in any of the predictor or MD5 tests, but it does rely on
the predictor tests filling the second `bs` elements of the `above`
input array with copies of `above[bs - 1]` in order to match the C
implementation.

This patch adjusts the Neon implementation to correctly match the C
implementation in the case where the elements of the `above` array all
differ.

The geomean of performance for the predictor is approximately a 2%
slowdown compared to the previous vectorized implementation. This is
still considerably faster than the unspecialized naive C implementation.

Bug: webm:1797
Change-Id: I8fb00a154288d54b24a72a7ff63c816bdcf3aca3
2023-03-28 13:27:22 +00:00
George Steed 3eb3781589 Allow non-uniform above array in d45 predictor Neon impl
The existing implementation doesn't appear to manifest as a failure in
any of the predictor or MD5 tests, but it does rely on the predictor
tests filling the second `bs` elements of the `above` input array with
copies of `above[bs - 1]` in order to match the C implementation.

This patch adjusts the Neon implementation to correctly match the C
implementation in the case where the elements of the `above` array all
differ.

Performance of the predictor is mostly unchanged, except for the 32x32
block size where it appears to have gotten about 40% faster when
compiled with clang-15.

Bug: webm:1797
Change-Id: Iaad58e77c5467307a3c80d6989b7cf2988e09311
2023-03-28 13:27:11 +00:00
George Steed 25825f6a78 Allow non-uniform above array in highbd d45 predictor Neon impl
The existing implementation doesn't appear to manifest as a failure in
any of the predictor or MD5 tests, but it does rely on the predictor
tests filling the second `bs` elements of the `above` input array with
copies of `above[bs - 1]` in order to match the C implementation.

This patch adjusts the Neon implementation to correctly match the C
implementation in the case where the elements of the `above` array all
differ.

Performance of the predictor is mostly unchanged, except for the 16x16
block size where it appears to have gotten marginally faster across most
compiler/micro-architecture combinations.

Bug: webm:1797
Change-Id: Iac166d6047316c0382e0f2790ce780fc99674b43
2023-03-28 08:29:01 +00:00
Anupam Pandey b4d154c948 Add AVX2 for convolve vertical filter for block width 4
Introduced AVX2 intrinsic to compute convolve vertical for
w = 4 case. This is a bit-exact change.

                 Instruction Count
cpu   Resolution   Reduction(%)
 0       LOWRES2      0.364
 0       MIDRES2      0.236
 0        HDRES2      0.162
 0       Average      0.254

Change-Id: I413f58aa6333a6f2421d4c10d49dec01e55b2098
2023-03-28 10:01:03 +05:30
James Zern 8f17482e82 vp9_rdopt,block_rd_txfm: fix clang-tidy warning
argument name 'recon' in comment does not match parameter name
'out_recon'.

https://clang.llvm.org/extra/clang-tidy/checks/bugprone/argument-comment.html

+ normalize similar calls, using /*var=*/NULL to better match the style
  guidelines

https://google.github.io/styleguide/cppguide.html#Function_Argument_Comments

Change-Id: I089591317f7138965735f737c1536a8b16fcd4e4
2023-03-27 16:20:22 -07:00
James Zern 66885a69ff svc_datarate_test: clear -Wshadow warning
rename class member from ref_frame_config to the correct style:
ref_frame_config_.

Bug: webm:1793
Change-Id: Idaf49de6d724014adee75f81efe974b2031241ba
2023-03-24 11:23:12 -07:00
James Zern 89765feb99 vp9_mcomp.c: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I6d7d96ffb3e388eac94d1d41563f7079a8297c85
2023-03-24 11:23:12 -07:00
James Zern 601904d1f7 vp9_rc_get_second_pass_params: clear -Wshadow warning
Bug: webm:1793
Change-Id: I0d64c9234b4bdcfb49a06566dc41df26f5862c1f
2023-03-24 11:23:12 -07:00
James Zern 5b05f6f3a0 Merge changes Ide512788,I77c7abae into main
* changes:
  vp9_scan.h: rename scan_order struct to ScanOrder
  vp9_encodeframe.c: clear -Wshadow warnings
2023-03-24 18:04:19 +00:00
James Zern bad39ce7a3 vp9_scan.h: rename scan_order struct to ScanOrder
This matches the style guide and fixes some -Wshadow warnings related to
variables with the same name. Something similar was done in libaom in:
03f6fdcfca Fix warnings reported by -Wshadow: Part1b: scan_order struct
           and variable

Bug: webm:1793
Change-Id: Ide5127886b7fd7778e6d8a983bfba6edda21ff28
2023-03-24 09:35:55 -07:00
James Zern 1701d55e33 vp9_encodeframe.c: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I77c7abae7bbb1e1f4972cd31e3a67d62477b896e
2023-03-23 19:21:27 -07:00
James Zern cda56fa019 update libwebm to libwebm-1.0.0.29-9-g1930e3c
changelog:
https://chromium.googlesource.com/webm/libwebm/+log/ee0bab576..1930e3ca2

Bug: webm:1792
Change-Id: I5c5c30c767d357528f102ff38957655e2ec0c645
2023-03-23 19:14:31 -07:00
Wan-Teh Chang 5817bce969 Fix comment typos (likely copy-and-paste errors)
Fix comment typos for vpx_codec_destroy() and vpx_codec_enc_init_ver().

Based on the change made in libaom:
https://aomedia.googlesource.com/aom/+/365a968684
365a968684 Fix comment typos (likely copy-and-paste errors)

Change-Id: I39edae835ed0752b569e8e7328d0709c59724ac2
2023-03-23 17:54:35 -07:00
James Zern 81250791dd Merge "Add Neon implementations of vpx_highbd_avg_<w>x<h>_c" into main 2023-03-23 21:40:13 +00:00
James Zern 5afeb89867 Merge "test.mk: use CONFIG_VP(8|9)_ENCODER for vp8/vp9-only tests" into main 2023-03-23 17:22:28 +00:00
James Zern 27424a8176 Merge "svc_encodeframe.c: fix -Wstringop-truncation" into main 2023-03-23 17:21:57 +00:00
Jerome Jiang ccb0597e4b Merge "Revert "Add codec control to get tpl stats"" into main 2023-03-22 20:48:44 +00:00
Jerome Jiang 78bb8e1c0a Revert "Add codec control to get tpl stats"
This reverts commit 9c15fb62b3.

Reason for revert:

vpxenc should only use public interface

Original change's description:
> Add codec control to get tpl stats
>
> Add command line flag to vpxenc to export tpl stats
>
> Bug: b/273736974
> Change-Id: I6980096531b0c12fbf7a307fdef4c562d0c29e32

Bug: b/273736974
Change-Id: Ifa8951bb34e5936bbfc33086b22e9fc36d379bc9
2023-03-22 20:18:39 +00:00
Wan-Teh Chang a0bf98de0d Merge "Change UpdateRateControl() to return bool" into main 2023-03-22 16:09:24 +00:00
Salome Thirot 5c7867beac Add Neon implementations of vpx_highbd_avg_<w>x<h>_c
Add Neon implementation of vpx_highbd_avg_4x4_c and vpx_highbd_avg_8x8_c
as well as the corresponding tests.

Change-Id: Ib1b06af5206774347690c9c56e194b76aa409c91
2023-03-22 10:50:17 +00:00
James Zern 882399bd54 Merge changes I8abac3c9,If678fc19 into main
* changes:
  vp9_bitstream.c: clear -Wshadow warnings
  vp9_setup_mask: clear -Wshadow warnings
2023-03-22 02:14:12 +00:00
James Zern 0a5f886a0c Merge changes I650b305c,If3e4cf37,I4c791e3a into main
* changes:
  sixtappredict_neon.c: remove redundant returns
  sixtappredict_neon.c,cosmetics: fix a typo
  vp8_sixtap_predict16x16_neon: fix overread
2023-03-21 20:20:51 +00:00
Jerome Jiang 9c643a5ef2 Merge "Add codec control to get tpl stats" into main 2023-03-21 18:34:34 +00:00
James Zern d3c9e39635 Merge "Reland "quantize: use scan_order instead of passing scan/iscan"" into main 2023-03-21 00:33:00 +00:00
James Zern 3b6909977c test.mk: use CONFIG_VP(8|9)_ENCODER for vp8/vp9-only tests
fixes some uninstantiated test failures when configured with
--disable-vp8 or --disable-vp9

Change-Id: If9a6705bd070edee02306e89da103ed474688ec8
2023-03-20 17:28:11 -07:00
James Zern 1c37aefcbd svc_encodeframe.c: fix -Wstringop-truncation
use sizeof(buf) - 1 with strncpy.

fixes:
examples/svc_encodeframe.c:282:3: warning: ‘strncpy’ specified bound
1024 equals destination size [-Wstringop-truncation]
  282 |   strncpy(si->options, options, sizeof(si->options));
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
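
A minimal sketch of the pattern, assuming a fixed-size destination
buffer. `SvcOptions`, `OPTIONS_SIZE` and `set_options` are hypothetical
names for illustration, not the actual svc_encodeframe.c code. Bounding
the copy at sizeof(buf) - 1 and writing the terminator explicitly keeps
the result NUL-terminated even when the source is too long:

```c
#include <assert.h>
#include <string.h>

#define OPTIONS_SIZE 16 /* hypothetical; the warning above reports 1024 */

typedef struct {
  char options[OPTIONS_SIZE];
} SvcOptions; /* hypothetical stand-in for the real struct */

/* Copies `options` into si->options, truncating if needed and always
 * leaving the destination NUL-terminated. */
static void set_options(SvcOptions *si, const char *options) {
  assert(si != NULL && options != NULL);
  strncpy(si->options, options, sizeof(si->options) - 1);
  si->options[sizeof(si->options) - 1] = '\0';
}
```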

Change-Id: I46980872f9865ae1dc2b56330c3a65d8bc6cf1f7
2023-03-20 17:09:42 -07:00
James Zern 44250287fb sixtappredict_neon.c: remove redundant returns
Change-Id: I650b305c2599fc32353daba030e6241d330796a7
2023-03-20 16:58:28 -07:00
James Zern faa9142f5d sixtappredict_neon.c,cosmetics: fix a typo
Change-Id: If3e4cf372fc6ed076f0d42c435a72262494aab68
2023-03-20 16:56:58 -07:00
James Zern e4f0df53ec vp8_sixtap_predict16x16_neon: fix overread
Shift the final read from the source by 3 to avoid breaking the
assumption that the 6-tap filter needs only 5 pixels outside of the
macroblock; this matches the sse2 and ssse3 implementations.

It's possible this restriction could be removed if the source buffers
are assumed to be padded.
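
To illustrate the 5-pixel assumption: a 6-tap filter reads 2 pixels
before and 3 pixels after each output position, so a row of `width`
outputs touches width + 5 source pixels. The sketch below is scalar C
with illustrative taps that sum to 128; `sixtap_row` and `kTaps` are
hypothetical names, not the Neon code changed here.

```c
#include <assert.h>
#include <stdint.h>

#define FILTER_SHIFT 7 /* the taps below sum to 128 == 1 << 7 */

/* Illustrative symmetric taps; an assumption made for this sketch. */
static const int kTaps[6] = { 3, -16, 77, 77, -16, 3 };

/* Filters `width` pixels of one row. `src` points at the first pixel of
 * the block row; the caller must guarantee that src[-2] .. src[width + 2]
 * are readable, i.e. width + 5 pixels in total. */
static void sixtap_row(const uint8_t *src, uint8_t *dst, int width) {
  int x, k;
  assert(width > 0);
  for (x = 0; x < width; ++x) {
    int sum = 1 << (FILTER_SHIFT - 1); /* rounding term */
    for (k = 0; k < 6; ++k) sum += kTaps[k] * src[x + k - 2];
    if (sum < 0) sum = 0; /* clamp before shifting a signed value */
    sum >>= FILTER_SHIFT;
    dst[x] = (uint8_t)(sum > 255 ? 255 : sum);
  }
}
```

For a 16-wide row the last output reads up to src[18], so any vector
load that reaches beyond that offset leaves the guaranteed region.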

Bug: webm:1795
Change-Id: I4c791e3a214898a503c78f4cedca154c75cdbaef
Fixed: webm:1795
2023-03-20 16:51:51 -07:00
Yunqing Wang c0f11c7f6c Merge "Skip trellis coeff opt based on tx block properties" into main 2023-03-20 16:35:44 +00:00
Yunqing Wang 97aa7b2a4c Merge "Refactor logic of skipping trellis coeff opt" into main 2023-03-20 16:27:53 +00:00
Jerome Jiang 9c15fb62b3 Add codec control to get tpl stats
Add command line flag to vpxenc to export tpl stats

Bug: b/273736974
Change-Id: I6980096531b0c12fbf7a307fdef4c562d0c29e32
2023-03-20 12:02:38 -04:00
Deepa K G 55e102dc54 Skip trellis coeff opt based on tx block properties
The trellis coefficient optimization is skipped for blocks
with larger residual mse.

                 Instruction Count        BD-Rate Loss(%)
cpu   Resolution   Reduction(%)    avg.psnr   ovr.psnr    ssim
 0       LOWRES2      9.467        0.0921     0.1057    0.0362
 0       MIDRES2      4.328       -0.0155     0.0694    0.0178
 0        HDRES2      1.858        0.0231     0.0214   -0.0034
 0       Average      5.218        0.0332     0.0655    0.0169

STATS_CHANGED

Change-Id: I321a9b1a34ebb59b7b6a065b5b2d717c8767a4a5
2023-03-19 23:12:04 +05:30
Deepa K G 405ae85666 Refactor logic of skipping trellis coeff opt
The code to enable trellis coefficient optimization
is refactored using the sf 'trellis_opt_tx_rd'. This
change facilitates adaptive skipping of trellis
optimization based on block properties.

Change-Id: Ia1ff7cbbe5acf86414410f62655d46c099387847
2023-03-19 22:13:11 +05:30
James Zern 492f4c5538 vp9_bitstream.c: clear -Wshadow warnings
Bug: webm:1793
Change-Id: I8abac3c901ad24b642b39ea6e6081d8ba626853d
2023-03-17 19:33:50 -07:00
James Zern cb5b047ad8 vp9_setup_mask: clear -Wshadow warnings
Bug: webm:1793
Change-Id: If678fc195ef87cc634d31fb7b24e0c844a5cb7b0
2023-03-17 19:24:20 -07:00
Johann f23f27bb80 Reland "quantize: use scan_order instead of passing scan/iscan"
This is a reland of commit 14fc40040f

Parent change fixed in crrev.com/c/webm/libvpx/+/4305500

Original change's description:
> quantize: use scan_order instead of passing scan/iscan
>
> further reduces the arguments for the 32x32. This will be applied to the base
> version as well.
>
> Change-Id: I25a162b5248b14af53d9e20c6a7fa2a77028a6d1

Change-Id: I2a7654558eaddd68bd09336bf317b297f18559d2
2023-03-18 06:39:45 +09:00
James Zern 6788c75055 Merge changes I5d9444a2,I1f127df9 into main
* changes:
  Add Neon implementation of vpx_highbd_minmax_8x8_c
  Add tests for vpx_highbd_minmax_8x8_c
2023-03-17 20:35:24 +00:00
James Zern d446ddd32d Merge "Reland "quantize: simplify highbd 32x32_b args"" into main 2023-03-17 20:32:11 +00:00
Salome Thirot fff4e76b55 Add Neon implementation of vpx_highbd_minmax_8x8_c
Add Neon implementation of vpx_highbd_minmax_8x8_c as well as the
corresponding tests.

Change-Id: I5d9444a239fb1baa53634c1bdb5292b44067d90c
2023-03-17 18:40:41 +00:00
Salome Thirot c6da2329b9 Add tests for vpx_highbd_minmax_8x8_c
Write tests for vpx_highbd_minmax_8x8_c, and fix initial value of min in
vpx_highbd_minmax_8x8_c.
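
The message does not state what the old initial value was. As a hedged
illustration of why it matters: the running minimum has to start at
least as large as the biggest possible difference, and with 16-bit
samples that bound is larger than the 8-bit limit of 255. The function
name below is hypothetical and this is a scalar sketch, not the
library code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Min and max absolute difference over an 8x8 block of 16-bit samples. */
static void minmax_8x8_u16(const uint16_t *s, int s_stride,
                           const uint16_t *d, int d_stride, int *min,
                           int *max) {
  int i, j;
  assert(min != NULL && max != NULL);
  *min = 65535; /* must be >= the largest possible |s - d| */
  *max = 0;
  for (i = 0; i < 8; ++i, s += s_stride, d += d_stride) {
    for (j = 0; j < 8; ++j) {
      const int diff = abs(s[j] - d[j]);
      if (diff < *min) *min = diff;
      if (diff > *max) *max = diff;
    }
  }
}
```

With an initial minimum of 255, a block whose differences are all above
255 would report 255 rather than the true minimum.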

Change-Id: I1f127df945bbb8c7d373c5430ff5f94f28575968
2023-03-17 18:40:41 +00:00
Johann 02fd7d6aeb Reland "quantize: simplify highbd 32x32_b args"
This is a reland of commit 573f5e662b

Alignment issue with tests fixed in crrev.com/c/webm/libvpx/+/4305500

Original change's description:
> quantize: simplify highbd 32x32_b args
>
> Change-Id: I431a41279c4c4193bc70cfe819da6ea7e1d2fba1

Change-Id: Ic868b6f987c99d88672858fedd092fa49c125e19
2023-03-17 12:52:15 +00:00
Wan-Teh Chang 430c6c1553 Change UpdateRateControl() to return bool
Change the VP9RateControlRtcConfig constructor to initialize
ss_number_layers (to 1).

Change UpdateRateControl() to return bool so that it can report failure
(due to invalid configuration).

Also change InitRateControl() to return bool to propagate the return
value of UpdateRateControl().

Note: This is a port of the libaom CL
https://aomedia-review.googlesource.com/c/aom/+/172042.

Change-Id: I90b60353b5f15692dba5d89e7b1a9c81bb2fdd89
2023-03-16 19:55:41 -07:00
Wan-Teh Chang af63e31978 Merge "Set oxcf->ts_rate_decimator[tl] only once" into main 2023-03-17 02:54:21 +00:00
Wan-Teh Chang d92681b06f Set oxcf->ts_rate_decimator[tl] only once
The code that sets oxcf->ts_rate_decimator[tl] does not need to be
inside a loop that iterates over sl. Move the code out of the sl loop so
that oxcf->ts_rate_decimator[tl] is set only once.
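
A minimal sketch of the loop restructuring, using a hypothetical
`LayerConfig` struct and `apply_layer_config` function rather than the
real encoder configuration. Values indexed by (sl, tl) stay in the
nested loop; the value indexed by tl alone moves to its own loop:

```c
#include <assert.h>

#define MAX_SL 3
#define MAX_TL 3

typedef struct {
  int layer_target_bitrate[MAX_SL * MAX_TL]; /* per (sl, tl) pair */
  int ts_rate_decimator[MAX_TL];             /* per tl only */
} LayerConfig; /* hypothetical */

static void apply_layer_config(LayerConfig *cfg, const int *bitrates,
                               const int *decimators, int num_sl,
                               int num_tl) {
  int sl, tl;
  assert(num_sl > 0 && num_sl <= MAX_SL);
  assert(num_tl > 0 && num_tl <= MAX_TL);
  for (sl = 0; sl < num_sl; ++sl) {
    for (tl = 0; tl < num_tl; ++tl) {
      const int layer = sl * num_tl + tl;
      cfg->layer_target_bitrate[layer] = bitrates[layer];
    }
  }
  /* Depends on tl only, so it is written once per tl, outside the sl
   * loop, instead of num_sl times with the same value. */
  for (tl = 0; tl < num_tl; ++tl) {
    cfg->ts_rate_decimator[tl] = decimators[tl];
  }
}
```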

Change-Id: I22f6c117d200ec38a757b749a8700660d15436c1
2023-03-16 18:36:13 -07:00
Wan-Teh Chang d6b6f85063 Remove repeated field from VP9RateControlRtcConfig
Remove the `ts_number_layers` field from VP9RateControlRtcConfig because
the base class VpxRateControlRtcConfig already has that field.

Note: In commit 65a1751e5b,
`ts_number_layers` was moved to the newly created base class
VpxRateControlRtcConfig but was inadvertently left in
VP9RateControlRtcConfig:
https://chromium-review.googlesource.com/c/webm/libvpx/+/3140048

Change-Id: I98d48e152683ec2e5e62efffb56b7f010c5d0695
2023-03-16 15:32:02 -07:00
Wan-Teh Chang 4265e364ff Merge "Update the sample code for VP9RateControlRTC" into main 2023-03-16 21:40:14 +00:00
Yunqing Wang 5ca4953569 Merge "Add AVX2 for convolve horizontal filter for block width 4" into main 2023-03-16 20:44:11 +00:00
Wan-Teh Chang d67a0021e7 Update the sample code for VP9RateControlRTC
Update the sample code to the current VP9RateControlRTC interface.

Change-Id: I30b0712c897f93fd62ebce51ce39afce3cac1fd7
2023-03-16 13:37:56 -07:00
Anupam Pandey 5c2cd048a0 Add AVX2 for convolve horizontal filter for block width 4
Introduced AVX2 intrinsic to compute convolve horizontal for
w = 4 case. This is a bit-exact change.

                 Instruction Count
cpu   Resolution   Reduction(%)
 0       LOWRES2      0.763
 0       MIDRES2      0.466
 0        HDRES2      0.317
 0       Average      0.516

Change-Id: I124f3f8e994c24461812f4963b113819466db44f
2023-03-16 08:48:45 +05:30
Salome Thirot 362c69cfe5 Optimize vpx_minmax_8x8_neon for aarch64
Optimize vpx_minmax_8x8_neon on AArch64 targets by using the UMAXV and
UMINV instructions - computing the maximum and minimum elements in a
Neon vector.

Change-Id: I54c3a3a087d266f6774e6113e5947253df288a64
2023-03-14 22:43:04 +00:00
James Zern bbd6bc85a3 Merge "Add Neon implementation of vpx_highbd_satd_c" into main 2023-03-14 19:38:04 +00:00
James Zern f4400abb25 Merge "Optimize vpx_satd_neon" into main 2023-03-14 19:32:32 +00:00
James Zern bfe0ec066b Merge "Add Neon implementation of vp9_highbd_block_error_c" into main 2023-03-14 19:31:02 +00:00
Salome Thirot be84aa14dc Add Neon implementation of vpx_highbd_satd_c
Add Neon implementation of vpx_highbd_satd_c as well as the
corresponding tests.

Change-Id: I3d50e6abdf168fb13743e7d8da9364f072308b7f
2023-03-14 09:32:42 +00:00
Salome Thirot f7dbd848e4 Optimize vpx_satd_neon
Optimize Neon implementation of vpx_satd by using ABD and UADALP instead
of ABAL and ABAL2, splitting the accumulator and using a dedicated
helper function to perform the final reduction.

Change-Id: Idcfa49e001b68b1dcd87c13fd9acc317a208cd2a
2023-03-14 09:24:39 +00:00
Salome Thirot e553e3acff Add Neon implementation of vp9_highbd_block_error_c
Add Neon implementation of vp9_highbd_block_error_c as well as the
corresponding tests.

Change-Id: Ibe0eb077f959ced0dcd7d0d8d9d529d3b5bc1874
2023-03-14 09:11:43 +00:00
Konstantinos Margaritis 29beea8243 [NEON] Add temporal filter functions, 8-bit and highbd
Both are around 3x faster than the original C version. 8-bit gives a
small 0.5% speed increase, whereas highbd gives ~2.5%.

Change-Id: I71d75ddd2757b19aa201e879fd9fa8f3a25431ad
2023-03-14 08:22:40 +00:00
James Zern d32a410880 Merge "Fix buffer overrun in highbd Neon subpel variance filters" into main 2023-03-14 00:22:31 +00:00
James Zern f40c89459f Merge "reland: quantize: simplify 32x32_b args" into main 2023-03-10 21:40:59 +00:00
Yunqing Wang d40a8608cc Merge "Add AVX2 for vpx_filter_block1d8_v8() function" into main 2023-03-10 01:02:25 +00:00
Anupam Pandey 775d594e46 Add AVX2 for vpx_filter_block1d8_v8() function
Introduced AVX2 intrinsic to compute convolve vertical for
w = 8 case. This is a bit-exact change.

                 Instruction Count
cpu   Resolution   Reduction(%)
 0       LOWRES2      1.347
 0       MIDRES2      1.046
 0        HDRES2      0.805
 0       Average      1.066

Change-Id: Idf77fff054beaf2c985b9bf2335591bda47e811f
2023-03-09 16:50:40 +05:30
Neeraj Gadgil 4959770032 Rename function 'model_rd_for_sb_earlyterm'
Function renamed as 'build_inter_pred_model_rd_earlyterm' and
added a comment to explain its behavior.

Change-Id: I804e6273558ba36241232f62cf18ea754b85e369
2023-03-09 15:22:51 +05:30
Jonathan Wright eab52a4f3c Fix buffer overrun in highbd Neon subpel variance filters
The high bitdepth Neon code applying the first pass of the bilinear
filter for subpixel variance on blocks of width 4 processed two rows
at a time. This resulted in a source buffer overread, attempting to
produce two rows of padding for the second (vertical) pass of the
bilinear filter.

This patch modifies highbd_var_filter_block2d_bil_w4 and
highbd_avg_pred_var_filter_block2d_bil_w4 such that they only process
a single row per iteration, and only require a single row of padding
for the second pass. This prevents the buffer overread.

Since all block sizes are now processed one row at a time, there is
no need for a "padding" macro parameter - the value is always 1, with
no special case for 4xh blocks. As well as re-enabling the Neon paths
and their associated tests, we remove the now-redundant 'padding'
macro parameter.
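
The row accounting can be sketched in scalar C. The helper names and
the 1/8-pel offset convention are assumptions for illustration, not the
Neon code itself. The first pass emits h + 1 rows, one at a time, so
the second pass can read row i + 1 for every output row i and the
source is never read beyond row h:

```c
#include <assert.h>
#include <stdint.h>

/* One bilinear pass over `rows` rows of `w` pixels. pixel_step is 1 for
 * the horizontal pass and the row stride for the vertical pass.
 * `offset` is in 1/8 pixel units (an assumption of this sketch). */
static void bilinear_pass(const uint16_t *src, int src_stride,
                          int pixel_step, uint16_t *dst, int w, int rows,
                          int offset) {
  const int f0 = 128 - 16 * offset;
  const int f1 = 16 * offset;
  int i, j;
  assert(offset >= 0 && offset < 8);
  for (i = 0; i < rows; ++i) {
    for (j = 0; j < w; ++j) {
      dst[j] =
          (uint16_t)((src[j] * f0 + src[j + pixel_step] * f1 + 64) >> 7);
    }
    src += src_stride;
    dst += w;
  }
}

/* 4xh prediction: tmp holds 4 * (h + 1) samples, dst holds 4 * h. */
static void subpel_4xh(const uint16_t *src, int src_stride, int xoff,
                       int yoff, int h, uint16_t *tmp, uint16_t *dst) {
  bilinear_pass(src, src_stride, 1, tmp, 4, h + 1, xoff); /* 1 pad row */
  bilinear_pass(tmp, 4, 4, dst, 4, h, yoff);
}
```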

Bug: webm:1796
Change-Id: Icd6076b38eb4476139795bb1734ca800c9edf079
2023-03-08 23:40:14 +00:00
James Zern 79b1347a51 Merge "disable vpx_highbd_*_sub_pixel_avg_variance4x{4,8}_neon" into main 2023-03-08 23:05:08 +00:00
James Zern 7a47294675 Merge "Optimize vpx_sum_squares_2d_i16_neon" into main 2023-03-08 21:54:30 +00:00
James Zern a47967700d disable vpx_highbd_*_sub_pixel_avg_variance4x{4,8}_neon
vpx_highbd_8_sub_pixel_avg_variance4x4_neon
vpx_highbd_8_sub_pixel_avg_variance4x8_neon
vpx_highbd_10_sub_pixel_avg_variance4x4_neon
vpx_highbd_10_sub_pixel_avg_variance4x8_neon
vpx_highbd_12_sub_pixel_avg_variance4x4_neon
vpx_highbd_12_sub_pixel_avg_variance4x8_neon

all cause heap overflows of the form:

[ RUN      ] NEON/VpxHBDSubpelAvgVarianceTest.Ref/33
=================================================================
==535205==ERROR: AddressSanitizer: heap-buffer-overflow on address
0xffff95bb0b89 at pc 0x00000116dabc bp 0xffffd09f6430 sp 0xffffd09f6428
READ of size 8 at 0xffff95bb0b89 thread T0
    #0 0x116dab8 in load_unaligned_u16q vpx_dsp/arm/mem_neon.h:176:3
    #1 0x116dab8 in highbd_var_filter_block2d_bil_w4
       vpx_dsp/arm/highbd_subpel_variance_neon.c:49:21
    #2 0x116dab8 in vpx_highbd_8_sub_pixel_avg_variance4x4_neon
       vpx_dsp/arm/highbd_subpel_variance_neon.c:543:1
    ...

0xffff95bb0b89 is located 0 bytes to the right of 73-byte region
[0xffff95bb0b40,0xffff95bb0b89)
allocated by thread T0 here:
    #0 0x5f18b0 in malloc (test_libvpx+0x5f18b0)
    #1 0xce4a40 in vpx_memalign vpx_mem/vpx_mem.c:62:10
    #2 0xce4a40 in vpx_malloc vpx_mem/vpx_mem.c:70:40
    #3 0xa52238 in (anonymous namespace)::SubpelVarianceTest<unsigned
       int (*)(unsigned char const*, int, int, int, unsigned char
               const*, int, unsigned int*, unsigned char
               const*)>::SetUp()
       test/variance_test.cc:586:14
    ...

This is the same issue as:
  e33d4c276 disable vpx_highbd_*_sub_pixel_variance4x{4,8}_neon
They have highbd_var_filter_block2d_bil_w4 in common.

Bug: webm:1796
Change-Id: I3ed70d0ba22e127720542612ea9f6665948eedfc
2023-03-08 13:17:17 -08:00
James Zern e33d4c276d disable vpx_highbd_*_sub_pixel_variance4x{4,8}_neon
vpx_highbd_8_sub_pixel_variance4x4_neon
vpx_highbd_8_sub_pixel_variance4x8_neon
vpx_highbd_10_sub_pixel_variance4x4_neon
vpx_highbd_10_sub_pixel_variance4x8_neon
vpx_highbd_12_sub_pixel_variance4x4_neon
vpx_highbd_12_sub_pixel_variance4x8_neon

all cause heap overflows of the form:

[ RUN      ] NEON/VpxHBDSubpelVarianceTest.Ref/24
=================================================================
==450528==ERROR: AddressSanitizer: heap-buffer-overflow on address
0xffff8311a571 at pc 0x0000010ca52c bp 0xffffc63e96b0 sp 0xffffc63e96a8
READ of size 8 at 0xffff8311a571 thread T0
    #0 0x10ca528 in load_unaligned_u16q vpx_dsp/arm/mem_neon.h:176:3
    #1 0x10ca528 in highbd_var_filter_block2d_bil_w4
       vpx_dsp/arm/highbd_subpel_variance_neon.c:49:21
    #2 0x10ca528 in vpx_highbd_10_sub_pixel_variance4x8_neon
       vpx_dsp/arm/highbd_subpel_variance_neon.c:257:1
    ...

0xffff8311a571 is located 0 bytes to the right of 113-byte region
[0xffff8311a500,0xffff8311a571)
allocated by thread T0 here:
    #0 0x5f18b0 in malloc (test_libvpx+0x5f18b0)
    #1 0xce4f90 in vpx_memalign vpx_mem/vpx_mem.c:62:10
    #2 0xce4f90 in vpx_malloc vpx_mem/vpx_mem.c:70:40
    #3 0xa4ad44 in (anonymous namespace)::SubpelVarianceTest<unsigned
       int (*)(unsigned char const*, int, int, int, unsigned char
       const*, int, unsigned int*)>::SetUp() test/variance_test.cc:586:14

Bug: webm:1796
Change-Id: I39f7f936bae2bcbbe1f803fb10375ec02d1c1277
2023-03-07 22:16:56 -08:00
James Zern 0f17aa986a Merge "[SSE4_1] Fix overflow in highbd temporal_filter" into main 2023-03-07 23:40:10 +00:00
James Zern ccdcba6dc9 Merge changes I79247b5a,Ic6016cf8,Ibab7ec5f into main
* changes:
  Add Neon implementation of vp9_block_error_c
  Fix return type of horizontal_add_int64x2 helper
  Optimize vp9_block_error_fp_neon
2023-03-07 23:00:19 +00:00
James Zern 13bd85f687 Merge changes Ic021e82e,I2bce6f19,I250ab56e,I910692b1,Iefaa774d into main
* changes:
  Implement highbd_d207_predictor using Neon
  Implement highbd_d153_predictor using Neon
  Implement d207_predictor using Neon
  Implement d153_predictor using Neon
  Implement highbd_d63_predictor using Neon
2023-03-07 22:48:54 +00:00
Yunqing Wang 8874873bef Merge "Add AVX2 for vpx_filter_block1d8_h8() function" into main 2023-03-07 16:40:52 +00:00
Yunqing Wang ba3d606630 Merge "Use cb pattern for interp eval when filter is not switchable" into main 2023-03-07 16:37:22 +00:00
Yunqing Wang f138a4004d Merge "Early terminate interp filt search based on best RD cost" into main 2023-03-07 16:35:18 +00:00
Anupam Pandey b7fabadc5d Add AVX2 for vpx_filter_block1d8_h8() function
Introduced AVX2 intrinsic to compute convolve horizontal for
w = 8 case. This is a bit-exact change.

                 Instruction Count
cpu   Resolution   Reduction(%)
 0       LOWRES2      1.509
 0       MIDRES2      1.165
 0        HDRES2      0.898
 0       Average      1.191

Change-Id: I699c94aa3d7ea74c58f901df906eed0b81b4ee79
2023-03-07 18:20:30 +05:30
Salome Thirot eec4808393 Add Neon implementation of vp9_block_error_c
Add Neon implementation of vp9_block_error_c as well as the
corresponding tests.

Change-Id: I79247b5ae24f51b7b55fc5e517d5e403dc86367a
2023-03-07 12:04:25 +00:00
Salome Thirot 57c6ea9752 Fix return type of horizontal_add_int64x2 helper
horizontal_add_int64x2 was incorrectly returning a uint64_t instead of
an int64_t. This patch fixes that.

Change-Id: Ic6016cf87aebfc6a14f540b784d6648757e12b49
2023-03-07 11:34:05 +00:00
Salome Thirot 5ae84ea5ae Optimize vp9_block_error_fp_neon
Currently vp9_block_error_fp_neon is only used when
CONFIG_VP9_HIGHBITDEPTH is set to false. This patch optimizes the
implementation and uses tran_low_t instead of int16_t so that the
function can also be used in builds where vp9_highbitdepth is enabled.

Change-Id: Ibab7ec5f74b7652fa2ae5edf328f9ec587088fd3
2023-03-07 11:29:31 +00:00
Neeraj Gadgil b9933679bf Use cb pattern for interp eval when filter is not switchable
This CL uses a checkerboard pattern for interp filter eval when
the filter is not switchable.

                 Instruction Count        BD-Rate Loss(%)
cpu   Resolution   Reduction(%)    avg.psnr   ovr.psnr    ssim
 0       LOWRES2      0.725         0.0017    -0.0000    0.0192
 0       MIDRES2      0.968         0.0004     0.0504    0.0810
 0        HDRES2      1.135         0.0089     0.0130    0.0113
 0       Average      0.943         0.0037     0.0211    0.0372

STATS_CHANGED

Change-Id: Ia713e5170101302f264ffaa2350bc0ab15c27090
2023-03-07 12:39:10 +05:30
Neeraj Gadgil f2210fd290 Early terminate interp filt search based on best RD cost
The CL prunes interpolation filter search based on rdcost of
individual planes.

                 Instruction Count        BD-Rate Loss(%)
cpu   Resolution   Reduction(%)    avg.psnr   ovr.psnr    ssim
 0       LOWRES2      1.613         0.0143     0.0208    0.0146
 0       MIDRES2      1.637         0.0214    -0.0316    0.0036
 0        HDRES2      1.369         0.0171     0.0178    0.1222
 0       Average      1.539         0.0176     0.0023    0.0468

STATS_CHANGED

Change-Id: I4be30bd1c7bbbc93c6bbc840565893a97d2598a4
2023-03-07 12:36:00 +05:30
James Zern 2d7d2fcf7b Merge "Fix heap buffer overrun in vpx_get4x4sse_cs_neon" into main 2023-03-07 05:45:53 +00:00
James Zern 925b8156d9 Merge changes I05dc4d43,Ia0977ff0 into main
* changes:
  Fix potential buffer over-read in highbd d117 predictor Neon
  Implement d117_predictor using Neon
2023-03-07 01:28:15 +00:00
Jonathan Wright 5a2bb12c52 Fix heap buffer overrun in vpx_get4x4sse_cs_neon
Use a mem_neon.h helper to do strided 4-byte loads instead of Neon
8-byte loads - where the last 4 bytes are out of bounds.

Re-enable the Neon code path and the tests.
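
As a scalar illustration of the idea (the function name is hypothetical
and this is not the mem_neon.h helper), loading exactly 4 bytes per row
means the last row of a tightly sized 4x4 buffer is never read past its
end:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sum of squared differences over a 4x4 block, touching exactly 4 bytes
 * per row of each input. */
static unsigned int sse_4x4(const uint8_t *src, int src_stride,
                            const uint8_t *ref, int ref_stride) {
  unsigned int sse = 0;
  int r, c;
  assert(src != NULL && ref != NULL);
  for (r = 0; r < 4; ++r) {
    uint8_t s[4], p[4];
    memcpy(s, src + r * src_stride, 4); /* 4-byte load, not 8 */
    memcpy(p, ref + r * ref_stride, 4);
    for (c = 0; c < 4; ++c) {
      const int diff = s[c] - p[c];
      sse += (unsigned int)(diff * diff);
    }
  }
  return sse;
}
```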

Bug: webm:1794
Change-Id: I69ccff730f4a5cbf585dd6a9aa0f3eb13e150074
2023-03-07 00:05:10 +00:00
James Zern d94e16404a vpx_convolve_copy_neon: fix unaligned loads w/w==4
Fixes a -fsanitize=undefined warning:

vpx_dsp/arm/vpx_convolve_copy_neon.c:29:26: runtime error: load of
misaligned address 0xffffa8242bea for type 'const uint32_t' (aka 'const
unsigned int'), which requires 4 byte alignment
0xffffa8242bea: note: pointer points here
 88 81  7d 7d 7d 7d 7d 81 81 7d  81 80 87 97 a8 ab a0 91 ...
              ^
    #0 0xb0447c in vpx_convolve_copy_neon
       vpx_dsp/arm/vpx_convolve_copy_neon.c:29:26
    #1 0x12285c8 in inter_predictor vp9/common/vp9_reconinter.h:29:3
    #2 0x1228430 in dec_build_inter_predictors
       vp9/decoder/vp9_decodeframe.c
    ...
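
One common way to avoid this class of warning in portable C is to go
through memcpy instead of dereferencing a cast `uint32_t` pointer. The
sketch below uses hypothetical helper names and does not claim to be
the approach taken in the Neon file:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads 4 bytes from any address without requiring 4-byte alignment. */
static uint32_t load_u32_unaligned(const uint8_t *p) {
  uint32_t v;
  memcpy(&v, p, sizeof(v));
  return v;
}

/* Writes 4 bytes to any address without requiring 4-byte alignment. */
static void store_u32_unaligned(uint8_t *p, uint32_t v) {
  memcpy(p, &v, sizeof(v));
}

/* Copies a block that is 4 pixels wide and h rows tall. */
static void copy_w4(const uint8_t *src, ptrdiff_t src_stride, uint8_t *dst,
                    ptrdiff_t dst_stride, int h) {
  int r;
  assert(h > 0);
  for (r = 0; r < h; ++r) {
    store_u32_unaligned(dst, load_u32_unaligned(src));
    src += src_stride;
    dst += dst_stride;
  }
}
```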

Change-Id: Iaec4ac2a400b6e6db72d12e5a7acb316262b12a7
2023-03-06 15:19:31 -08:00
Jonathan Wright 6b783c6975 Optimize vpx_sum_squares_2d_i16_neon
Add an additional 32-bit vector accumulator to allow parallel
processing on CPUs that have more than one Neon multiply-accumulate
pipeline. Also use sum_neon.h horizontal-add helpers for reduction.
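
The accumulator split can be sketched in scalar C. This is an
illustration only: the names are hypothetical, and it uses 64-bit
scalar accumulators rather than the 32-bit vector accumulators
described above. Two independent sums have no data dependency on each
other, so their multiply-accumulates can proceed in parallel:

```c
#include <assert.h>
#include <stdint.h>

/* Sum of squares over a size x size block of int16_t values, using two
 * independent accumulators that are combined once at the end. */
static uint64_t sum_squares_2d_i16(const int16_t *src, int stride,
                                   int size) {
  uint64_t acc0 = 0, acc1 = 0;
  int r, c;
  assert(size > 0 && size % 2 == 0);
  for (r = 0; r < size; ++r) {
    const int16_t *row = src + r * stride;
    for (c = 0; c < size; c += 2) {
      acc0 += (uint64_t)((int32_t)row[c] * row[c]);
      acc1 += (uint64_t)((int32_t)row[c + 1] * row[c + 1]);
    }
  }
  return acc0 + acc1; /* final reduction */
}
```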

Change-Id: Ibcb48a738f5dee1430c3ebcd305b5ea8ea344c40
2023-03-06 18:34:23 +00:00
George Steed 9e35c35945 Implement highbd_d207_predictor using Neon
Add Neon implementations of the highbd d207 predictor for 4x4, 8x8,
16x16 and 32x32 block sizes. Also update tests to add new corresponding
cases.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    1.61
Neoverse N1 |  LLVM 15 |   8x8 |    5.30
Neoverse N1 |  LLVM 15 | 16x16 |    8.93
Neoverse N1 |  LLVM 15 | 32x32 |    8.35
Neoverse N1 |   GCC 12 |   4x4 |    2.16
Neoverse N1 |   GCC 12 |   8x8 |    5.75
Neoverse N1 |   GCC 12 | 16x16 |    7.28
Neoverse N1 |   GCC 12 | 32x32 |    3.31
Neoverse V1 |  LLVM 15 |   4x4 |    1.71
Neoverse V1 |  LLVM 15 |   8x8 |    7.46
Neoverse V1 |  LLVM 15 | 16x16 |   10.09
Neoverse V1 |  LLVM 15 | 32x32 |    8.10
Neoverse V1 |   GCC 12 |   4x4 |    1.99
Neoverse V1 |   GCC 12 |   8x8 |    7.81
Neoverse V1 |   GCC 12 | 16x16 |    8.34
Neoverse V1 |   GCC 12 | 32x32 |    5.74

Change-Id: Ic021e82eed0c7bc8263eb68606411354eb5e4870
2023-03-06 13:35:45 +00:00
George Steed cf85ae9a49 Implement highbd_d153_predictor using Neon
Add Neon implementations of the highbd d153 predictor for 4x4, 8x8,
16x16 and 32x32 block sizes. Also update tests to add new corresponding
cases.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    1.71
Neoverse N1 |  LLVM 15 |   8x8 |    4.05
Neoverse N1 |  LLVM 15 | 16x16 |    7.04
Neoverse N1 |  LLVM 15 | 32x32 |    7.71
Neoverse N1 |   GCC 12 |   4x4 |    1.84
Neoverse N1 |   GCC 12 |   8x8 |    4.19
Neoverse N1 |   GCC 12 | 16x16 |    6.07
Neoverse N1 |   GCC 12 | 32x32 |    3.14
Neoverse V1 |  LLVM 15 |   4x4 |    3.19
Neoverse V1 |  LLVM 15 |   8x8 |    5.51
Neoverse V1 |  LLVM 15 | 16x16 |    7.73
Neoverse V1 |  LLVM 15 | 32x32 |    7.72
Neoverse V1 |   GCC 12 |   4x4 |    3.97
Neoverse V1 |   GCC 12 |   8x8 |    5.52
Neoverse V1 |   GCC 12 | 16x16 |    6.31
Neoverse V1 |   GCC 12 | 32x32 |    5.36

Change-Id: I2bce6f1921d76d1c10d163e0cd4f395b40799184
2023-03-06 13:35:27 +00:00
George Steed 33f3ae3414 Fix potential buffer over-read in highbd d117 predictor Neon
The load of `left[bs]` in the standard bitdepth d117 Neon implementation
triggered an address-sanitizer failure.

The highbd equivalent does not appear to trigger any asan failures when
running the VP9/ExternalFrameBufferMD5Test or
VP9/TestVectorTest.MD5Match tests, but for consistency with the standard
bitdepth implementation we adjust it to avoid the over-read.

Performance is roughly identical, with a 0.8% performance improvement on
average over the previous optimised code.

Change-Id: I05dc4d43f244f4915c0ccc52cc0af999bbacb018
2023-03-06 13:34:35 +00:00
George Steed 872476c66b Implement d207_predictor using Neon
Add Neon implementations of the d207 predictor for 4x4, 8x8, 16x16 and
32x32 block sizes. Also update tests to add new corresponding cases.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    1.72
Neoverse N1 |  LLVM 15 |   8x8 |    5.68
Neoverse N1 |  LLVM 15 | 16x16 |   12.30
Neoverse N1 |  LLVM 15 | 32x32 |   16.70
Neoverse N1 |   GCC 12 |   4x4 |    1.71
Neoverse N1 |   GCC 12 |   8x8 |    6.01
Neoverse N1 |   GCC 12 | 16x16 |   12.40
Neoverse N1 |   GCC 12 | 32x32 |    6.71
Neoverse V1 |  LLVM 15 |   4x4 |    1.99
Neoverse V1 |  LLVM 15 |   8x8 |    8.28
Neoverse V1 |  LLVM 15 | 16x16 |   14.36
Neoverse V1 |  LLVM 15 | 32x32 |   17.55
Neoverse V1 |   GCC 12 |   4x4 |    1.99
Neoverse V1 |   GCC 12 |   8x8 |    8.43
Neoverse V1 |   GCC 12 | 16x16 |   14.41
Neoverse V1 |   GCC 12 | 32x32 |    7.82

Change-Id: I250ab56edab3390b0bac9dc96995a4bf9a4da641
2023-03-06 13:34:35 +00:00
George Steed 7e88600bf9 Implement d117_predictor using Neon
Add Neon implementations of the d117 predictor for 4x4, 8x8, 16x16 and
32x32 block sizes. Also update tests to add new corresponding cases.

This re-lands commit 360e9069b6,
previously reverted in commit 394de691a0.

The implementation is mostly identical to the original but with an
adjustment to how data is loaded from the `left` array. In particular
the left array cannot be guaranteed to be larger than the block size, so
the read of e.g. `left[32]` in the `bs=32` case is not valid. This turns
out not to be a problem since the last lane loaded in this case is
unused. I have added comments in the code to explain why this is the
case.

Since we cannot load the last element directly, we instead construct it
from the previous aligned read. This seems to have an inconsistent
effect on performance, improving by up to 10% in some cases and
regressing by up to 10% in others. Either way it is still significantly
faster than the original C code.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    1.88
Neoverse N1 |  LLVM 15 |   8x8 |    5.19
Neoverse N1 |  LLVM 15 | 16x16 |    9.63
Neoverse N1 |  LLVM 15 | 32x32 |   13.85
Neoverse N1 |   GCC 12 |   4x4 |    2.04
Neoverse N1 |   GCC 12 |   8x8 |    4.62
Neoverse N1 |   GCC 12 | 16x16 |    9.79
Neoverse N1 |   GCC 12 | 32x32 |    4.69
Neoverse V1 |  LLVM 15 |   4x4 |    1.75
Neoverse V1 |  LLVM 15 |   8x8 |    6.71
Neoverse V1 |  LLVM 15 | 16x16 |    9.62
Neoverse V1 |  LLVM 15 | 32x32 |   13.81
Neoverse V1 |   GCC 12 |   4x4 |    1.75
Neoverse V1 |   GCC 12 |   8x8 |    6.01
Neoverse V1 |   GCC 12 | 16x16 |    6.91
Neoverse V1 |   GCC 12 | 32x32 |    4.39

Change-Id: Ia0977ff0b0eba2c41c7884b64e7c22ff9bc9549d
2023-03-06 13:34:35 +00:00
George Steed 8b0a60f91c Implement d153_predictor using Neon
Add Neon implementations of the d153 predictor for 4x4, 8x8, 16x16 and
32x32 block sizes. Also update tests to add new corresponding cases.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    1.59
Neoverse N1 |  LLVM 15 |   8x8 |    4.46
Neoverse N1 |  LLVM 15 | 16x16 |    8.77
Neoverse N1 |  LLVM 15 | 32x32 |   15.21
Neoverse N1 |   GCC 12 |   4x4 |    1.90
Neoverse N1 |   GCC 12 |   8x8 |    4.70
Neoverse N1 |   GCC 12 | 16x16 |    9.55
Neoverse N1 |   GCC 12 | 32x32 |    5.95
Neoverse V1 |  LLVM 15 |   4x4 |    2.89
Neoverse V1 |  LLVM 15 |   8x8 |    6.94
Neoverse V1 |  LLVM 15 | 16x16 |   10.20
Neoverse V1 |  LLVM 15 | 32x32 |   15.63
Neoverse V1 |   GCC 12 |   4x4 |    4.45
Neoverse V1 |   GCC 12 |   8x8 |    7.71
Neoverse V1 |   GCC 12 | 16x16 |    9.08
Neoverse V1 |   GCC 12 | 32x32 |    7.93

Change-Id: I910692b14917cde8a8952fab5b9c78bed7f7c6ad
2023-03-06 13:34:35 +00:00
George Steed 6282757546 Implement highbd_d63_predictor using Neon
Add Neon implementations of the highbd d63 predictor for 4x4, 8x8, 16x16
and 32x32 block sizes. Also update tests to add new corresponding cases.

This re-lands commit 7cdf139e3d,
previously reverted in 7478b7e4e4.

Compared to the previous implementation attempt we now correctly match
the behaviour of the C code when handling the final element loaded from
the 'above' input array. In particular:

- The C code for a 4x4 block performs a full average of the last element
  rather than duplicating the final element from the input 'above'
  array.

- The C code for other block sizes performs a full average for
  stride=0 and stride=1, and otherwise shifts in duplicates of the final
  element from the input 'above' array. Notably this shifting for later
  strides _replaces_ the final element which we previously performed an
  average on (see {d0,d1}_ext in the code).

It is worth noting that this difference is not caught by the existing
VP9HighbdIntraPredTest test cases since the test vector initialisation
contains this loop:

    for (int x = block_size; x < 2 * block_size; x++) {
        above_row_[x] = above_row_[block_size - 1];
    }

Since AVG2(a, a) and AVG3(a, a, a) are simply 'a', such differences in
behaviour for the final element are not observed.
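
The masking effect described above can be sketched in plain C. This is only an illustration: AVG2 and AVG3 are written here as the usual rounding averages (assumed to match the C predictor code), and pad_above_row, edge_full_average and edge_duplicate are hypothetical helpers that do not exist in libvpx.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding averages, assumed to match the macros used by the C predictors. */
#define AVG2(a, b) (((a) + (b) + 1) >> 1)
#define AVG3(a, b, c) (((a) + 2 * (b) + (c) + 2) >> 2)

/* Mirrors the test initialisation loop quoted above: the second half of the
 * row is filled with copies of the last real element. */
static void pad_above_row(uint16_t *above_row, int block_size) {
  assert(block_size > 0);
  for (int x = block_size; x < 2 * block_size; x++) {
    above_row[x] = above_row[block_size - 1];
  }
}

/* Hypothetical "full average" of the last in-block element with its
 * right-hand neighbour. */
static int edge_full_average(const uint16_t *above_row, int block_size) {
  return AVG2(above_row[block_size - 1], above_row[block_size]);
}

/* Hypothetical "duplicate the final element" behaviour. */
static int edge_duplicate(const uint16_t *above_row, int block_size) {
  return above_row[block_size - 1];
}
```

With a padded row both helpers return the same value, so a comparison against the C output cannot tell them apart. With independent data beyond block_size they diverge.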

Tested on AArch64 with:

- ./test_libvpx --gtest_filter="*VP9HighbdIntraPredTest*"
- ./test_libvpx --gtest_filter="*VP9/TestVectorTest.MD5Match*"
- ./test_libvpx --gtest_filter="*VP9/ExternalFrameBufferMD5Test*"

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    2.43
Neoverse N1 |  LLVM 15 |   8x8 |    3.92
Neoverse N1 |  LLVM 15 | 16x16 |    3.19
Neoverse N1 |  LLVM 15 | 32x32 |    4.13
Neoverse N1 |   GCC 12 |   4x4 |    2.92
Neoverse N1 |   GCC 12 |   8x8 |    6.51
Neoverse N1 |   GCC 12 | 16x16 |    4.55
Neoverse N1 |   GCC 12 | 32x32 |    3.18
Neoverse V1 |  LLVM 15 |   4x4 |    1.99
Neoverse V1 |  LLVM 15 |   8x8 |    3.65
Neoverse V1 |  LLVM 15 | 16x16 |    3.72
Neoverse V1 |  LLVM 15 | 32x32 |    3.26
Neoverse V1 |   GCC 12 |   4x4 |    2.39
Neoverse V1 |   GCC 12 |   8x8 |    4.76
Neoverse V1 |   GCC 12 | 16x16 |    3.24
Neoverse V1 |   GCC 12 | 32x32 |    2.44

Change-Id: Iefaa774d6a20388b523eaa7f5df6bc5f5cf249e4
2023-03-06 13:34:35 +00:00
Johann 0384a2aab7 reland: quantize: simplify 32x32_b args
Allocate mb_plane_ on the heap to ensure src is aligned.

Now that all the implementations of the 32x32 quantize are in
intrinsics we can reference struct members directly. Saves
pushing them to the stack.

n_coeffs is not used at all for this function.

Change-Id: Ib551f7f583977602504d962b72063bc6eda9dda9
2023-03-06 09:16:04 +09:00
James Zern 5fae248f2a disable vp8_sixtap_predict16x16_neon
This causes various buffer overflows in the tests:

[ RUN      ] NEON/SixtapPredictTest.TestWithPresetData/0
=================================================================
==22346==ERROR: AddressSanitizer: global-buffer-overflow on address
0x0000012b4a5b at pc 0x000000df0f60 bp 0xffffcf6e64b0 sp 0xffffcf6e64a8
READ of size 8 at 0x0000012b4a5b thread T0
    #0 0xdf0f5c in vp8_sixtap_predict16x16_neon
       vp8/common/arm/neon/sixtappredict_neon.c:1507:13
    #1 0x8819e4 in (anonymous
        namespace)::SixtapPredictTest_TestWithPresetData_Test::TestBody()
       test/predict_test.cc:293:3
    ...

0x0000012b4a5b is located 2 bytes to the right of global variable
'kTestData' defined in '../test/predict_test.cc:237:24' (0x12b48a0) of
size 441

[ RUN      ] NEON/SixtapPredictTest.TestWithRandomData/0
=================================================================
==22338==ERROR: AddressSanitizer: heap-buffer-overflow on address
0xffff8b5321fb at pc 0x000000df0f60 bp 0xfffff7e0cf30 sp 0xfffff7e0cf28
READ of size 8 at 0xffff8b5321fb thread T0
    #0 0xdf0f5c in vp8_sixtap_predict16x16_neon
       vp8/common/arm/neon/sixtappredict_neon.c:1507:13
    #1 0x87d4c0 in (anonymous
       namespace)::PredictTestBase::TestWithRandomData(void (*)(unsigned
       char*, int, int, int, unsigned char*, int))
       test/predict_test.cc:170:9
    ...

0xffff8b5321fb is located 2 bytes to the right of 441-byte region
[0xffff8b532040,0xffff8b5321f9)
allocated by thread T0 here:
    #0 0x5fd4f0 in operator new[](unsigned long) (test_libvpx+0x5fd4f0)
    #1 0x87c2e0 in (anonymous namespace)::PredictTestBase::SetUp()
       test/predict_test.cc:47:12
    #2 0x87d074 in non-virtual thunk to (anonymous
       namespace)::PredictTestBase::SetUp() test/predict_test.cc
    ...

Bug: webm:1795
Change-Id: I32213a381eef91547d00f88acf90f1cf2ec2ea75
2023-03-03 15:33:16 -08:00
James Zern f5dfa780ce disable vpx_get4x4sse_cs_neon
This function causes a heap overflow in the tests:
[ RUN      ] NEON/VpxSseTest.RefSse/0
=================================================================
==876922==ERROR: AddressSanitizer: heap-buffer-overflow on address
0xffff8949d903 at pc 0x000000dd95d4 bp 0xfffffdd7f260 sp 0xfffffdd7f258
READ of size 8 at 0xffff8949d903 thread T0
    #0 0xdd95d0 in vpx_get4x4sse_cs_neon
       vpx_dsp/arm/variance_neon.c:556:10
    #1 0x9d4894 in (anonymous namespace)::MainTestClass<unsigned int
       (*)(unsigned char const*, int, unsigned char const*,
           int)>::RefTestSse() test/variance_test.cc:531:5
    #2 0x9d4894 in (anonymous
       namespace)::VpxSseTest_RefSse_Test::TestBody()
           test/variance_test.cc:772:30
    ...

0xffff8949d903 is located 3 bytes to the right of 16-byte region
[0xffff8949d8f0,0xffff8949d900)
allocated by thread T0 here:
    #0 0x5fd050 in operator new[](unsigned long) (test_libvpx+0x5fd050)
    #1 0x9d3e04 in (anonymous namespace)::MainTestClass<unsigned int
       (*)(unsigned char const*, int, unsigned char const*,
           int)>::SetUp() test/variance_test.cc:299:12

Bug: webm:1794
Change-Id: I4bc681eb9a436743ef8bfe2a2abae59ce754309c
2023-03-03 13:24:02 -08:00
James Zern 394de691a0 Revert "Implement d117_predictor using Neon"
This reverts commit 360e9069b6.

This causes ASan errors:
[ RUN      ] VP9/TestVectorTest.MD5Match/1
=================================================================
==837858==ERROR: AddressSanitizer: stack-buffer-overflow on address
0xffff82ecad40 at pc 0x000000c494d4 bp 0xffffe1695800 sp 0xffffe16957f8
READ of size 16 at 0xffff82ecad40 thread T0
    #0 0xc494d0 in vpx_d117_predictor_32x32_neon (test_libvpx+0xc494d0)
    #1 0x1040b34 in vp9_predict_intra_block (test_libvpx+0x1040b34)
    #2 0xf8feec in decode_block (test_libvpx+0xf8feec)
    #3 0xf8f588 in decode_partition (test_libvpx+0xf8f588)
    #4 0xf7be5c in vp9_decode_frame (test_libvpx+0xf7be5c)
    ...
Address 0xffff82ecad40 is located in stack of thread T0 at offset 64 in
frame
    #0 0x103fd3c in vp9_predict_intra_block (test_libvpx+0x103fd3c)

  This frame has 2 object(s):
    [32, 64) 'left_col.i' <== Memory access at offset 64 overflows this
                              variable
    [96, 176) 'above_data.i'

Change-Id: I058213364617dfe1036126c33a3307f8288d9ae0
2023-03-03 12:34:36 -08:00
Johann ca0c51f05f Revert "Allow macroblock_plane to have its own rounding buffer"
This reverts commit 5359ae810c.

Reason for revert: Blocks quantize cleanups

Original change's description:
> Allow macroblock_plane to have its own rounding buffer
>
> Add 8 bytes buffer to macroblock_plane to support rounding factor.
>
> Change-Id: I3751689e4449c0caea28d3acf6cd17d7f39508ed

Change-Id: Ia2424d2114207370f0b45350313a5ff8521d25a8
2023-03-03 06:24:41 +00:00
Konstantinos Margaritis 817248e1be [SSE4_1] Fix overflow in highbd temporal_filter
While porting this function to NEON, using the SSE4_1 implementation
as a base, I noticed that both produced files with different
checksums from the C reference implementation. After investigating
further I found that this saturating pack was the culprit. Doing
the multiplication on the 32-bit values instead produces results
that match the C implementation.
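
A scalar sketch of why the order matters, assuming the pack saturates to the unsigned 16-bit range. The helper names and the example values are hypothetical and are not taken from the filter itself.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a 32-bit value to the unsigned 16-bit range, as a saturating pack
 * would do for each lane. */
static uint16_t saturate_u16(uint32_t v) {
  return v > UINT16_MAX ? (uint16_t)UINT16_MAX : (uint16_t)v;
}

/* Narrow first, then multiply: information is lost once the value clamps. */
static uint32_t scale_after_pack(uint32_t v, uint32_t m) {
  return (uint32_t)saturate_u16(v) * m;
}

/* Multiply while the value is still 32 bits wide. */
static uint32_t scale_in_32bit(uint32_t v, uint32_t m) {
  assert(m == 0 || v <= UINT32_MAX / m); /* caller keeps the product in range */
  return v * m;
}
```

For inputs that fit in 16 bits the two orders agree. Once an intermediate exceeds 65535, the packed path silently diverges from the 32-bit one.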

Change-Id: I40c2a36551b2db363a58ea9aa19ef327f2676de3
2023-03-02 00:02:16 +00:00
James Zern 508bfc1ff4 Revert "quantize: simplify 32x32_b args"
This reverts commit 848f6e7337.

This has alignment issues, causing crashes in the tests:
SSSE3/VP9QuantizeTest.EOBCheck/*

Change-Id: Ic12014ab0a78ed3cde02d642509061552cdc8fc9
2023-03-01 15:54:49 -08:00
James Zern e4b423e140 Revert "quantize: simplify highbd 32x32_b args"
This reverts commit 573f5e662b.

This has alignment issues, causing crashes in the tests:
SSSE3/VP9QuantizeTest.EOBCheck/*

Change-Id: Ibf05e6b116c46f6e2c11187b3e3578bbd2d2c227
2023-03-01 15:54:48 -08:00
James Zern d98a7b8bd9 Revert "quantize: use scan_order instead of passing scan/iscan"
This reverts commit 14fc40040f.

This has alignment issues, causing crashes in the tests:
SSSE3/VP9QuantizeTest.EOBCheck/*

Change-Id: I934f9a4c3ce3db33058a65180fa645c8649c3670
2023-03-01 15:54:46 -08:00
James Zern 0e7804ca30 Merge "Optimize Neon implementation of high bitdepth MSE functions" into main 2023-03-01 23:13:34 +00:00
James Zern 7478b7e4e4 Revert "Implement highbd_d63_predictor using Neon"
This reverts commit 7cdf139e3d.

This causes failures in the VP9/ExternalFrameBufferMD5Test and
VP9/TestVectorTest.MD5Match tests in both armv7 and aarch64 builds.

Change-Id: I7ac4ba0ddc70e7e7860df9f962e6658defe1cdd5
2023-03-01 12:17:00 -08:00
Salome Thirot 096cd0ba8a Optimize Neon implementation of high bitdepth MSE functions
Currently MSE functions just call the variance helpers but don't
actually use the computed sum. This patch adds dedicated helpers to
perform the computation of sse.

Add the corresponding tests as well.
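
A scalar sketch of the idea, assuming 16-bit high bitdepth samples. sum_and_sse and sse_only are hypothetical names; the real helpers are Neon code and are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Variance-style helper: accumulates both the sum of differences and the
 * sum of squared differences. */
static void sum_and_sse(const uint16_t *src, const uint16_t *ref, int n,
                        int64_t *sum, uint64_t *sse) {
  assert(n >= 0);
  *sum = 0;
  *sse = 0;
  for (int i = 0; i < n; ++i) {
    const int32_t diff = (int32_t)src[i] - (int32_t)ref[i];
    *sum += diff;
    *sse += (uint64_t)((int64_t)diff * diff);
  }
}

/* Dedicated MSE-style helper: only the squared differences are needed, so
 * the sum accumulator and its work are dropped. */
static uint64_t sse_only(const uint16_t *src, const uint16_t *ref, int n) {
  uint64_t sse = 0;
  assert(n >= 0);
  for (int i = 0; i < n; ++i) {
    const int32_t diff = (int32_t)src[i] - (int32_t)ref[i];
    sse += (uint64_t)((int64_t)diff * diff);
  }
  return sse;
}
```

Both helpers return the same sse. The second simply avoids computing a value that the caller would discard.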

Change-Id: I96a8590e3410e84d77f7187344688e02efe03902
2023-03-01 13:35:03 +00:00
Johann 14fc40040f quantize: use scan_order instead of passing scan/iscan
Further reduces the arguments for the 32x32 version. This will be applied to
the base version as well.

Change-Id: I25a162b5248b14af53d9e20c6a7fa2a77028a6d1
2023-03-01 07:48:01 +09:00
Johann 573f5e662b quantize: simplify highbd 32x32_b args
Change-Id: I431a41279c4c4193bc70cfe819da6ea7e1d2fba1
2023-03-01 07:35:15 +09:00
James Zern 1ad49b2878 Merge changes I892fbd2c,Ic59df16c,I7228327b,Ib4a1a2cb into main
* changes:
  Implement highbd_d117_predictor using Neon
  Implement highbd_d63_predictor using Neon
  Implement d117_predictor using Neon
  Implement d63_predictor using Neon
2023-02-28 21:50:11 +00:00
James Zern 002ca3fc72 Merge "quantize: simplify 32x32_b args" into main 2023-02-28 21:40:26 +00:00
George Steed 74e4587c89 Implement highbd_d117_predictor using Neon
Add Neon implementations of the highbd d117 predictor for 4x4, 8x8,
16x16 and 32x32 block sizes. Also update tests to add new corresponding
cases.

An explanation of the general implementation strategy is given in the
8x8 implementation body, and is mostly identical to the non-highbd
version.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    1.99
Neoverse N1 |  LLVM 15 |   8x8 |    4.37
Neoverse N1 |  LLVM 15 | 16x16 |    6.81
Neoverse N1 |  LLVM 15 | 32x32 |    6.49
Neoverse N1 |   GCC 12 |   4x4 |    2.49
Neoverse N1 |   GCC 12 |   8x8 |    4.10
Neoverse N1 |   GCC 12 | 16x16 |    5.58
Neoverse N1 |   GCC 12 | 32x32 |    2.16
Neoverse V1 |  LLVM 15 |   4x4 |    1.99
Neoverse V1 |  LLVM 15 |   8x8 |    5.03
Neoverse V1 |  LLVM 15 | 16x16 |    6.61
Neoverse V1 |  LLVM 15 | 32x32 |    6.01
Neoverse V1 |   GCC 12 |   4x4 |    2.09
Neoverse V1 |   GCC 12 |   8x8 |    4.52
Neoverse V1 |   GCC 12 | 16x16 |    4.23
Neoverse V1 |   GCC 12 | 32x32 |    2.70

Change-Id: I892fbd2c17ac527ddc22b91acca907ffc84c5cd2
2023-02-28 11:46:40 +00:00
George Steed 7cdf139e3d Implement highbd_d63_predictor using Neon
Add Neon implementations of the highbd d63 predictor for 4x4, 8x8, 16x16
and 32x32 block sizes. Also update tests to add new corresponding cases.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    2.43
Neoverse N1 |  LLVM 15 |   8x8 |    4.03
Neoverse N1 |  LLVM 15 | 16x16 |    3.07
Neoverse N1 |  LLVM 15 | 32x32 |    4.11
Neoverse N1 |   GCC 12 |   4x4 |    2.92
Neoverse N1 |   GCC 12 |   8x8 |    7.20
Neoverse N1 |   GCC 12 | 16x16 |    4.43
Neoverse N1 |   GCC 12 | 32x32 |    3.18
Neoverse V1 |  LLVM 15 |   4x4 |    1.99
Neoverse V1 |  LLVM 15 |   8x8 |    3.66
Neoverse V1 |  LLVM 15 | 16x16 |    3.60
Neoverse V1 |  LLVM 15 | 32x32 |    3.29
Neoverse V1 |   GCC 12 |   4x4 |    2.39
Neoverse V1 |   GCC 12 |   8x8 |    4.76
Neoverse V1 |   GCC 12 | 16x16 |    3.29
Neoverse V1 |   GCC 12 | 32x32 |    2.43

Change-Id: Ic59df16ceeb468003754b4374be2f4d9af6589e4
2023-02-28 11:46:34 +00:00
George Steed 360e9069b6 Implement d117_predictor using Neon
Add Neon implementations of the d117 predictor for 4x4, 8x8, 16x16 and
32x32 block sizes. Also update tests to add new corresponding cases.

An explanation of the general implementation strategy is given in the
8x8 implementation body.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    1.73
Neoverse N1 |  LLVM 15 |   8x8 |    5.24
Neoverse N1 |  LLVM 15 | 16x16 |    9.77
Neoverse N1 |  LLVM 15 | 32x32 |   14.13
Neoverse N1 |   GCC 12 |   4x4 |    2.04
Neoverse N1 |   GCC 12 |   8x8 |    4.70
Neoverse N1 |   GCC 12 | 16x16 |    8.64
Neoverse N1 |   GCC 12 | 32x32 |    4.57
Neoverse V1 |  LLVM 15 |   4x4 |    1.75
Neoverse V1 |  LLVM 15 |   8x8 |    6.79
Neoverse V1 |  LLVM 15 | 16x16 |    9.16
Neoverse V1 |  LLVM 15 | 32x32 |   14.47
Neoverse V1 |   GCC 12 |   4x4 |    1.75
Neoverse V1 |   GCC 12 |   8x8 |    6.00
Neoverse V1 |   GCC 12 | 16x16 |    7.63
Neoverse V1 |   GCC 12 | 32x32 |    4.32

Change-Id: I7228327b5be27ee7a68deecafa05be0bd2a40ff4
2023-02-28 11:33:21 +00:00
George Steed a7ab16aed1 Implement d63_predictor using Neon
Add Neon implementations of the d63 predictor for 4x4, 8x8, 16x16 and
32x32 block sizes. Also update tests to add new corresponding cases.

Speedups over the C code (higher is better):

Microarch.  | Compiler | Block | Speedup
Neoverse N1 |  LLVM 15 |   4x4 |    2.10
Neoverse N1 |  LLVM 15 |   8x8 |    4.45
Neoverse N1 |  LLVM 15 | 16x16 |    4.74
Neoverse N1 |  LLVM 15 | 32x32 |    2.27
Neoverse N1 |   GCC 12 |   4x4 |    2.46
Neoverse N1 |   GCC 12 |   8x8 |   10.37
Neoverse N1 |   GCC 12 | 16x16 |   11.46
Neoverse N1 |   GCC 12 | 32x32 |    6.57
Neoverse V1 |  LLVM 15 |   4x4 |    2.24
Neoverse V1 |  LLVM 15 |   8x8 |    3.53
Neoverse V1 |  LLVM 15 | 16x16 |    4.44
Neoverse V1 |  LLVM 15 | 32x32 |    2.17
Neoverse V1 |   GCC 12 |   4x4 |    2.25
Neoverse V1 |   GCC 12 |   8x8 |    7.67
Neoverse V1 |   GCC 12 | 16x16 |    8.97
Neoverse V1 |   GCC 12 | 32x32 |    4.77

Change-Id: Ib4a1a2cb5a5c4495ae329529f8847664cbd0dfe0
2023-02-28 11:32:32 +00:00
Johann 848f6e7337 quantize: simplify 32x32_b args
Now that all the implementations of the 32x32 quantize are in
intrinsics we can reference struct members directly. Saves
pushing them to the stack.

n_coeffs is not used at all for this function.

Change-Id: I2104fea3fa20c455087e21b347d6abd7ea1f3e1e
2023-02-28 18:46:16 +09:00
James Zern 372989240d Merge "Add Neon implementations of standard bitdepth MSE functions" into main 2023-02-28 02:44:28 +00:00
James Zern c70d57c71a Merge "Optimize transpose_neon.h helper functions" into main 2023-02-28 02:36:41 +00:00
James Zern 112945ac7b tools_common,VpxInterface: remove unneeded const
Change-Id: Ic309aab2ff1750bdbcc36e8aafe05d52930ba694
2023-02-27 13:48:47 -08:00
James Zern 0824c8c556 Merge "tools_common,VpxInterface: fix interface fn ptr proto" into main 2023-02-27 19:52:18 +00:00
Salome Thirot ccc101e6bb Add Neon implementations of standard bitdepth MSE functions
Currently only vpx_mse16x16 has a Neon implementation. This patch adds
optimized Armv8.0 and Armv8.4 dot-product paths for all block sizes:
8x8, 8x16, 16x8 and 16x16.

Add the corresponding tests as well.

Change-Id: Ib0357fdcdeb05860385fec89633386e34395e260
2023-02-27 18:03:22 +00:00
Jonathan Wright b25cca8c2e Optimize transpose_neon.h helper functions
1) Use vtrn[12]q_[su]64 in vpx_vtrnq_[su]64* helpers on AArch64
   targets. This produces half as many TRN1/2 instructions compared to
   the number of MOVs that result from vcombine.

2) Use vpx_vtrnq_[su]64* helpers wherever applicable.

3) Refactor transpose_4x8_s16 to operate on 128-bit vectors.

Change-Id: I9a8b1c1fe2a98a429e0c5f39def5eb2f65759127
2023-02-27 09:49:02 +00:00
James Zern 5b2d3d5e42 tools_common,VpxInterface: fix interface fn ptr proto
Use (void) to indicate an empty parameter list and match the declaration
of vpx_codec_vp[89]_[cd]x. This fixes a cfi sanitizer error.
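
A minimal sketch of the prototype issue. codec_iface_t, demo_codec and iface_fn are hypothetical stand-ins for the real interface type and getters. Before C23, an empty `()` in a declaration leaves the parameters unspecified, so a pointer declared that way has a different function type from one declared with `(void)`, and an indirect-call check can flag the mismatch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a codec interface descriptor. */
typedef struct codec_iface {
  const char *name;
} codec_iface_t;

static const codec_iface_t kDemoIface = { "demo" };

/* Getter declared with (void): it takes no arguments. */
static const codec_iface_t *demo_codec(void) { return &kDemoIface; }

/* The pointer type spells out (void) too, so it matches the getter exactly. */
typedef const codec_iface_t *(*iface_fn)(void);

static const char *iface_name(iface_fn fn) {
  assert(fn != NULL);
  return fn()->name;
}
```

Calling iface_name(demo_codec) goes through a pointer whose type is identical to the function's declared type.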

Change-Id: I190f432eea4d1765afffd84c7458ec44d863f90c
2023-02-24 19:25:39 -08:00
James Zern 45dc0d34d2 Merge changes I65d86038,If3299fe5,I3ef1ff19 into main
* changes:
  Add Neon implementation of high bitdepth 32x32 hadamard transform
  Add Neon implementation of high bitdepth 16x16 hadamard transform
  Add Neon implementation of high bitdepth 8x8 hadamard transform
2023-02-24 17:58:15 +00:00
James Zern 3cf0568ace Merge changes Ia64d175a,Ie4ea8f0a into main
* changes:
  vp9_loop_filter_alloc: clear -Wshadow warnings
  vp9_adapt_mode_probs: clear -Wshadow warning
2023-02-24 17:49:25 +00:00
Salome Thirot 111068923b Add Neon implementation of high bitdepth 32x32 hadamard transform
Add Neon implementation of vpx_highbd_hadamard_32x32 as well as the
corresponding tests.

Change-Id: I65d8603896649de1996b353aa79eee54824b4708
2023-02-24 11:10:14 +00:00
Salome Thirot 6ec45f933c Add Neon implementation of high bitdepth 16x16 hadamard transform
Add Neon implementation of vpx_highbd_hadamard_16x16 as well as the
corresponding tests.

Change-Id: If3299fe556351dfe3db994ac171d83a95ea1504b
2023-02-24 11:09:57 +00:00
Jerome Jiang 1614895e06 Merge "vp9 rc test: change param type to bool" into main 2023-02-24 01:45:54 +00:00
Jerome Jiang 221d76ab9c vp9 rc test: change param type to bool
Change-Id: Ib45522e32d9137678da9062830044e9dd87537e5
2023-02-23 14:28:30 -05:00
Chi Yo Tsai 49807f88d6 Merge "Disable some intra modes for TX_32X32" into main 2023-02-23 18:01:05 +00:00
Salome Thirot aab93ee6b6 Add Neon implementation of high bitdepth 8x8 hadamard transform
Add Neon implementation of vpx_highbd_hadamard_8x8 as well as the
corresponding tests.

Change-Id: I3ef1ff199d76b6b010591ef15a81b0f36c9ded03
2023-02-23 17:09:52 +00:00
James Zern 76389886ee vp9_loop_filter_alloc: clear -Wshadow warnings
Bug: webm:1793
Change-Id: Ia64d175aa69dc2ecde2babf64bde04f02b32795b
2023-02-22 22:13:02 -08:00
James Zern f569a4d68c vp9_adapt_mode_probs: clear -Wshadow warning
Bug: webm:1793
Change-Id: Ie4ea8f0a3295e6f58dc6f7d5c61d46700c539d40
2023-02-22 22:08:36 -08:00
James Zern 03b97add02 Merge "vp9_block.h: rename diff struct to Diff" into main 2023-02-23 06:07:25 +00:00
chiyotsai 4ba3be9324 Disable some intra modes for TX_32X32
Performance:
| SPD_SET | TESTSET | AVG_PSNR | OVR_PSNR |  SSIM   | ENC_T |
|---------|---------|----------|----------|---------|-------|
|    0    | hdres2  | +0.036%  | +0.032%  | +0.014% | -3.9% |
|    0    | lowres2 | -0.002%  | -0.011%  | +0.020% | -3.6% |
|    0    | midres2 | +0.045%  | +0.025%  | -0.007% | -4.0% |

STATS_CHANGED

Change-Id: I75a927333d26f2a37f0dda57a641b455b845f5b9
2023-02-22 14:36:21 -08:00
James Zern 3712a5869c vpx_subpixel_8t_intrin_avx2: clear -Wshadow warnings
no changes to assembly

Bug: webm:1793
Change-Id: I6a82290cafee7f4a7909d497ccfdefd5a78fb8ed
2023-02-22 12:54:54 -08:00
James Zern 46add73f7e vp9_block.h: rename diff struct to Diff
This matches the style guide and fixes some -Wshadow warnings related to
variables with the same name. Something similar was done in libaom in:
863b04994b Fix warnings reported by -Wshadow: Part2: av1 directory

Bug: webm:1793
Change-Id: I4df1bbc8d079a3174d75f0d35d54c200ffdbb677
2023-02-22 11:59:02 -08:00
Yunqing Wang 910245f1fe Merge "Skip redundant iterations in joint motion search " into main 2023-02-22 19:28:17 +00:00
Jerome Jiang f7ca33c46c Merge "vp9 rc: Make it work for SVC parallel encoding" into main 2023-02-22 14:59:49 +00:00
Salome Thirot 6ed9639e43 Optimize Neon implementation of high bitdepth variance functions
Specialize implementation of high bitdepth variance functions such that
we only widen data processing element types when absolutely necessary.

Change-Id: If4cc3fea7b5ab0821e3129ebd79ff63706a512bf
2023-02-21 20:03:56 +00:00
Deepa K G c4ee2b2f03 Skip redundant iterations in joint motion search
In joint_motion_search, there are four iterations.
Even iterations search in the first reference frame
and odd iterations search in the second. The last two
iterations use the search result of the first two
iterations as the start point. If the search result does
not change, the last two iterations are not necessary and can
be skipped.
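
A toy sketch of the early exit, under the assumption that "does not change" means both motion vectors are still at their starting values after the first two iterations. mv_t, toy_search and joint_search are hypothetical; the real search is far more involved.

```c
#include <assert.h>

typedef struct {
  int row, col;
} mv_t;

static int same_mv(mv_t a, mv_t b) { return a.row == b.row && a.col == b.col; }

/* Move `from` at most `step` units toward `to`. */
static int step_toward(int from, int to, int step) {
  if (to > from + step) return from + step;
  if (to < from - step) return from - step;
  return to;
}

/* Toy stand-in for a motion search in one reference frame. */
static mv_t toy_search(mv_t start, mv_t target, int step) {
  mv_t out;
  out.row = step_toward(start.row, target.row, step);
  out.col = step_toward(start.col, target.col, step);
  return out;
}

/* Up to four iterations: even ones refine mv[0], odd ones refine mv[1].
 * Returns the number of iterations actually run. */
static int joint_search(mv_t mv[2], const mv_t target[2], int step) {
  const mv_t initial[2] = { mv[0], mv[1] };
  int iterations = 0;
  assert(step > 0);
  for (int ite = 0; ite < 4; ++ite) {
    const int id = ite % 2;
    /* The last two iterations would start from the same point again. */
    if (ite == 2 && same_mv(mv[0], initial[0]) && same_mv(mv[1], initial[1])) {
      break;
    }
    mv[id] = toy_search(mv[id], target[id], step);
    ++iterations;
  }
  return iterations;
}
```

When the first pass already sits on its result, only two iterations run. Otherwise all four are executed as before.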

          Instruction Count
cpu-used   Reduction(%)
  0          1.411

Change-Id: Ie583c9f75dd0a22bbdfb432ccdd62eea6ec4fce8
2023-02-21 18:05:23 +05:30
Jerome Jiang 0f888815c5 vp9 rc: Make it work for SVC parallel encoding
Added unit test.

Keep track of the spatial layer id and frame type in the case where spatial
layers are encoded in parallel by the hardware encoder.

ComputeQP() / PostEncodeUpdate() don't need to be called sequentially
when there is no inter-layer prediction.

Bug: b/257368998
Change-Id: I50beaefcfc205d3f9a9d3dbe11fead5bfdc71489
2023-02-17 20:44:22 -05:00
Jerome Jiang 32c1a4bf3f Merge "vp9 rc: Verify QP for all spatial layers" into main 2023-02-17 02:11:31 +00:00
Jerome Jiang be2fd0c740 vp9 rc: Verify QP for all spatial layers
Change-Id: Ic669c96d25d7c039d370e9acd00dc45e09054552
2023-02-16 19:23:42 -05:00
chiyotsai b737865480 Relax frame recode tolerance on speed 0 to 1 above 480p
Performance:
| SPD_SET | TESTSET | AVG_PSNR | OVR_PSNR |  SSIM   | ENC_T |
|---------|---------|----------|----------|---------|-------|
|    0    | hdres2  | -0.028%  | +0.030%  | -0.408% | -2.0% |
|    0    | lowres2 | +0.000%  | +0.000%  | +0.000% | +0.0% |
|    0    | midres2 | -0.138%  | +0.042%  | -0.427% | -2.5% |
|---------|---------|----------|----------|---------|-------|
|    1    | hdres2  | -0.032%  | +0.018%  | -0.342% | -1.1% |
|    1    | lowres2 | +0.000%  | +0.000%  | +0.000% | +0.0% |
|    1    | midres2 | +0.050%  | +0.060%  | -0.257% | -1.6% |

Rate Error:
|         |         |     AVG_RC_ERROR    |     MAX_RC_ERROR    |
|         |         |---------------------|---------------------|
| SPD_SET | TESTSET |   BASE   |   TEST   |   BASE   |   TEST   |
|---------|---------|----------|----------|----------|----------|
|    0    | hdres2  |  33.044% |  33.065% | 149.903% | 149.903% |
|    0    | midres2 |  59.632% |  59.566% |  79.091% |  79.249% |
|---------|---------|----------|----------|----------|----------|
|    1    | hdres2  |  33.050% |  33.057% | 151.278% | 151.278% |
|    1    | midres2 |  59.640% |  59.614% |  78.707% |  78.842% |

STATS_CHANGED

Change-Id: I5d09601fede3912d5173717ce9dd070df3a97ec8
2023-02-16 13:25:06 -08:00
chiyotsai 660031ccf3 Enable some more speed features on speed 0 to 2
Performance:
| SPD_SET | TESTSET | AVG_PSNR | OVR_PSNR |  SSIM   | ENC_T |
|---------|---------|----------|----------|---------|-------|
|    0    | hdres2  | +0.034%  | +0.030%  | +0.033% | -3.7% |
|    0    | lowres2 | +0.012%  | +0.017%  | +0.044% | -2.1% |
|    0    | midres2 | +0.030%  | +0.035%  | +0.060% | -1.9% |
|---------|---------|----------|----------|---------|-------|
|    1    | hdres2  | +0.027%  | +0.036%  | +0.030% | -2.7% |
|    1    | lowres2 | -0.006%  | -0.002%  | +0.006% | -1.0% |
|    1    | midres2 | -0.006%  | -0.012%  | -0.010% | -1.0% |
|---------|---------|----------|----------|---------|-------|
|    2    | hdres2  | -0.006%  | -0.001%  | -0.020% | -2.4% |
|    2    | lowres2 | -0.010%  | -0.015%  | -0.001% | -0.9% |
|    2    | midres2 | +0.006%  | -0.005%  | +0.009% | -1.0% |

STATS_CHANGED

Change-Id: I1431ac07215bb844739a410697387b9aead82792
2023-02-14 10:11:58 -08:00
James Zern bc2965ff72 Merge changes Id74a6d9c,I5c31e0e9,Id5a2b2d9,I73182c97,I2f5916d5, ... into main
* changes:
  Optimize vpx_highbd_comp_avg_pred_neon
  Add Neon AvgPredTestHBD test suite
  Specialize Neon high bitdepth avg subpel variance by filter value
  Specialize Neon high bitdepth subpel variance by filter value
  Refactor Neon high bitdepth avg subpel variance functions
  Optimize Neon high bitdepth subpel variance functions
2023-02-14 02:46:51 +00:00
Salome Thirot ed68c267cf Optimize vpx_highbd_comp_avg_pred_neon
Optimize the implementation of vpx_highbd_comp_avg_pred_neon by making
use of the URHADD instruction to compute the average.
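
A sketch of the rounding average, assuming 16-bit samples. comp_avg_pred_u16 is a hypothetical name, not the libvpx function. On Arm builds the vrhaddq_u16 intrinsic maps to URHADD; elsewhere only the scalar loop, which computes the same (a + b + 1) >> 1, is compiled.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* comp[i] = rounding average of pred[i] and ref[i]. */
static void comp_avg_pred_u16(uint16_t *comp, const uint16_t *pred,
                              const uint16_t *ref, int n) {
  int i = 0;
  assert(n >= 0);
#if defined(__ARM_NEON)
  /* Eight lanes per iteration; URHADD rounds without widening. */
  for (; i + 8 <= n; i += 8) {
    const uint16x8_t p = vld1q_u16(pred + i);
    const uint16x8_t r = vld1q_u16(ref + i);
    vst1q_u16(comp + i, vrhaddq_u16(p, r));
  }
#endif
  /* Scalar tail (and the whole loop on non-Neon builds). */
  for (; i < n; ++i) {
    comp[i] = (uint16_t)(((uint32_t)pred[i] + ref[i] + 1) >> 1);
  }
}
```

The scalar form has to widen to 32 bits to avoid overflow in the sum, whereas the single instruction keeps the data in 16-bit lanes.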

Change-Id: Id74a6d9c33e89bc548c3c7ecace59af69051b4a7
2023-02-13 20:23:14 +00:00
Salome Thirot b17993ca67 Add Neon AvgPredTestHBD test suite
Add test suite for vpx_highbd_comp_avg_pred_neon.

Change-Id: I5c31e0e990661ee3b8030bb517829c088fceae4d
2023-02-13 20:23:09 +00:00
Salome Thirot e03217c9d5 Specialize Neon high bitdepth avg subpel variance by filter value
Use the same specialization as for standard bitdepth. The rationale for
the specialization is as follows:

The optimal implementation of the bilinear interpolation depends on the
filter values being used. For both horizontal and vertical interpolation
this can simplify to just taking the source values, or averaging the
source and reference values - which can be computed more easily than a
bilinear interpolation with arbitrary filter values.

This patch introduces tests to find the optimal bilinear
interpolation implementation based on the filter values being used.
This new specialization is only used for larger block sizes.
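
A scalar sketch of the specialization, assuming a two-tap filter with taps {128 - 16 * offset, 16 * offset} and 7-bit rounding. bilinear_generic and bilinear_specialized are hypothetical names, and the averaging case is shown here as an average of the two filter taps.

```c
#include <assert.h>
#include <stdint.h>

/* General two-tap bilinear filter between src[i] and src[i + pixel_step]. */
static void bilinear_generic(const uint8_t *src, uint8_t *dst, int n,
                             int pixel_step, int offset) {
  const int f0 = 128 - 16 * offset;
  const int f1 = 16 * offset;
  assert(offset >= 0 && offset < 8);
  for (int i = 0; i < n; ++i) {
    dst[i] = (uint8_t)((src[i] * f0 + src[i + pixel_step] * f1 + 64) >> 7);
  }
}

/* Pick a cheaper path when the filter values allow it. */
static void bilinear_specialized(const uint8_t *src, uint8_t *dst, int n,
                                 int pixel_step, int offset) {
  assert(offset >= 0 && offset < 8);
  if (offset == 0) {
    /* Taps {128, 0}: the output is just the source. */
    for (int i = 0; i < n; ++i) dst[i] = src[i];
  } else if (offset == 4) {
    /* Taps {64, 64}: a rounding average of the two inputs. */
    for (int i = 0; i < n; ++i) {
      dst[i] = (uint8_t)((src[i] + src[i + pixel_step] + 1) >> 1);
    }
  } else {
    bilinear_generic(src, dst, n, pixel_step, offset);
  }
}
```

Both functions produce identical output for every offset. The specialized one only saves work in the copy and average cases.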

Change-Id: Id5a2b2d9fac6f878795a6ed9de2bc27d9e62d661
2023-02-13 20:23:02 +00:00
Salome Thirot c113d6b027 Specialize Neon high bitdepth subpel variance by filter value
Use the same specialization as for standard bitdepth. The rationale for
the specialization is as follows:

The optimal implementation of the bilinear interpolation depends on the
filter values being used. For both horizontal and vertical interpolation
this can simplify to just taking the source values, or averaging the
source and reference values - which can be computed more easily than a
bilinear interpolation with arbitrary filter values.

This patch introduces tests to find the optimal bilinear
interpolation implementation based on the filter values being used.
This new specialization is only used for larger block sizes.

Change-Id: I73182c979255f0332a274f2e5907df7f38c9eeb3
2023-02-13 20:22:56 +00:00
Salome Thirot 7343d56c1b Refactor Neon high bitdepth avg subpel variance functions
Use the same general code style as in the standard bitdepth Neon
implementation - merging the computation of vpx_highbd_comp_avg_pred
with the second pass of the bilinear filter to avoid storing and loading
the block again.

Also move vpx_highbd_comp_avg_pred_neon to its own file (like the
standard bitdepth implementation) since we're no longer using it for
averaging sub-pixel variance.

Change-Id: I2f5916d5b397db44b3247b478ef57046797dae6c
2023-02-13 20:22:50 +00:00
Salome Thirot 42cb3dbf94 Optimize Neon high bitdepth subpel variance functions
Use the same general code style as in the standard bitdepth Neon
implementation. Additionally, do not unnecessarily widen to 32-bit data
types when doing bilinear filtering - allowing us to process twice as
many elements per instruction.

Change-Id: I1e178991d2aa71f5f77a376e145d19257481e90f
2023-02-13 20:19:30 +00:00
James Zern b5e1945af0 README: update release version to 1.13.0
this was missed in the v1.13.0 tag

Bug: webm:1780
Change-Id: I3044534123bf67861174970e6241f6586055358e
(cherry picked from commit 184a886917)
2023-02-13 18:35:10 +00:00
James Zern 184a886917 README: update release version to 1.13.0
this was missed in the v1.13.0 tag

Bug: webm:1780
Change-Id: I3044534123bf67861174970e6241f6586055358e
2023-02-10 19:04:41 -08:00
Chi Yo Tsai 5595e18870 Merge "Remove CONFIG_CONSISTENT_RECODE flag" into main 2023-02-10 22:13:50 +00:00
chiyotsai 086f0e6538 Remove CONFIG_CONSISTENT_RECODE flag
Currently, libvpx does not properly clear and re-initialize memory
when it re-encodes a frame. As a result, out-of-date values are used in
the encoding process, and re-encoding a frame with the same parameters
will give different outputs.

This commit enables the code under CONFIG_CONSISTENT_RECODE to correct
this behavior. This change has a minor effect on the coding performance,
but it ensures valid values are used in the encoding process.

Furthermore, the flag is removed as it is now always turned on.

Performance:
| SPD_SET | TESTSET | AVG_PSNR | OVR_PSNR |  SSIM   | ENC_T |
|---------|---------|----------|----------|---------|-------|
|    0    | hdres2  | -0.012%  | -0.021%  | -0.030% | +0.1% |
|    0    | lowres2 | +0.029%  | +0.019%  | +0.047% | +0.1% |
|    0    | midres2 | -0.004%  | +0.009%  | +0.026% | +0.1% |
|---------|---------|----------|----------|---------|-------|
|    1    | hdres2  | +0.032%  | +0.032%  | -0.000% | -0.0% |
|    1    | lowres2 | -0.005%  | -0.011%  | -0.014% | +0.0% |
|    1    | midres2 | +0.004%  | +0.020%  | +0.027% | +0.2% |
|---------|---------|----------|----------|---------|-------|
|    2    | hdres2  | +0.048%  | +0.056%  | +0.057% | +0.1% |
|    2    | lowres2 | +0.007%  | +0.002%  | -0.016% | -0.0% |
|    2    | midres2 | -0.015%  | -0.008%  | -0.002% | +0.1% |
|---------|---------|----------|----------|---------|-------|
|    3    | hdres2  | +0.010%  | +0.014%  | +0.004% | -0.0% |
|    3    | lowres2 | +0.000%  | -0.021%  | -0.001% | +0.0% |
|    3    | midres2 | +0.007%  | -0.038%  | +0.012% | -0.2% |
|---------|---------|----------|----------|---------|-------|
|    4    | hdres2  | +0.107%  | +0.136%  | +0.124% | -0.0% |
|    4    | lowres2 | -0.012%  | -0.024%  | -0.020% | -0.0% |
|    4    | midres2 | +0.055%  | -0.004%  | +0.048% | -0.1% |
|---------|---------|----------|----------|---------|-------|
|    5    | hdres2  | +0.026%  | +0.027%  | +0.020% | -0.0% |
|    5    | lowres2 | +0.009%  | -0.008%  | +0.028% | +0.1% |
|    5    | midres2 | -0.025%  | +0.021%  | -0.020% | -0.1% |

STATS_CHANGED

Change-Id: I3967aee8c8e4d0608a492e07f99ab8de9744ba57
2023-02-10 13:06:51 -08:00
James Zern 924716523e Merge "Optimize Neon high bitdepth convolve copy" into main 2023-02-10 03:35:22 +00:00
Jerome Jiang f903d99650 Merge "Merge tag 'v1.13.0'" into main 2023-02-09 22:07:28 +00:00
Jerome Jiang e8bd0842c5 Merge "Remove onyx_int.h from vp8 rc header" into main 2023-02-09 21:27:59 +00:00
Jerome Jiang 5edaa583e1 Remove onyx_int.h from vp8 rc header
Also move the FRAME_TYPE declaration to common.h

Bug: webm:1766

Change-Id: Ic3016bd16548a5d2e0ae828a7fd7ad8adda8b8f6
2023-02-09 15:15:59 -05:00
Jerome Jiang 121dc7513f Merge tag 'v1.13.0'
Release v1.13.0 Ugly Duckling

2023-01-31 v1.13.0 "Ugly Duckling"

  This release includes more Neon and AVX2 optimizations, adds a new codec
  control to set per frame QP, upgrades GoogleTest to v1.12.1, and includes
  numerous bug fixes.

- Upgrading:
    This release is ABI incompatible with the previous release.

    New codec control VP9E_SET_QUANTIZER_ONE_PASS to set per frame QP.

    GoogleTest is upgraded to v1.12.1.

    .clang-format is upgraded to clang-format-11.

    VPX_EXT_RATECTRL_ABI_VERSION was bumped due to incompatible changes to the
    feature of using external rate control models for vp9.

- Enhancement:
    Numerous improvements on Neon optimizations.
    Numerous improvements on AVX2 optimizations.
    Additional ARM targets added for Visual Studio.

- Bug fixes:
    Fix to calculating internal stats when frame dropped.
    Fix to segfault for external resize test in vp9.
    Fix to build system with replacing egrep with grep -E.
    Fix to a few bugs with external RTC rate control library.
    Fix to make SVC work with VBR.
    Fix to key frame setting in VP9 external RC.
    Fix to -Wimplicit-int (Clang 16).
    Fix to VP8 external RC for buffer levels.
    Fix to VP8 external RC for dynamic update of layers.
    Fix to VP9 auto level.
    Fix to off-by-one error of max w/h in validate_config.
    Fix to make SVC work for Profile 1.

Bug: webm:1780

Change-Id: I371fc1444ead56f8d7fc510e05582b6415c3ddb1
2023-02-09 20:00:46 +00:00
Jonathan Wright 459cfc8bae Optimize Neon high bitdepth convolve copy
Use standard loads and stores instead of the significantly slower
interleaving/de-interleaving variants. Also move all loads in loop
bodies above all stores as a mitigation against the compiler thinking
that the src and dst pointers alias (since we can't use restrict in
C89).
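
A scalar sketch of the loads-before-stores ordering, with local arrays standing in for vector registers. copy_block_w8 is a hypothetical name and the 8-wide, two-rows-per-iteration shape is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy an 8-wide block of 16-bit samples, two rows per iteration. */
static void copy_block_w8(const uint16_t *src, ptrdiff_t src_stride,
                          uint16_t *dst, ptrdiff_t dst_stride, int h) {
  assert(h > 0 && h % 2 == 0);
  do {
    uint16_t row0[8];
    uint16_t row1[8];
    /* All loads first: no store has happened yet, so the compiler does not
     * have to assume a store to dst changed what src points at. */
    memcpy(row0, src, sizeof(row0));
    memcpy(row1, src + src_stride, sizeof(row1));
    /* Then all stores. */
    memcpy(dst, row0, sizeof(row0));
    memcpy(dst + dst_stride, row1, sizeof(row1));
    src += 2 * src_stride;
    dst += 2 * dst_stride;
    h -= 2;
  } while (h != 0);
}
```

Interleaving load, store, load, store would force the second load to wait on the first store whenever aliasing cannot be ruled out.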

Change-Id: Idd59dca51387f553f8db27144a2b8f2377c937d3
2023-02-09 12:14:18 +00:00
Chi Yo Tsai d3275163c1 Merge "Copy BLOCK_8X8's mi to PICK_MODE_CONTEXT::mic" into main 2023-02-08 23:16:48 +00:00
chiyotsai b6951d2b0f Copy BLOCK_8X8's mi to PICK_MODE_CONTEXT::mic
STATS_CHANGED

BUG=webm:1789

Change-Id: I74efe28bdf90a179c59fe3d1f5a15d497f57080d
2023-02-08 14:01:19 -08:00
Salome Thirot bb065c6c6d Add missing high bitdepth Neon subpel variance tests
Add missing 4x4 and 4x8 tests for both high bitdepth sub-pixel variance
and high bitdepth averaging sub-pixel variance.

Change-Id: I042752c5b7ccc14f58075694d0bb1d36f144ad06
2023-02-08 19:28:09 +00:00
Chi Yo Tsai 73cdc9fd1e Merge "Enable some speed features on speed 0" into main 2023-02-08 00:44:46 +00:00
chiyotsai 03ddac40df Enable some speed features on speed 0
Performance:
| SPD_SET | TESTSET | AVG_PSNR | OVR_PSNR |  SSIM   | ENC_T |
|---------|---------|----------|----------|---------|-------|
|    0    | hdres2  | +0.069%  | +0.067%  | +0.100% | -8.6% |
|    0    | midres2 | +0.116%  | +0.103%  | +0.062% | -9.6% |
|    0    | lowres2 | +0.276%  | +0.283%  | +0.214% |-11.9% |

STATS_CHANGED

Change-Id: I8b26c0be2312fcd0f8c9e889367682e80ea8de4b
2023-02-07 15:06:06 -08:00
Salome Thirot 25a6b2b181 Use 4D reduction Neon helper for standard bitdepth SAD4D
Move the 4D reduction helper function to sum_neon.h and use this for
both standard and high bitdepth SAD4D paths. This also removes the
AArch64 requirement for using the UDOT Neon SAD4D paths.

Change-Id: I207f76b3d42aa541809b0672c3b3d86e54d133ff
2023-02-07 17:08:48 +00:00
Yunqing Wang 9b910a65ed Merge "Move TPL to a new file" into main 2023-02-07 04:22:40 +00:00
James Zern 9a26870002 Merge changes Ica45c44f,I75c5f099,I9e626d7f into main
* changes:
  Optimize Neon implementation of high bitdepth SAD4D functions
  Optimize Neon implementation of high bitdepth avg SAD functions
  Optimize Neon implementation of high bitdepth SAD functions
2023-02-07 01:32:03 +00:00
Yunqing Wang ec8e2fe1cf Move TPL to a new file
This is a refactoring CL.

Change-Id: Ic8c1575601d27f14ecd1b1bf0a038e447eaae458
2023-02-06 16:34:08 -08:00
Jerome Jiang d2557313d2 Merge "Remove duplicated VPX_SCALING declaration" into main 2023-02-06 22:16:41 +00:00
Salome Thirot 6b8e9e1f3e Optimize Neon implementation of high bitdepth SAD4D functions
Optimizations take a similar form to those implemented for Armv8.0
standard bitdepth SAD4D:

- Use ABD, UADALP instead of ABAL, ABAL2 (double the throughput on
  modern out-of-order Arm-designed cores.)
- Use more accumulator registers to make better use of Neon pipeline
  resources on Arm CPUs that have four Neon pipes.
- Compute the four SAD sums in parallel so that we only load the source
  block once - instead of four times.

Change-Id: Ica45c44fd167e5fcc83871d8c138fc72ed3a9723
2023-02-06 21:04:52 +00:00
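The third point in the commit above (computing the four SAD sums in parallel so the source block is loaded once) can be sketched in plain C. This is a scalar illustration only, not the Neon implementation; the name sad4d_sketch and its parameter layout are assumptions, and 16-bit samples stand in for the high bitdepth case.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar sketch: sum of absolute differences between one source
 * block and four reference blocks. Each source sample is read once and
 * reused for all four accumulators. */
static void sad4d_sketch(const uint16_t *src, int src_stride,
                         const uint16_t *const ref[4], int ref_stride,
                         int width, int height, uint32_t sad[4]) {
  int r, c, i;
  assert(width > 0 && height > 0);
  sad[0] = sad[1] = sad[2] = sad[3] = 0;
  for (r = 0; r < height; ++r) {
    for (c = 0; c < width; ++c) {
      const int s = src[r * src_stride + c]; /* loaded once */
      for (i = 0; i < 4; ++i) {
        sad[i] += (uint32_t)abs(s - (int)ref[i][r * ref_stride + c]);
      }
    }
  }
}
```

Calling a single-reference SAD routine four times would instead read every source sample four times, which is the cost the commit message says it avoids.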
Jerome Jiang 5eea5c7666 Remove duplicated VPX_SCALING declaration
Use VPX_SCALING_MODE instead

Change-Id: Iab9d29f20838703e00bd9f7641035d8ebd69af53
2023-02-06 13:32:37 -05:00
Salome Thirot 9a5cbfbc08 Optimize Neon implementation of high bitdepth avg SAD functions
Optimizations take a similar form to those implemented for standard
bitdepth averaging SAD:

- Use ABD, UADALP instead of ABAL, ABAL2 (double the throughput on
  modern out-of-order Arm-designed cores.)
- Use more accumulator registers to make better use of Neon pipeline
  resources on Arm CPUs that have four Neon pipes.

Change-Id: I75c5f09948f6bf17200f82e00e7a827a80451108
2023-02-06 15:54:57 +00:00
Salome Thirot e3028ddbb4 Optimize Neon implementation of high bitdepth SAD functions
Optimizations take a similar form to those implemented for standard
bitdepth SAD:

- Use ABD, UADALP instead of ABAL, ABAL2 (double the throughput on
  modern out-of-order Arm-designed cores.)
- Use more accumulator registers to make better use of Neon pipeline
  resources on Arm CPUs that have four Neon pipes.

Change-Id: I9e626d7fa0e271908dc43448405a7985b80e6230
2023-02-06 15:51:43 +00:00
Yunqing Wang a77c7a78ae Merge "Fix uninitialized mesh feature for BEST mode" into main 2023-02-03 23:22:58 +00:00
Wan-Teh Chang 18a3421b7d Set _img->bit_depth in y4m_input_fetch_frame()
This is a port of
https://aomedia-review.googlesource.com/c/aom/+/169961.

Change-Id: I2aa0d12cafde0c73448bf8c57eab0cd92e846468
2023-02-03 14:07:09 -08:00
Yunqing Wang d6382e4469 Fix uninitialized mesh feature for BEST mode
At BEST encoding mode, the mesh search range wasn't initialized for
non FC_GRAPHICS_ANIMATION content type, which actually/mistakenly
used speed 0's setting. Fixed it by adding the initialization.

There were 2 ways to fix this. Patchset 1 set to use speed 0's setting
for non FC_GRAPHICS_ANIMATION type. This didn't change BEST mode's
encoding results much, and only a couple of clips' results were changed.

Borg result for BEST mode:
         avg_psnr:  ovr_psnr:  ssim:  encoding_spdup:
lowres2:  -0.004     -0.003   -0.000    0.030
midres2:  -0.006     -0.009   -0.012    0.033
hdres2:    0.002      0.002    0.004    0.015

Patchset 2 set to use BEST's setting for non FC_GRAPHICS_ANIMATION type.
However, the majority of test clips' BDrate got changed up to
~0.5% (gain or loss), and overall it didn't give better performance
than patchset 1. So, we chose to use patchset 1.

Change-Id: Ibbf578dad04420e6ba22cb9a3ddec137a7e4deef
2023-02-03 12:02:11 -08:00
James Zern 858a8c611f vp9_diamond_search_sad_neon: use DECLARE_ALIGNED
rather than the gcc specific __attribute__((aligned())); fixes build
targeting ARM64 windows.

Bug: webm:1788
Change-Id: I2210fc215f44d90c1ce9dee9b54888eb1b78c99e
2023-02-01 14:50:01 -08:00
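The portability issue in the commit above (a GCC-specific __attribute__((aligned())) breaking an ARM64 Windows build) is usually handled with a wrapper macro. The sketch below shows the general shape of such a macro; SKETCH_DECLARE_ALIGNED and the helper functions are hypothetical names for illustration and are not claimed to match the library's own DECLARE_ALIGNED definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrapper: pick the alignment syntax the compiler understands. */
#if defined(_MSC_VER)
#define SKETCH_DECLARE_ALIGNED(n, typ, val) __declspec(align(n)) typ val
#elif defined(__GNUC__) || defined(__clang__)
#define SKETCH_DECLARE_ALIGNED(n, typ, val) typ val __attribute__((aligned(n)))
#else
/* No known alignment directive: fall back to natural alignment. */
#define SKETCH_DECLARE_ALIGNED(n, typ, val) typ val
#endif

/* Returns nonzero when p is a multiple of n bytes. */
static int is_aligned_sketch(const void *p, uintptr_t n) {
  assert(n != 0);
  return ((uintptr_t)p % n) == 0;
}

/* Declares a 16-byte aligned table through the wrapper and checks it. */
static int table_is_aligned_sketch(void) {
  SKETCH_DECLARE_ALIGNED(16, static const uint32_t, table[4]) = { 0, 1, 2, 3 };
  return is_aligned_sketch(table, 16);
}
```

With a wrapper of this kind, the call site stays the same for GCC, Clang and MSVC, and only the macro definition differs per compiler.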
James Zern 50e1b76e32 Merge "Use load_unaligned mem_neon.h helpers in SAD and SAD4D" into main 2023-01-31 21:20:16 +00:00
Jonathan Wright 472c839c9f Use load_unaligned mem_neon.h helpers in SAD and SAD4D
Use the load_unaligned helper functions in mem_neon.h to load strided
sequences of 4 bytes where alignment is not guaranteed in the Neon
SAD and SAD4D paths.

Change-Id: I941d226ef94fd7a633b09fc92165a00ba68a1501
2023-01-31 15:39:21 +00:00
Cheng Chen a94cdd57ff Fix unsigned integer overflow in sse computation
Basically port the fix from libaom:
https://aomedia-review.googlesource.com/c/aom/+/169361

Change-Id: Id06a5db91372037832399200ded75d514e096726
2023-01-30 23:02:22 +00:00
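The commit above does not spell out the fix, so the following is only a generic sketch of how overflow is commonly avoided when summing squared errors: widen each term and the accumulator to 64 bits. The function name sse_sketch and its signature are assumptions and do not reproduce the ported libaom change.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: sum of squared errors over a block of 16-bit samples.
 * A difference of up to 65535 squares to a value above INT32_MAX, and a few
 * such terms exceed UINT32_MAX, so both the product and the accumulator are
 * 64 bits wide. */
static uint64_t sse_sketch(const uint16_t *a, int a_stride, const uint16_t *b,
                           int b_stride, int width, int height) {
  uint64_t sse = 0;
  int r, c;
  assert(width > 0 && height > 0);
  for (r = 0; r < height; ++r) {
    for (c = 0; c < width; ++c) {
      const int64_t diff =
          (int64_t)a[r * a_stride + c] - (int64_t)b[r * b_stride + c];
      sse += (uint64_t)(diff * diff);
    }
  }
  return sse;
}
```

With two samples that differ by 65535 each, the result is 8589672450, which no longer fits in 32 bits.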
James Zern 7a8ba7ea02 Merge "Refactor 8x8 16-bit Neon transpose functions" into main 2023-01-30 19:30:45 +00:00
Salome Thirot 8047e6f2b3 Refactor Neon implementation of SAD4D functions
Refactor and optimize the Neon implementation of SAD4D functions -
effectively backporting these libaom changes[1,2].

[1] https://aomedia-review.googlesource.com/c/aom/+/162181
[2] https://aomedia-review.googlesource.com/c/aom/+/162183

Change-Id: Icb04bd841d86f2d0e2596aa7ba86b74f8d2d360b
2023-01-30 13:14:54 +00:00
Yunqing Wang 698392d7fe Merge "Add encoder component timing information" into main 2023-01-28 00:27:57 +00:00
Yunqing Wang 5dd3d70a4f Add encoder component timing information
Change-Id: Iaa5b73a9593ecfd74b6426ed47d2b529ec7ae2b5
2023-01-27 11:33:29 -08:00
Gerda Zsejke More 5e92d6d103 Refactor 8x8 16-bit Neon transpose functions
Refactor the Neon implementation of transpose_s16_8x8(q) and
transpose_u16_8x8 so that the final step compiles to 8 ZIP1/ZIP2
instructions as opposed to 8 EXT, MOV pairs. This change removes 8
instructions per call to transpose_s16_8x8(q), transpose_u16_8x8
where the result stays in registers for further processing - rather
than being stored to memory - like in vpx_hadamard_8x8_neon, for
example.

This is a backport of this libaom patch[1].
[1] https://aomedia-review.googlesource.com/c/aom/+/169426

Change-Id: Icef3e51d40efeca7008e1c4fc701bf39bd319c88
2023-01-27 17:13:30 +01:00
Jerome Jiang 5c38ffbfa3 Merge "Fix per frame qp for temporal layers" into main 2023-01-26 21:31:14 +00:00
Jerome Jiang db69ce6aea Fix per frame qp for temporal layers
Also add tests with fixed temporal layering mode.

Change-Id: If516fe94e3fb7f5a745821d1788bfe6cf90edaac
2023-01-26 14:53:40 -05:00
James Zern ade7b131cc Merge "Refactor Neon implementation of SAD functions" into main 2023-01-26 03:26:38 +00:00
James Zern 2f24e444dd Merge "[NEON] Add Highbd FHT 8x8/16x16 functions" into main 2023-01-26 03:23:31 +00:00
Salome Thirot 7fed9187c4 Refactor Neon implementation of SAD functions
Refactor and optimize the Neon implementation of SAD functions -
effectively backporting these libaom changes[1,2,3].

[1] https://aomedia-review.googlesource.com/c/aom/+/161921
[2] https://aomedia-review.googlesource.com/c/aom/+/161923
[3] https://aomedia-review.googlesource.com/c/aom/+/166963

Change-Id: I2d72fd0f27d61a3e31a78acd33172e2afb044cb8
2023-01-25 15:35:51 +00:00
Konstantinos Margaritis 3384b83da0 [NEON] Add Highbd FHT 8x8/16x16 functions
In total this gives about 9% extra performance for both rt/best
profiles.
Furthermore, add transpose_s32 16x16 function

Change-Id: Ib6f368bbb9af7f03c9ce0deba1664cef77632fe2
2023-01-24 20:56:02 +00:00
444 changed files with 37082 additions and 16566 deletions
+3 -1
@@ -5,6 +5,7 @@ Aex Converse <alexconv@twitch.tv> <alex.converse@gmail.com>
Alexis Ballier <aballier@gentoo.org> <alexis.ballier@gmail.com>
Alpha Lam <hclam@google.com> <hclam@chromium.org>
Angie Chiang <angiebird@google.com>
Bohan Li <bohanli@google.com>
Chris Cunningham <chcunningham@chromium.org>
Chi Yo Tsai <chiyotsai@google.com>
Daniele Castagna <dcastagna@chromium.org> <dcastagna@google.com>
@@ -20,6 +21,7 @@ Hui Su <huisu@google.com>
Jacky Chen <jackychen@google.com>
Jim Bankoski <jimbankoski@google.com>
Johann Koenig <johannkoenig@google.com>
Johann Koenig <johannkoenig@google.com> <johannkoenig@dhcp-172-19-7-52.mtv.corp.google.com>
Johann Koenig <johannkoenig@google.com> <johann.koenig@duck.com>
Johann Koenig <johannkoenig@google.com> <johannkoenig@chromium.org>
Johann <johann@duck.com> <johann.koenig@gmail.com>
@@ -53,4 +55,4 @@ Yaowu Xu <yaowu@google.com> <yaowu@xuyaowu.com>
Yaowu Xu <yaowu@google.com> <Yaowu Xu>
Venkatarama NG. Avadhani <venkatarama.avadhani@ittiam.com>
Vitaly Buka <vitalybuka@chromium.org> <vitlaybuka@chromium.org>
-xiwei gu <guxiwei-hf@loongson.cn>
+Xiwei Gu <guxiwei-hf@loongson.cn>
+15 -1
@@ -25,21 +25,27 @@ Andrew Salkeld <andrew.salkeld@arm.com>
Angie Chen <yunqi@google.com>
Angie Chiang <angiebird@google.com>
Anton Venema <anton.venema@liveswitch.com>
Anupam Pandey <anupam.pandey@ittiam.com>
Aron Rosenberg <arosenberg@logitech.com>
Attila Nagy <attilanagy@google.com>
Birk Magnussen <birk.magnussen@googlemail.com>
Bohan Li <bohanli@google.com>
Brian Foley <bpfoley@google.com>
Brion Vibber <bvibber@wikimedia.org>
Casey Smalley <casey.smalley@arm.com>
changjun.yang <changjun.yang@intel.com>
Charles 'Buck' Krasic <ckrasic@google.com>
Cheng Chen <chengchen@google.com>
Chen Wang <wangchen20@iscas.ac.cn>
Cherma Rajan A <cherma.rajan@ittiam.com>
Chi Yo Tsai <chiyotsai@google.com>
chm <chm@rock-chips.com>
Chris Cunningham <chcunningham@chromium.org>
Christian Duvivier <cduvivier@google.com>
Chunbo Hua <chunbo.hua@intel.com>
Chun-Min Chang <chun.m.chang@gmail.com>
Clement Courbet <courbet@google.com>
Daniel Cheng <dcheng@chromium.org>
Daniele Castagna <dcastagna@chromium.org>
Daniel Kang <ddkang@google.com>
Daniel Sommermann <dcsommer@gmail.com>
@@ -60,6 +66,8 @@ Fritz Koenig <frkoenig@google.com>
Fyodor Kyslov <kyslov@google.com>
Gabriel Marin <gmx@chromium.org>
Gaute Strokkenes <gaute.strokkenes@broadcom.com>
George Steed <george.steed@arm.com>
Gerda Zsejke More <gerdazsejke.more@arm.com>
Geza Lore <gezalore@gmail.com>
Ghislain MARY <ghislainmary2@gmail.com>
Giuseppe Scrivano <gscrivano@gnu.org>
@@ -71,6 +79,7 @@ Hangyu Kuang <hkuang@google.com>
Hanno Böck <hanno@hboeck.de>
Han Shen <shenhan@google.com>
Hao Chen <chenhao@loongson.cn>
Hari Limaye <hari.limaye@arm.com>
Harish Mahendrakar <harish.mahendrakar@ittiam.com>
Henrik Lundin <hlundin@google.com>
Hien Ho <hienho@google.com>
@@ -103,6 +112,7 @@ Jin Bo <jinbo@loongson.cn>
Jingning Han <jingning@google.com>
Joel Fernandes <joelaf@google.com>
Joey Parrish <joeyparrish@google.com>
Johann <johann@duck.com>
Johann Koenig <johannkoenig@google.com>
John Koleszar <jkoleszar@google.com>
Johnny Klonaris <google@jawknee.com>
@@ -120,6 +130,7 @@ KO Myung-Hun <komh@chollian.net>
Konstantinos Margaritis <konma@vectorcamp.gr>
Kyle Siefring <kylesiefring@gmail.com>
Lawrence Velázquez <larryv@macports.org>
L. E. Segovia <amy@amyspark.me>
Linfeng Zhang <linfengz@google.com>
Liu Peng <pengliu.mail@gmail.com>
Lou Quillio <louquillio@google.com>
@@ -147,6 +158,7 @@ Mirko Bonadei <mbonadei@google.com>
Moriyoshi Koizumi <mozo@mozo.jp>
Morton Jonuschat <yabawock@gmail.com>
Nathan E. Egge <negge@mozilla.com>
Neeraj Gadgil <neeraj.gadgil@ittiam.com>
Neil Birkbeck <neil.birkbeck@gmail.com>
Nico Weber <thakis@chromium.org>
Niveditha Rau <niveditha.rau@gmail.com>
@@ -213,7 +225,8 @@ Vitaly Buka <vitalybuka@chromium.org>
Vlad Tsyrklevich <vtsyrklevich@chromium.org>
Wan-Teh Chang <wtc@google.com>
Wonkap Jang <wonkap@google.com>
-xiwei gu <guxiwei-hf@loongson.cn>
+Xiahong Bao <xiahong.bao@nxp.com>
+Xiwei Gu <guxiwei-hf@loongson.cn>
Yaowu Xu <yaowu@google.com>
Yi Luo <luoyi@google.com>
Yongzhe Wang <yongzhe@google.com>
@@ -223,6 +236,7 @@ Yun Liu <yliuyliu@google.com>
Yunqing Wang <yunqingwang@google.com>
Yury Gitman <yuryg@google.com>
Zoe Liu <zoeliu@google.com>
Zoltan Kuscsik <zoltan@s57.io>
Google Inc.
The Mozilla Foundation
The Xiph.Org Foundation
+199
@@ -1,3 +1,202 @@
2025-01-09 v1.15.1 "Wigeon Duck"
This release bumps up the SO major version and fixes the language about ABI
compatibility in the previous release changelog.
2024-10-22 v1.15.0 "Wigeon Duck"
This release includes new codec control for key frame filtering, more Neon
optimizations, improvements to RTC encoding and bug fixes.
- Upgrading:
This release is ABI incompatible with the previous release.
It is strongly recommended to skip this release and upgrade to v1.15.1 since
the shared object was versioned incorrectly, as shown in
https://issues.webmproject.org/issues/384672478.
Temporal filtering improvement that can be turned on with the new codec
control VP9E_SET_KEY_FRAME_FILTERING, which gives 1+% BD-rate saving with
minimal encoder time increase.
libwebm is upgraded to libwebm-1.0.0.31-10-g3b63004
- Enhancement:
Neon optimization speed up
1-3% speed up across speed 5 to 10 for RTC
3% speed up for speed 0 and 1 for VoD in standard bitdepth
3% and 7% speed up for speed 0 and 1 respectively for VoD in high bitdepth
Scene detection is allowed for all RTC speeds (>=5)
Support profile guided optimizations
Delta quantization parameters for UV channels for vp8 is supported in RTC
rate control library
Rate control parameters are reset and maximum QP is enforced on scene
changes in SVC when there is no inter-layer prediction
- Bug fixes:
Fix to Uninitialized scalar variable in `vp9_rd_pick_inter_mode_sb()`
Fix to Integer-overflow in `resize_multistep`
Fix to Heap-buffer-overflow in `vpx_sad64x64_avx2`
Fix to Crash in `vpx_sad8x8_sse2`
Fix to Assertion in `write_modes`
Support profile guided optimizations
Fix to Integer-overflow in `encode_frame_to_data_rate`
Fix to Integer-overflow in `vp9_svc_check_reset_layer_rc_flag`
Fix to core dump error from /usr/bin/tools/tiny_ssim --help
Fix to use-of-uninitialized-value in `vp9_setup_tpl_stats`
Fix to Undefined-shift in `vp9_cyclic_refresh_setup`
Fix to redundant `&& __GNUC__` preproc check
Fix to valgrind warning in EncodeAPI.OssFuzz69906
Fix to Index-out-of-bounds in `vp8_rd_pick_inter_mode`
Fix to Integer-overflow in `vp8_pick_frame_size`
Fix to Use-of-uninitialized-value in `vpx_codec_peek_stream_info`
Fix to log clutters with the message "Warning: Desired height too large"
Fix to Integer-overflow in `vp9_svc_adjust_avg_frame_qindex`
Fix to integer overflows caused by huge target bitrate, frame rate, or
g_timebase numerator or denominator
Fix to missing license headers
Fix to build failure for Android Armv7
Fix to integer overflows in image helpers
Fix to Integer-overflow in `vp9_calc_iframe_target_size_one_pass_cbr`
Fix to Heap-buffer-overflow in `vp9_pick_inter_mode`
Fix to Segv in `vp9_multi_thread_tile_init`
Fix to Use-of-uninitialized-value in `vp9_row_mt_sync_mem_dealloc`
Fix to Crash in `mbloop_filter_vertical_edge_c`
Fix to Check failed in CheckUnwind
Fix to Heap-buffer-overflow in `write_modes_b` and `vpx_write`
Fix to Possible signed integer overflow found in `vpx_codec_encode`
Fix to build conflicts between Abseil and libaom/libvpx in Win ARM64 builds
Fix to build failures on aarch64
Fix to Data race in libvpx ARM NEON
Fix to Heap-buffer-overflow in `scale_plane_1_to_2_phase_0`
Fix to integer overflow in `encode_mb_row`
Fix to Floating-point-exception in `vp8_pick_frame_size`
Fix to Heap-buffer-overflow in `vp9_enc_setup_mi`
Fix to build failure with --target=arm64-win64-vs17
Fix to heap-buffer-overflow write in `vpx_img_read()`
Fix to C vs armv8-linux-gcc encode mismatches for `y4m_360p_10bit_input`
Fix to Null-dereference READ in `ml_predict_var_rd_partitioning`
Fix to Heap-buffer-overflow in `vpx_scaled_2d_ssse3`
Fix to Crash in `convolve_horiz`
Fix to Ill in `vpx_scaled_2d_ssse3`
Fix to Global-buffer-overflow in `cost_coeffs`
2024-05-21 v1.14.1 "Venetian Duck"
This release includes enhancements and bug fixes.
- Upgrading:
This release is ABI compatible with the previous release.
- Enhancement:
Improved the detection of compiler support for AArch64 extensions,
particularly SVE.
Added vpx_codec_get_global_headers() support for VP9.
- Bug fixes:
Added buffer bounds checks to vpx_writer and vpx_write_bit_buffer.
Fix to GetSegmentationData() crash in aq_mode=0 for RTC rate control.
Fix to alloc for row_base_thresh_freq_fac.
Free row mt memory before freeing cpi->tile_data.
Fix to buffer alloc for vp9_bitstream_worker_data.
Fix to VP8 race issue for multi-thread with pnsr_calc.
Fix to uv width/height in vp9_scale_and_extend_frame_ssse3.
Fix to integer division by zero and overflow in calc_pframe_target_size().
Fix to integer overflow in vpx_img_alloc() & vpx_img_wrap()(CVE-2024-5197).
Fix to UBSan error in vp9_rc_update_framerate().
Fix to UBSan errors in vp8_new_framerate().
Fix to integer overflow in vp8 encodeframe.c.
Handle EINTR from sem_wait().
2024-01-02 v1.14.0 "Venetian Duck"
This release drops support for old C compilers, such as Visual Studio 2012
and older, that disallow mixing variable declarations and statements (a C99
feature). It adds support for run-time CPU feature detection for Arm
platforms, as well as support for darwin23 (macOS 14).
- Upgrading:
This release is ABI incompatible with the previous release.
Various new features for rate control library for real-time: SVC parallel
encoding, loopfilter level, support for frame dropping, and screen content.
New callback function send_tpl_gop_stats for vp9 external rate control
library, which can be used to transmit TPL stats for a group of pictures. A
public header vpx_tpl.h is added for the definition of TPL stats used in
this callback.
libwebm is upgraded to libwebm-1.0.0.29-9-g1930e3c.
- Enhancement:
Improvements on Neon optimizations: VoD: 12-35% speed up for bitdepth 8,
68%-151% speed up for high bitdepth.
Improvements on AVX2 and SSE optimizations.
Improvements on LSX optimizations for LoongArch.
42-49% speedup on speed 0 VoD encoding.
Android API level predicates.
- Bug fixes:
Fix to missing prototypes from the rtcd header.
Fix to segfault when total size is enlarged but width is smaller.
Fix to the build for arm64ec using MSVC.
Fix to copy BLOCK_8X8's mi to PICK_MODE_CONTEXT::mic.
Fix to -Wshadow warnings.
Fix to heap overflow in vpx_get4x4sse_cs_neon.
Fix to buffer overrun in highbd Neon subpel variance filters.
Added bitexact encode test script.
Fix to -Wl,-z,defs with Clang's sanitizers.
Fix to decoder stability after error & continued decoding.
Fix to mismatch of VP9 encode with NEON intrinsics with C only version.
Fix to Arm64 MSVC compile vpx_highbd_fdct4x4_neon.
Fix to fragments count before use.
Fix to a case where target bandwidth is 0 for SVC.
Fix mask in vp9_quantize_avx2,highbd_get_max_lane_eob.
Fix to int overflow in vp9_calc_pframe_target_size_one_pass_cbr.
Fix to integer overflow in vp8,ratectrl.c.
Fix to integer overflow in vp9 svc.
Fix to avg_frame_bandwidth overflow.
Fix to per frame qp for temporal layers.
Fix to unsigned integer overflow in sse computation.
Fix to uninitialized mesh feature for BEST mode.
Fix to overflow in highbd temporal_filter.
Fix to unaligned loads w/w==4 in vpx_convolve_copy_neon.
Skip arm64_neon.h workaround w/VS >= 2019.
Fix to c vs avx mismatch of diamond_search_sad().
Fix to c vs intrinsic mismatch of vpx_hadamard_32x32() function.
Fix to a bug in vpx_hadamard_32x32_neon().
Fix to Clang -Wunreachable-code-aggressive warnings.
Fix to a bug in vpx_highbd_hadamard_32x32_neon().
Fix to -Wunreachable-code in mfqe_partition.
Force mode search on 64x64 if no mode is selected.
Fix to ubsan failure caused by left shift of negative.
Fix to integer overflow in calc_pframe_target_size.
Fix to float-cast-overflow in vp8_change_config().
Fix to a null ptr before use.
Conditionally skip using inter frames in speed features.
Remove invalid reference frames.
Disable intra mode search speed features conditionally.
Set nonrd keyframe under dynamic change of deadline for rtc.
Fix to scaled reference offsets.
Set skip_recode=0 in nonrd_pick_sb_modes.
Fix to an edge case when downsizing to one.
Fix to a bug in frame scaling.
Fix to pred buffer stride.
Fix to a bug in simple motion search.
Update frame size in actual encoding.
2023-09-29 v1.13.1 "Ugly Duckling"
This release contains two security related fixes. One each for VP8 and VP9.
- Upgrading:
This release is ABI compatible with the previous release.
- Bug fixes:
https://crbug.com/1486441 (CVE-2023-5217)
Fix to a crash related to VP9 encoding (#1642, CVE-2023-6349)
2023-01-31 v1.13.0 "Ugly Duckling"
This release includes more Neon and AVX2 optimizations, adds a new codec
control to set per frame QP, upgrades GoogleTest to v1.12.1, and includes
+60 -4
@@ -1,5 +1,3 @@
v1.12.0 Torrent Duck
Welcome to the WebM VP8/VP9 Codec SDK!
COMPILING THE APPLICATIONS/LIBRARIES:
@@ -64,9 +62,17 @@ COMPILING THE APPLICATIONS/LIBRARIES:
arm64-android-gcc
arm64-darwin-gcc
arm64-darwin20-gcc
arm64-darwin21-gcc
arm64-darwin22-gcc
arm64-darwin23-gcc
arm64-darwin24-gcc
arm64-linux-gcc
arm64-win64-gcc
arm64-win64-vs15
arm64-win64-vs16
arm64-win64-vs16-clangcl
arm64-win64-vs17
arm64-win64-vs17-clangcl
armv7-android-gcc
armv7-darwin-gcc
armv7-linux-rvct
@@ -75,8 +81,12 @@ COMPILING THE APPLICATIONS/LIBRARIES:
armv7-win32-gcc
armv7-win32-vs14
armv7-win32-vs15
armv7-win32-vs16
armv7-win32-vs17
armv7s-darwin-gcc
armv8-linux-gcc
loongarch32-linux-gcc
loongarch64-linux-gcc
mips32-linux-gcc
mips64-linux-gcc
ppc64le-linux-gcc
@@ -117,6 +127,10 @@ COMPILING THE APPLICATIONS/LIBRARIES:
x86_64-darwin18-gcc
x86_64-darwin19-gcc
x86_64-darwin20-gcc
x86_64-darwin21-gcc
x86_64-darwin22-gcc
x86_64-darwin23-gcc
x86_64-darwin24-gcc
x86_64-iphonesimulator-gcc
x86_64-linux-gcc
x86_64-linux-icc
@@ -138,8 +152,8 @@ COMPILING THE APPLICATIONS/LIBRARIES:
$ CROSS=mipsel-linux-uclibc- ../libvpx/configure
In addition, the executables to be invoked can be overridden by specifying the
-environment variables: CC, AR, LD, AS, STRIP, NM. Additional flags can be
-passed to these executables with CFLAGS, LDFLAGS, and ASFLAGS.
+environment variables: AR, AS, CC, CXX, LD, STRIP. Additional flags can be
+passed to these executables with ASFLAGS, CFLAGS, CXXFLAGS, and LDFLAGS.
6. Configuration errors
If the configuration step fails, the first step is to look in the error log.
@@ -169,7 +183,49 @@ CODE STYLE:
See also: http://clang.llvm.org/docs/ClangFormat.html
PROFILE GUIDED OPTIMIZATION (PGO)
Profile Guided Optimization can be enabled for Clang builds using the
commands:
$ export CC=clang
$ export CXX=clang++
$ ../libvpx/configure --enable-profile
$ make
Generate one or multiple PGO profile files by running vpxdec or vpxenc. For
example:
$ ./vpxdec ../vpx/out_ful/vp90-2-sintel_1280x546_tile_1x4_1257kbps.webm \
-o - > /dev/null
To convert and merge the raw profile files, use the llvm-profdata tool:
$ llvm-profdata merge -o perf.profdata default_8382761441159425451_0.profraw
Then, rebuild the project with the new profile file:
$ make clean
$ ../libvpx/configure --use-profile=perf.profdata
$ make
Note: Always use the llvm-profdata from the toolchain that is used for
compiling the PGO-enabled binary.
To observe the improvements from a PGO-enabled build, enable and compare the
list of failed optimizations by using the -Rpass-missed compiler flag. For
example, to list the failed loop vectorizations:
$ ../libvpx/configure --use-profile=perf.profdata \
--extra-cflags=-Rpass-missed=loop-vectorize
For guidance on utilizing PGO files to identify potential optimization
opportunities, see: tools/README.pgo.md
SUPPORT
This library is an open source project supported by its community. Please
email webm-discuss@webmproject.org for help.
BUG REPORTS
Bug reports can be filed in the libvpx issue tracker:
https://issues.webmproject.org/.
For security reports, select 'Security report' from the Template dropdown.
+1 -1
@@ -108,7 +108,7 @@ index b3af677d2..7b65bb4a7 100644
%macro FIRST_2_ROWS 0
movdqa xmm4, xmm0
diff --git a/vpx_dsp/x86/ssim_opt_x86_64.asm b/vpx_dsp/x86/ssim_opt_x86_64.asm
-index 41ffbb07e..efb7759f5 100644
+index 1ad3b88c8..d019e549d 100644
--- a/vpx_dsp/x86/ssim_opt_x86_64.asm
+++ b/vpx_dsp/x86/ssim_opt_x86_64.asm
@@ -10,6 +10,7 @@
+30 -3
@@ -744,6 +744,15 @@
<ClInclude Include="..\vp9\encoder\vp9_ext_ratectrl.h">
<Filter>Header Files\libvpx\vp9\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vp9\encoder\vp9_firstpass_stats.h">
<Filter>Header Files\libvpx\vp9\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vp9\encoder\vp9_tpl_model.h">
<Filter>Header Files\libvpx\vp9\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vpx\vpx_tpl.h">
<Filter>Source Files\libvpx\vpx</Filter>
</ClInclude>
</ItemGroup>
<ItemGroup>
<ClCompile Include="..\vpx\src\vpx_encoder.c">
@@ -1274,9 +1283,6 @@
<ClCompile Include="..\vp9\encoder\x86\vp9_dct_intrin_sse2.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\x86\vp9_diamond_search_sad_avx.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\add_noise.c">
<Filter>Source Files\libvpx\vpx_dsp</Filter>
</ClCompile>
@@ -1436,6 +1442,27 @@
<ClCompile Include="..\vpx_dsp\x86\highbd_sad_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\avg_pred_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\inv_txfm_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx\src\vpx_tpl.c">
<Filter>Source Files\libvpx\vpx</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\vp9_tpl_model.c">
<Filter>Source Files\libvpx\vp9\encoder</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\sse.c">
<Filter>Source Files\libvpx\vpx_dsp</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\sse_sse4.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\sse_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
</ItemGroup>
<ItemGroup>
<None Include="libvpx.def">
+9 -2
@@ -103,6 +103,7 @@
<ClInclude Include="..\vp9\encoder\vp9_extend.h" />
<ClInclude Include="..\vp9\encoder\vp9_ext_ratectrl.h" />
<ClInclude Include="..\vp9\encoder\vp9_firstpass.h" />
<ClInclude Include="..\vp9\encoder\vp9_firstpass_stats.h" />
<ClInclude Include="..\vp9\encoder\vp9_job_queue.h" />
<ClInclude Include="..\vp9\encoder\vp9_lookahead.h" />
<ClInclude Include="..\vp9\encoder\vp9_mbgraph.h" />
@@ -124,6 +125,7 @@
<ClInclude Include="..\vp9\encoder\vp9_svc_layercontext.h" />
<ClInclude Include="..\vp9\encoder\vp9_temporal_filter.h" />
<ClInclude Include="..\vp9\encoder\vp9_tokenize.h" />
<ClInclude Include="..\vp9\encoder\vp9_tpl_model.h" />
<ClInclude Include="..\vp9\encoder\vp9_treewriter.h" />
<ClInclude Include="..\vp9\vp9_dx_iface.h" />
<ClInclude Include="..\vp9\vp9_iface_common.h" />
@@ -137,6 +139,7 @@
<ClInclude Include="..\vpx\vpx_frame_buffer.h" />
<ClInclude Include="..\vpx\vpx_image.h" />
<ClInclude Include="..\vpx\vpx_integer.h" />
<ClInclude Include="..\vpx\vpx_tpl.h" />
<ClInclude Include="..\vpx\internal\vpx_codec_internal.h" />
<ClInclude Include="..\vpx_dsp\bitreader.h" />
<ClInclude Include="..\vpx_dsp\bitreader_buffer.h" />
@@ -181,7 +184,6 @@
<ClInclude Include="..\vpx_ports\mem.h" />
<ClInclude Include="..\vpx_ports\mem_ops.h" />
<ClInclude Include="..\vpx_ports\mem_ops_aligned.h" />
<ClInclude Include="..\vpx_ports\msvc.h" />
<ClInclude Include="..\vpx_ports\static_assert.h" />
<ClInclude Include="..\vpx_ports\system_state.h" />
<ClInclude Include="..\vpx_ports\vpx_once.h" />
@@ -346,11 +348,11 @@
<ClCompile Include="..\vp9\encoder\vp9_svc_layercontext.c" />
<ClCompile Include="..\vp9\encoder\vp9_temporal_filter.c" />
<ClCompile Include="..\vp9\encoder\vp9_tokenize.c" />
<ClCompile Include="..\vp9\encoder\vp9_tpl_model.c" />
<ClCompile Include="..\vp9\encoder\vp9_treewriter.c" />
<ClCompile Include="..\vp9\encoder\x86\highbd_temporal_filter_sse4.c" />
<ClCompile Include="..\vp9\encoder\x86\temporal_filter_sse4.c" />
<ClCompile Include="..\vp9\encoder\x86\vp9_dct_intrin_sse2.c" />
<ClCompile Include="..\vp9\encoder\x86\vp9_diamond_search_sad_avx.c" />
<ClCompile Include="..\vp9\encoder\x86\vp9_error_avx2.c" />
<ClCompile Include="..\vp9\encoder\x86\vp9_frame_scale_ssse3.c" />
<ClCompile Include="..\vp9\encoder\x86\vp9_highbd_block_error_intrin_sse2.c" />
@@ -379,6 +381,7 @@
<ClCompile Include="..\vpx_dsp\psnr.c" />
<ClCompile Include="..\vpx_dsp\quantize.c" />
<ClCompile Include="..\vpx_dsp\sad.c" />
<ClCompile Include="..\vpx_dsp\sse.c" />
<ClCompile Include="..\vpx_dsp\skin_detection.c" />
<ClCompile Include="..\vpx_dsp\subtract.c" />
<ClCompile Include="..\vpx_dsp\sum_squares.c" />
@@ -388,6 +391,7 @@
<ClCompile Include="..\vpx_dsp\x86\avg_intrin_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\avg_intrin_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\avg_pred_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\avg_pred_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\fwd_txfm_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\fwd_txfm_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\highbd_convolve_avx2.c" />
@@ -409,6 +413,7 @@
<ClCompile Include="..\vpx_dsp\x86\highbd_variance_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\inv_txfm_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\inv_txfm_ssse3.c" />
<ClCompile Include="..\vpx_dsp\x86\inv_txfm_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\loopfilter_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\loopfilter_sse2.c">
<ObjectFileName>$(IntDir)\vpx_%(Filename).obj</ObjectFileName>
@@ -424,6 +429,8 @@
<ExcludedFromBuild Condition="'$(VisualStudioVersion)' == '12.0'">true</ExcludedFromBuild>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\sad_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\sse_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\sse_sse4.c" />
<ClCompile Include="..\vpx_dsp\x86\subtract_avx2.c" />
<ClCompile Include="..\vpx_dsp\x86\sum_squares_sse2.c" />
<ClCompile Include="..\vpx_dsp\x86\variance_avx2.c" />
+30 -3
@@ -744,6 +744,15 @@
<ClInclude Include="..\vp9\encoder\vp9_ext_ratectrl.h">
<Filter>Header Files\libvpx\vp9\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vp9\encoder\vp9_firstpass_stats.h">
<Filter>Header Files\libvpx\vp9\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vp9\encoder\vp9_tpl_model.h">
<Filter>Header Files\libvpx\vp9\encoder</Filter>
</ClInclude>
<ClInclude Include="..\vpx\vpx_tpl.h">
<Filter>Header Files\libvpx\vpx</Filter>
</ClInclude>
</ItemGroup>
<ItemGroup>
<ClCompile Include="..\vpx\src\vpx_encoder.c">
@@ -1274,9 +1283,6 @@
<ClCompile Include="..\vp9\encoder\x86\vp9_dct_intrin_sse2.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\x86\vp9_diamond_search_sad_avx.c">
<Filter>Source Files\libvpx\vp9\encoder\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\add_noise.c">
<Filter>Source Files\libvpx\vpx_dsp</Filter>
</ClCompile>
@@ -1436,6 +1442,27 @@
<ClCompile Include="..\vpx_dsp\x86\highbd_sad_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\avg_pred_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\inv_txfm_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx\src\vpx_tpl.c">
<Filter>Source Files\libvpx\vpx</Filter>
</ClCompile>
<ClCompile Include="..\vp9\encoder\vp9_tpl_model.c">
<Filter>Source Files\libvpx\vp9\encoder</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\sse.c">
<Filter>Source Files\libvpx\vpx_dsp</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\sse_sse4.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
<ClCompile Include="..\vpx_dsp\x86\sse_avx2.c">
<Filter>Source Files\libvpx\vpx_dsp\x86</Filter>
</ClCompile>
</ItemGroup>
<ItemGroup>
<None Include="libvpx.def">
+4
@@ -228,6 +228,7 @@
<GenerateDebugInformation>true</GenerateDebugInformation>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '10.0'">10.0</MinimumRequiredVersion>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '8.1'">8.1</MinimumRequiredVersion>
<GenerateWindowsMetadata>false</GenerateWindowsMetadata>
<WindowsMetadataFile>$(OutDir)\lib\x86\$(RootNamespace).winmd</WindowsMetadataFile>
</Link>
</ItemDefinitionGroup>
@@ -254,6 +255,7 @@
<GenerateDebugInformation>true</GenerateDebugInformation>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '10.0'">10.0</MinimumRequiredVersion>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '8.1'">8.1</MinimumRequiredVersion>
<GenerateWindowsMetadata>false</GenerateWindowsMetadata>
<WindowsMetadataFile>$(OutDir)\lib\x64\$(RootNamespace).winmd</WindowsMetadataFile>
</Link>
</ItemDefinitionGroup>
@@ -344,6 +346,7 @@
<GenerateDebugInformation>true</GenerateDebugInformation>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '10.0'">10.0</MinimumRequiredVersion>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '8.1'">8.1</MinimumRequiredVersion>
<GenerateWindowsMetadata>false</GenerateWindowsMetadata>
<WindowsMetadataFile>$(OutDir)\lib\x86\$(RootNamespace).winmd</WindowsMetadataFile>
</Link>
</ItemDefinitionGroup>
@@ -378,6 +381,7 @@
<GenerateDebugInformation>true</GenerateDebugInformation>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '10.0'">10.0</MinimumRequiredVersion>
<MinimumRequiredVersion Condition="'$(ApplicationTypeRevision)' == '8.1'">8.1</MinimumRequiredVersion>
<GenerateWindowsMetadata>false</GenerateWindowsMetadata>
<WindowsMetadataFile>$(OutDir)\lib\x64\$(RootNamespace).winmd</WindowsMetadataFile>
</Link>
</ItemDefinitionGroup>
+8 -3
@@ -1,4 +1,5 @@
VPX_ARCH_ARM equ 0
VPX_ARCH_AARCH64 equ 0
VPX_ARCH_MIPS equ 0
%ifidn __OUTPUT_FORMAT__,win64
VPX_ARCH_X86 equ 0
@@ -12,8 +13,12 @@ VPX_ARCH_X86_64 equ 0
%endif
VPX_ARCH_PPC equ 0
VPX_ARCH_LOONGARCH equ 0
HAVE_NEON equ 0
HAVE_NEON_ASM equ 0
HAVE_NEON equ 0
HAVE_NEON_DOTPROD equ 0
HAVE_NEON_I8MM equ 0
HAVE_SVE equ 0
HAVE_SVE2 equ 0
HAVE_MIPS32 equ 0
HAVE_DSPR2 equ 0
HAVE_MSA equ 0
@@ -46,7 +51,7 @@ CONFIG_GCOV equ 0
CONFIG_RVCT equ 0
CONFIG_GCC equ 0
CONFIG_MSVS equ 1
CONFIG_PIC equ 0
CONFIG_PIC equ 1
CONFIG_BIG_ENDIAN equ 0
CONFIG_CODEC_SRCS equ 0
CONFIG_DEBUG_LIBS equ 0
@@ -83,7 +88,6 @@ CONFIG_ENCODE_PERF_TESTS equ 0
CONFIG_MULTI_RES_ENCODING equ 0
CONFIG_TEMPORAL_DENOISING equ 1
CONFIG_VP9_TEMPORAL_DENOISING equ 0
CONFIG_CONSISTENT_RECODE equ 0
CONFIG_COEFFICIENT_RANGE_CHECKING equ 0
CONFIG_VP9_HIGHBITDEPTH equ 1
CONFIG_BETTER_HW_COMPATIBILITY equ 0
@@ -96,3 +100,4 @@ CONFIG_FP_MB_STATS equ 0
CONFIG_EMULATE_HARDWARE equ 0
CONFIG_NON_GREEDY_MV equ 0
CONFIG_RATE_CTRL equ 0
CONFIG_COLLECT_COMPONENT_TIMING equ 0
+8 -3
@@ -11,6 +11,7 @@
#define RESTRICT
#define INLINE __inline
#define VPX_ARCH_ARM 0
#define VPX_ARCH_AARCH64 0
#define VPX_ARCH_MIPS 0
#if defined(__x86_64) || defined(_M_X64)
#define VPX_ARCH_X86 0
@@ -21,8 +22,12 @@
#endif
#define VPX_ARCH_PPC 0
#define VPX_ARCH_LOONGARCH 0
#define HAVE_NEON 0
#define HAVE_NEON_ASM 0
#define HAVE_NEON 0
#define HAVE_NEON_DOTPROD 0
#define HAVE_NEON_I8MM 0
#define HAVE_SVE 0
#define HAVE_SVE2 0
#define HAVE_MIPS32 0
#define HAVE_DSPR2 0
#define HAVE_MSA 0
@@ -59,7 +64,7 @@
#define CONFIG_RVCT 0
#define CONFIG_GCC 0
#define CONFIG_MSVS 1
#define CONFIG_PIC 0
#define CONFIG_PIC 1
#define CONFIG_BIG_ENDIAN 0
#define CONFIG_CODEC_SRCS 0
#define CONFIG_DEBUG_LIBS 0
@@ -105,7 +110,6 @@
#define CONFIG_MULTI_RES_ENCODING 0
#define CONFIG_TEMPORAL_DENOISING 1
#define CONFIG_VP9_TEMPORAL_DENOISING 0
#define CONFIG_CONSISTENT_RECODE 0
#define CONFIG_COEFFICIENT_RANGE_CHECKING 0
#define CONFIG_VP9_HIGHBITDEPTH 1
#define CONFIG_BETTER_HW_COMPATIBILITY 0
@@ -118,4 +122,5 @@
#define CONFIG_EMULATE_HARDWARE 0
#define CONFIG_NON_GREEDY_MV 0
#define CONFIG_RATE_CTRL 0
#define CONFIG_COLLECT_COMPONENT_TIMING 0
#endif /* VPX_CONFIG_H */
+7 -4
@@ -1,8 +1,11 @@
// This file is generated. Do not edit.
#ifndef VPX_VERSION_H_
#define VPX_VERSION_H_
#define VERSION_MAJOR 1
#define VERSION_MINOR 13
#define VERSION_PATCH 0
#define VERSION_MINOR 15
#define VERSION_PATCH 1
#define VERSION_EXTRA ""
#define VERSION_PACKED ((VERSION_MAJOR<<16)|(VERSION_MINOR<<8)|(VERSION_PATCH))
#define VERSION_STRING_NOSP "v1.13.0"
#define VERSION_STRING " v1.13.0"
#define VERSION_STRING_NOSP "v1.15.1"
#define VERSION_STRING " v1.15.1"
#endif // VPX_VERSION_H_
+11 -10
@@ -1,3 +1,13 @@
/*
* Copyright (c) 2025 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef VP8_RTCD_H_
#define VP8_RTCD_H_
@@ -45,15 +55,6 @@ void vp8_bilinear_predict8x8_sse2(unsigned char *src_ptr, int src_pixels_per_lin
void vp8_bilinear_predict8x8_ssse3(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_bilinear_predict8x8)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_blend_b_c(unsigned char *y, unsigned char *u, unsigned char *v, int y_1, int u_1, int v_1, int alpha, int stride);
#define vp8_blend_b vp8_blend_b_c
void vp8_blend_mb_inner_c(unsigned char *y, unsigned char *u, unsigned char *v, int y_1, int u_1, int v_1, int alpha, int stride);
#define vp8_blend_mb_inner vp8_blend_mb_inner_c
void vp8_blend_mb_outer_c(unsigned char *y, unsigned char *u, unsigned char *v, int y_1, int u_1, int v_1, int alpha, int stride);
#define vp8_blend_mb_outer vp8_blend_mb_outer_c
int vp8_block_error_c(short *coeff, short *dqcoeff);
int vp8_block_error_sse2(short *coeff, short *dqcoeff);
RTCD_EXTERN int (*vp8_block_error)(short *coeff, short *dqcoeff);
@@ -329,4 +330,4 @@ static void setup_rtcd_internal(void)
} // extern "C"
#endif
#endif
#endif // VP8_RTCD_H_
+31 -22
@@ -1,3 +1,13 @@
/*
* Copyright (c) 2025 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef VP9_RTCD_H_
#define VP9_RTCD_H_
@@ -21,7 +31,9 @@ struct macroblockd;
/* Encoder forward decls */
struct macroblock;
struct vp9_variance_vtable;
struct macroblock_plane;
struct vp9_sad_table;
struct ScanOrder;
struct search_site_config;
struct mv;
union int_mv;
@@ -45,9 +57,8 @@ int64_t vp9_block_error_fp_sse2(const tran_low_t *coeff, const tran_low_t *dqcoe
int64_t vp9_block_error_fp_avx2(const tran_low_t *coeff, const tran_low_t *dqcoeff, int block_size);
RTCD_EXTERN int64_t (*vp9_block_error_fp)(const tran_low_t *coeff, const tran_low_t *dqcoeff, int block_size);
int vp9_diamond_search_sad_c(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv);
int vp9_diamond_search_sad_avx(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv);
RTCD_EXTERN int (*vp9_diamond_search_sad)(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv);
int vp9_diamond_search_sad_c(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, uint32_t start_mv_sad, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_sad_table *sad_fn_ptr, const struct mv *center_mv);
#define vp9_diamond_search_sad vp9_diamond_search_sad_c
void vp9_fht16x16_c(const int16_t *input, tran_low_t *output, int stride, int tx_type);
void vp9_fht16x16_sse2(const int16_t *input, tran_low_t *output, int stride, int tx_type);
@@ -97,13 +108,13 @@ void vp9_highbd_iht8x8_64_add_c(const tran_low_t *input, uint16_t *dest, int str
void vp9_highbd_iht8x8_64_add_sse4_1(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
RTCD_EXTERN void (*vp9_highbd_iht8x8_64_add)(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
void vp9_highbd_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_highbd_quantize_fp_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vp9_highbd_quantize_fp)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_highbd_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_quantize_fp_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_highbd_quantize_fp)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_highbd_quantize_fp_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vp9_highbd_quantize_fp_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_highbd_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_quantize_fp_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_highbd_quantize_fp_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_temporal_filter_apply_c(const uint8_t *frame1, unsigned int stride, const uint8_t *frame2, unsigned int block_width, unsigned int block_height, int strength, int *blk_fw, int use_32x32, uint32_t *accumulator, uint16_t *count);
#define vp9_highbd_temporal_filter_apply vp9_highbd_temporal_filter_apply_c
@@ -120,16 +131,16 @@ void vp9_iht8x8_64_add_c(const tran_low_t *input, uint8_t *dest, int stride, int
void vp9_iht8x8_64_add_sse2(const tran_low_t *input, uint8_t *dest, int stride, int tx_type);
RTCD_EXTERN void (*vp9_iht8x8_64_add)(const tran_low_t *input, uint8_t *dest, int stride, int tx_type);
void vp9_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_quantize_fp_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_quantize_fp_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_quantize_fp_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vp9_quantize_fp)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_quantize_fp)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_quantize_fp_32x32_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_quantize_fp_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vp9_quantize_fp_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_32x32_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_quantize_fp_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_scale_and_extend_frame_c(const struct yv12_buffer_config *src, struct yv12_buffer_config *dst, INTERP_FILTER filter_type, int phase_scaler);
void vp9_scale_and_extend_frame_ssse3(const struct yv12_buffer_config *src, struct yv12_buffer_config *dst, INTERP_FILTER filter_type, int phase_scaler);
@@ -153,8 +164,6 @@ static void setup_rtcd_internal(void)
vp9_block_error_fp = vp9_block_error_fp_c;
if (flags & HAS_SSE2) vp9_block_error_fp = vp9_block_error_fp_sse2;
if (flags & HAS_AVX2) vp9_block_error_fp = vp9_block_error_fp_avx2;
vp9_diamond_search_sad = vp9_diamond_search_sad_c;
if (flags & HAS_AVX) vp9_diamond_search_sad = vp9_diamond_search_sad_avx;
vp9_fht16x16 = vp9_fht16x16_c;
if (flags & HAS_SSE2) vp9_fht16x16 = vp9_fht16x16_sse2;
vp9_fht4x4 = vp9_fht4x4_c;
@@ -199,4 +208,4 @@ static void setup_rtcd_internal(void)
} // extern "C"
#endif
#endif
#endif // VP9_RTCD_H_
+490 -109
@@ -1,3 +1,13 @@
/*
* Copyright (c) 2025 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef VPX_DSP_RTCD_H_
#define VPX_DSP_RTCD_H_
@@ -15,6 +25,10 @@
#include "vpx/vpx_integer.h"
#include "vpx_dsp/vpx_dsp_common.h"
#include "vpx_dsp/vpx_filter.h"
#if CONFIG_VP9_ENCODER
struct macroblock_plane;
struct ScanOrder;
#endif
#ifdef __cplusplus
@@ -31,6 +45,7 @@ RTCD_EXTERN unsigned int (*vpx_avg_8x8)(const uint8_t *, int p);
void vpx_comp_avg_pred_c(uint8_t *comp_pred, const uint8_t *pred, int width, int height, const uint8_t *ref, int ref_stride);
void vpx_comp_avg_pred_sse2(uint8_t *comp_pred, const uint8_t *pred, int width, int height, const uint8_t *ref, int ref_stride);
void vpx_comp_avg_pred_avx2(uint8_t *comp_pred, const uint8_t *pred, int width, int height, const uint8_t *ref, int ref_stride);
RTCD_EXTERN void (*vpx_comp_avg_pred)(uint8_t *comp_pred, const uint8_t *pred, int width, int height, const uint8_t *ref, int ref_stride);
void vpx_convolve8_c(const uint8_t *src, ptrdiff_t src_stride, uint8_t *dst, ptrdiff_t dst_stride, const InterpKernel *filter, int x0_q4, int x_step_q4, int y0_q4, int y_step_q4, int w, int h);
@@ -1214,15 +1229,15 @@ RTCD_EXTERN void (*vpx_highbd_lpf_vertical_8_dual)(uint16_t *s, int pitch, const
void vpx_highbd_minmax_8x8_c(const uint8_t *s8, int p, const uint8_t *d8, int dp, int *min, int *max);
#define vpx_highbd_minmax_8x8 vpx_highbd_minmax_8x8_c
void vpx_highbd_quantize_b_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_highbd_quantize_b_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_highbd_quantize_b_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vpx_highbd_quantize_b)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_highbd_quantize_b_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_highbd_quantize_b_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_highbd_quantize_b_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vpx_highbd_quantize_b)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_highbd_quantize_b_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_highbd_quantize_b_32x32_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_highbd_quantize_b_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vpx_highbd_quantize_b_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_highbd_quantize_b_32x32_c(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_highbd_quantize_b_32x32_sse2(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_highbd_quantize_b_32x32_avx2(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vpx_highbd_quantize_b_32x32)(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
unsigned int vpx_highbd_sad16x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad16x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1234,10 +1249,10 @@ unsigned int vpx_highbd_sad16x16_avg_sse2(const uint8_t *src_ptr, int src_stride
unsigned int vpx_highbd_sad16x16_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad16x16_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad16x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x16x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad16x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x16x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad16x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad16x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad16x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1249,10 +1264,10 @@ unsigned int vpx_highbd_sad16x32_avg_sse2(const uint8_t *src_ptr, int src_stride
unsigned int vpx_highbd_sad16x32_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad16x32_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad16x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad16x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad16x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad16x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad16x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1264,10 +1279,10 @@ unsigned int vpx_highbd_sad16x8_avg_sse2(const uint8_t *src_ptr, int src_stride,
unsigned int vpx_highbd_sad16x8_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad16x8_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad16x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x8x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad16x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x8x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad16x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad32x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad32x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1279,10 +1294,10 @@ unsigned int vpx_highbd_sad32x16_avg_sse2(const uint8_t *src_ptr, int src_stride
unsigned int vpx_highbd_sad32x16_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad32x16_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad32x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x16x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad32x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x16x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad32x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad32x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad32x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1294,10 +1309,10 @@ unsigned int vpx_highbd_sad32x32_avg_sse2(const uint8_t *src_ptr, int src_stride
unsigned int vpx_highbd_sad32x32_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad32x32_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad32x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad32x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad32x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad32x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad32x64_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1309,10 +1324,10 @@ unsigned int vpx_highbd_sad32x64_avg_sse2(const uint8_t *src_ptr, int src_stride
unsigned int vpx_highbd_sad32x64_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad32x64_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad32x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad32x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad32x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad4x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_highbd_sad4x4 vpx_highbd_sad4x4_c
@@ -1320,9 +1335,9 @@ unsigned int vpx_highbd_sad4x4_c(const uint8_t *src_ptr, int src_stride, const u
unsigned int vpx_highbd_sad4x4_avg_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
#define vpx_highbd_sad4x4_avg vpx_highbd_sad4x4_avg_c
void vpx_highbd_sad4x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad4x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad4x4x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad4x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad4x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad4x4x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad4x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_highbd_sad4x8 vpx_highbd_sad4x8_c
@@ -1330,9 +1345,9 @@ unsigned int vpx_highbd_sad4x8_c(const uint8_t *src_ptr, int src_stride, const u
unsigned int vpx_highbd_sad4x8_avg_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
#define vpx_highbd_sad4x8_avg vpx_highbd_sad4x8_avg_c
void vpx_highbd_sad4x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad4x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad4x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad4x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad4x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad4x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad64x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad64x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1344,10 +1359,10 @@ unsigned int vpx_highbd_sad64x32_avg_sse2(const uint8_t *src_ptr, int src_stride
unsigned int vpx_highbd_sad64x32_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad64x32_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad64x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad64x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad64x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad64x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad64x64_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1359,10 +1374,10 @@ unsigned int vpx_highbd_sad64x64_avg_sse2(const uint8_t *src_ptr, int src_stride
unsigned int vpx_highbd_sad64x64_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad64x64_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad64x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad64x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad64x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad8x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad8x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1372,9 +1387,9 @@ unsigned int vpx_highbd_sad8x16_avg_c(const uint8_t *src_ptr, int src_stride, co
unsigned int vpx_highbd_sad8x16_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad8x16_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad8x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad8x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad8x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad8x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad8x4_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1384,9 +1399,9 @@ unsigned int vpx_highbd_sad8x4_avg_c(const uint8_t *src_ptr, int src_stride, con
unsigned int vpx_highbd_sad8x4_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad8x4_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad8x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad8x4x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad8x4x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad8x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad8x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1396,14 +1411,134 @@ unsigned int vpx_highbd_sad8x8_avg_c(const uint8_t *src_ptr, int src_stride, con
unsigned int vpx_highbd_sad8x8_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad8x8_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad8x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad8x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad8x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_16x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_16x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_16x16_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_16x16)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_16x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_16x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_16x16x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_16x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_16x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_16x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_16x32_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_16x32)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_16x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_16x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_16x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_16x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_16x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_16x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_16x8_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_16x8)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_16x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_16x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_16x8x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_16x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_32x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_32x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_32x16_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_32x16)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_32x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_32x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_32x16x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_32x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_32x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_32x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_32x32_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_32x32)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_32x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_32x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_32x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_32x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_32x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_32x64_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_32x64_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_32x64)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_32x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_32x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_32x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_32x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_4x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_highbd_sad_skip_4x4 vpx_highbd_sad_skip_4x4_c
void vpx_highbd_sad_skip_4x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_highbd_sad_skip_4x4x4d vpx_highbd_sad_skip_4x4x4d_c
unsigned int vpx_highbd_sad_skip_4x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_highbd_sad_skip_4x8 vpx_highbd_sad_skip_4x8_c
void vpx_highbd_sad_skip_4x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_4x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_4x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_64x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_64x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_64x32_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_64x32)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_64x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_64x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_64x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_64x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_64x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_64x64_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_64x64_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_64x64)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_64x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_64x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_64x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_64x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_8x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_8x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_8x16)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_8x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_8x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_8x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_8x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_highbd_sad_skip_8x4 vpx_highbd_sad_skip_8x4_c
void vpx_highbd_sad_skip_8x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_highbd_sad_skip_8x4x4d vpx_highbd_sad_skip_8x4x4d_c
unsigned int vpx_highbd_sad_skip_8x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_8x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_8x8)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_8x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_8x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_8x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
int vpx_highbd_satd_c(const tran_low_t *coeff, int length);
int vpx_highbd_satd_avx2(const tran_low_t *coeff, int length);
RTCD_EXTERN int (*vpx_highbd_satd)(const tran_low_t *coeff, int length);
int64_t vpx_highbd_sse_c(const uint8_t *a8, int a_stride, const uint8_t *b8,int b_stride, int width, int height);
int64_t vpx_highbd_sse_sse4_1(const uint8_t *a8, int a_stride, const uint8_t *b8,int b_stride, int width, int height);
int64_t vpx_highbd_sse_avx2(const uint8_t *a8, int a_stride, const uint8_t *b8,int b_stride, int width, int height);
RTCD_EXTERN int64_t (*vpx_highbd_sse)(const uint8_t *a8, int a_stride, const uint8_t *b8,int b_stride, int width, int height);
void vpx_highbd_subtract_block_c(int rows, int cols, int16_t *diff_ptr, ptrdiff_t diff_stride, const uint8_t *src8_ptr, ptrdiff_t src_stride, const uint8_t *pred8_ptr, ptrdiff_t pred_stride, int bd);
void vpx_highbd_subtract_block_avx2(int rows, int cols, int16_t *diff_ptr, ptrdiff_t diff_stride, const uint8_t *src8_ptr, ptrdiff_t src_stride, const uint8_t *pred8_ptr, ptrdiff_t pred_stride, int bd);
RTCD_EXTERN void (*vpx_highbd_subtract_block)(int rows, int cols, int16_t *diff_ptr, ptrdiff_t diff_stride, const uint8_t *src8_ptr, ptrdiff_t src_stride, const uint8_t *pred8_ptr, ptrdiff_t pred_stride, int bd);
@@ -1450,6 +1585,7 @@ RTCD_EXTERN void (*vpx_idct16x16_1_add)(const tran_low_t *input, uint8_t *dest,
void vpx_idct16x16_256_add_c(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct16x16_256_add_sse2(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct16x16_256_add_avx2(const tran_low_t *input, uint8_t *dest, int stride);
RTCD_EXTERN void (*vpx_idct16x16_256_add)(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct16x16_38_add_c(const tran_low_t *input, uint8_t *dest, int stride);
@@ -1458,11 +1594,13 @@ RTCD_EXTERN void (*vpx_idct16x16_38_add)(const tran_low_t *input, uint8_t *dest,
void vpx_idct32x32_1024_add_c(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct32x32_1024_add_sse2(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct32x32_1024_add_avx2(const tran_low_t *input, uint8_t *dest, int stride);
RTCD_EXTERN void (*vpx_idct32x32_1024_add)(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct32x32_135_add_c(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct32x32_135_add_sse2(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct32x32_135_add_ssse3(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct32x32_135_add_avx2(const tran_low_t *input, uint8_t *dest, int stride);
RTCD_EXTERN void (*vpx_idct32x32_135_add)(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct32x32_1_add_c(const tran_low_t *input, uint8_t *dest, int stride);
@@ -1598,18 +1736,18 @@ void vpx_post_proc_down_and_across_mb_row_c(unsigned char *src, unsigned char *d
void vpx_post_proc_down_and_across_mb_row_sse2(unsigned char *src, unsigned char *dst, int src_pitch, int dst_pitch, int cols, unsigned char *flimits, int size);
RTCD_EXTERN void (*vpx_post_proc_down_and_across_mb_row)(unsigned char *src, unsigned char *dst, int src_pitch, int dst_pitch, int cols, unsigned char *flimits, int size);
void vpx_quantize_b_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_avx(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vpx_quantize_b)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_avx(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vpx_quantize_b)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_32x32_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_32x32_avx(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vpx_quantize_b_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_32x32_c(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_32x32_ssse3(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_32x32_avx(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_32x32_avx2(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vpx_quantize_b_32x32)(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
unsigned int vpx_sad16x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad16x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1619,9 +1757,9 @@ unsigned int vpx_sad16x16_avg_c(const uint8_t *src_ptr, int src_stride, const ui
unsigned int vpx_sad16x16_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad16x16_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad16x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad16x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad16x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad16x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad16x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1631,9 +1769,9 @@ unsigned int vpx_sad16x32_avg_c(const uint8_t *src_ptr, int src_stride, const ui
unsigned int vpx_sad16x32_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad16x32_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad16x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad16x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad16x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad16x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad16x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1643,9 +1781,9 @@ unsigned int vpx_sad16x8_avg_c(const uint8_t *src_ptr, int src_stride, const uin
unsigned int vpx_sad16x8_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad16x8_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad16x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad16x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad16x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad32x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad32x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1657,9 +1795,9 @@ unsigned int vpx_sad32x16_avg_sse2(const uint8_t *src_ptr, int src_stride, const
unsigned int vpx_sad32x16_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad32x16_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad32x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad32x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad32x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad32x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad32x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1671,10 +1809,10 @@ unsigned int vpx_sad32x32_avg_sse2(const uint8_t *src_ptr, int src_stride, const
unsigned int vpx_sad32x32_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad32x32_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad32x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad32x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad32x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad32x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad32x64_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1686,9 +1824,9 @@ unsigned int vpx_sad32x64_avg_sse2(const uint8_t *src_ptr, int src_stride, const
unsigned int vpx_sad32x64_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad32x64_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad32x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad32x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad32x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad4x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad4x4_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1698,9 +1836,9 @@ unsigned int vpx_sad4x4_avg_c(const uint8_t *src_ptr, int src_stride, const uint
unsigned int vpx_sad4x4_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad4x4_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad4x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad4x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad4x4x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad4x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad4x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad4x4x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad4x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad4x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1710,9 +1848,9 @@ unsigned int vpx_sad4x8_avg_c(const uint8_t *src_ptr, int src_stride, const uint
unsigned int vpx_sad4x8_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad4x8_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad4x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad4x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad4x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad4x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad4x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad4x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad64x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad64x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1724,9 +1862,9 @@ unsigned int vpx_sad64x32_avg_sse2(const uint8_t *src_ptr, int src_stride, const
unsigned int vpx_sad64x32_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad64x32_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad64x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad64x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad64x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad64x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad64x64_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1738,11 +1876,11 @@ unsigned int vpx_sad64x64_avg_sse2(const uint8_t *src_ptr, int src_stride, const
unsigned int vpx_sad64x64_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad64x64_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad64x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x64x4d_avx512(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad64x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x64x4d_avx512(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad64x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad8x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad8x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1752,9 +1890,9 @@ unsigned int vpx_sad8x16_avg_c(const uint8_t *src_ptr, int src_stride, const uin
unsigned int vpx_sad8x16_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad8x16_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad8x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad8x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad8x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad8x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad8x4_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1764,9 +1902,9 @@ unsigned int vpx_sad8x4_avg_c(const uint8_t *src_ptr, int src_stride, const uint
unsigned int vpx_sad8x4_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad8x4_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad8x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad8x4x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad8x4x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad8x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad8x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1776,9 +1914,119 @@ unsigned int vpx_sad8x8_avg_c(const uint8_t *src_ptr, int src_stride, const uint
unsigned int vpx_sad8x8_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad8x8_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad8x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad8x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad8x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_16x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_16x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_16x16)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_16x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_16x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_16x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_16x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_16x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_16x32)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_16x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_16x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_16x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_16x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_16x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_16x8)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_16x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_16x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_16x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_32x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_32x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_32x16_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_32x16)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_32x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_32x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_32x16x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_32x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_32x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_32x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_32x32_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_32x32)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_32x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_32x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_32x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_32x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_32x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_32x64_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_32x64_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_32x64)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_32x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_32x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_32x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_32x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_4x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_sad_skip_4x4 vpx_sad_skip_4x4_c
void vpx_sad_skip_4x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad_skip_4x4x4d vpx_sad_skip_4x4x4d_c
unsigned int vpx_sad_skip_4x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_4x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_4x8)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_4x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_4x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_4x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_64x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_64x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_64x32_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_64x32)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_64x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_64x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_64x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_64x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_64x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_64x64_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_64x64_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_64x64)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_64x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_64x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_64x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_64x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_8x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_8x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_8x16)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_8x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_8x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_8x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_8x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_sad_skip_8x4 vpx_sad_skip_8x4_c
void vpx_sad_skip_8x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad_skip_8x4x4d vpx_sad_skip_8x4x4d_c
unsigned int vpx_sad_skip_8x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_8x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_8x8)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_8x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_8x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_8x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
int vpx_satd_c(const tran_low_t *coeff, int length);
int vpx_satd_sse2(const tran_low_t *coeff, int length);
@@ -1804,6 +2052,11 @@ void vpx_scaled_horiz_c(const uint8_t *src, ptrdiff_t src_stride, uint8_t *dst,
void vpx_scaled_vert_c(const uint8_t *src, ptrdiff_t src_stride, uint8_t *dst, ptrdiff_t dst_stride, const InterpKernel *filter, int x0_q4, int x_step_q4, int y0_q4, int y_step_q4, int w, int h);
#define vpx_scaled_vert vpx_scaled_vert_c
int64_t vpx_sse_c(const uint8_t *src, int src_stride, const uint8_t *ref, int ref_stride, int width, int height);
int64_t vpx_sse_sse4_1(const uint8_t *src, int src_stride, const uint8_t *ref, int ref_stride, int width, int height);
int64_t vpx_sse_avx2(const uint8_t *src, int src_stride, const uint8_t *ref, int ref_stride, int width, int height);
RTCD_EXTERN int64_t (*vpx_sse)(const uint8_t *src, int src_stride, const uint8_t *ref, int ref_stride, int width, int height);
uint32_t vpx_sub_pixel_avg_variance16x16_c(const uint8_t *src_ptr, int src_stride, int x_offset, int y_offset, const uint8_t *ref_ptr, int ref_stride, uint32_t *sse, const uint8_t *second_pred);
uint32_t vpx_sub_pixel_avg_variance16x16_sse2(const uint8_t *src_ptr, int src_stride, int x_offset, int y_offset, const uint8_t *ref_ptr, int ref_stride, uint32_t *sse, const uint8_t *second_pred);
uint32_t vpx_sub_pixel_avg_variance16x16_ssse3(const uint8_t *src_ptr, int src_stride, int x_offset, int y_offset, const uint8_t *ref_ptr, int ref_stride, uint32_t *sse, const uint8_t *second_pred);
@@ -2029,14 +2282,17 @@ RTCD_EXTERN unsigned int (*vpx_variance64x64)(const uint8_t *src_ptr, int src_st
unsigned int vpx_variance8x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
unsigned int vpx_variance8x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
unsigned int vpx_variance8x16_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
RTCD_EXTERN unsigned int (*vpx_variance8x16)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
unsigned int vpx_variance8x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
unsigned int vpx_variance8x4_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
unsigned int vpx_variance8x4_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
RTCD_EXTERN unsigned int (*vpx_variance8x4)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
unsigned int vpx_variance8x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
unsigned int vpx_variance8x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
unsigned int vpx_variance8x8_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
RTCD_EXTERN unsigned int (*vpx_variance8x8)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
void vpx_ve_predictor_4x4_c(uint8_t *dst, ptrdiff_t stride, const uint8_t *above, const uint8_t *left);
@@ -2062,6 +2318,7 @@ static void setup_rtcd_internal(void)
if (flags & HAS_SSE2) vpx_avg_8x8 = vpx_avg_8x8_sse2;
vpx_comp_avg_pred = vpx_comp_avg_pred_c;
if (flags & HAS_SSE2) vpx_comp_avg_pred = vpx_comp_avg_pred_sse2;
if (flags & HAS_AVX2) vpx_comp_avg_pred = vpx_comp_avg_pred_avx2;
vpx_convolve8 = vpx_convolve8_c;
if (flags & HAS_SSE2) vpx_convolve8 = vpx_convolve8_sse2;
if (flags & HAS_SSSE3) vpx_convolve8 = vpx_convolve8_ssse3;
@@ -2698,8 +2955,69 @@ static void setup_rtcd_internal(void)
if (flags & HAS_SSE2) vpx_highbd_sad8x8_avg = vpx_highbd_sad8x8_avg_sse2;
vpx_highbd_sad8x8x4d = vpx_highbd_sad8x8x4d_c;
if (flags & HAS_SSE2) vpx_highbd_sad8x8x4d = vpx_highbd_sad8x8x4d_sse2;
vpx_highbd_sad_skip_16x16 = vpx_highbd_sad_skip_16x16_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_16x16 = vpx_highbd_sad_skip_16x16_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_16x16 = vpx_highbd_sad_skip_16x16_avx2;
vpx_highbd_sad_skip_16x16x4d = vpx_highbd_sad_skip_16x16x4d_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_16x16x4d = vpx_highbd_sad_skip_16x16x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_16x16x4d = vpx_highbd_sad_skip_16x16x4d_avx2;
vpx_highbd_sad_skip_16x32 = vpx_highbd_sad_skip_16x32_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_16x32 = vpx_highbd_sad_skip_16x32_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_16x32 = vpx_highbd_sad_skip_16x32_avx2;
vpx_highbd_sad_skip_16x32x4d = vpx_highbd_sad_skip_16x32x4d_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_16x32x4d = vpx_highbd_sad_skip_16x32x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_16x32x4d = vpx_highbd_sad_skip_16x32x4d_avx2;
vpx_highbd_sad_skip_16x8 = vpx_highbd_sad_skip_16x8_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_16x8 = vpx_highbd_sad_skip_16x8_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_16x8 = vpx_highbd_sad_skip_16x8_avx2;
vpx_highbd_sad_skip_16x8x4d = vpx_highbd_sad_skip_16x8x4d_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_16x8x4d = vpx_highbd_sad_skip_16x8x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_16x8x4d = vpx_highbd_sad_skip_16x8x4d_avx2;
vpx_highbd_sad_skip_32x16 = vpx_highbd_sad_skip_32x16_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_32x16 = vpx_highbd_sad_skip_32x16_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_32x16 = vpx_highbd_sad_skip_32x16_avx2;
vpx_highbd_sad_skip_32x16x4d = vpx_highbd_sad_skip_32x16x4d_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_32x16x4d = vpx_highbd_sad_skip_32x16x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_32x16x4d = vpx_highbd_sad_skip_32x16x4d_avx2;
vpx_highbd_sad_skip_32x32 = vpx_highbd_sad_skip_32x32_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_32x32 = vpx_highbd_sad_skip_32x32_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_32x32 = vpx_highbd_sad_skip_32x32_avx2;
vpx_highbd_sad_skip_32x32x4d = vpx_highbd_sad_skip_32x32x4d_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_32x32x4d = vpx_highbd_sad_skip_32x32x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_32x32x4d = vpx_highbd_sad_skip_32x32x4d_avx2;
vpx_highbd_sad_skip_32x64 = vpx_highbd_sad_skip_32x64_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_32x64 = vpx_highbd_sad_skip_32x64_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_32x64 = vpx_highbd_sad_skip_32x64_avx2;
vpx_highbd_sad_skip_32x64x4d = vpx_highbd_sad_skip_32x64x4d_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_32x64x4d = vpx_highbd_sad_skip_32x64x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_32x64x4d = vpx_highbd_sad_skip_32x64x4d_avx2;
vpx_highbd_sad_skip_4x8x4d = vpx_highbd_sad_skip_4x8x4d_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_4x8x4d = vpx_highbd_sad_skip_4x8x4d_sse2;
vpx_highbd_sad_skip_64x32 = vpx_highbd_sad_skip_64x32_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_64x32 = vpx_highbd_sad_skip_64x32_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_64x32 = vpx_highbd_sad_skip_64x32_avx2;
vpx_highbd_sad_skip_64x32x4d = vpx_highbd_sad_skip_64x32x4d_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_64x32x4d = vpx_highbd_sad_skip_64x32x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_64x32x4d = vpx_highbd_sad_skip_64x32x4d_avx2;
vpx_highbd_sad_skip_64x64 = vpx_highbd_sad_skip_64x64_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_64x64 = vpx_highbd_sad_skip_64x64_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_64x64 = vpx_highbd_sad_skip_64x64_avx2;
vpx_highbd_sad_skip_64x64x4d = vpx_highbd_sad_skip_64x64x4d_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_64x64x4d = vpx_highbd_sad_skip_64x64x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_64x64x4d = vpx_highbd_sad_skip_64x64x4d_avx2;
vpx_highbd_sad_skip_8x16 = vpx_highbd_sad_skip_8x16_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_8x16 = vpx_highbd_sad_skip_8x16_sse2;
vpx_highbd_sad_skip_8x16x4d = vpx_highbd_sad_skip_8x16x4d_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_8x16x4d = vpx_highbd_sad_skip_8x16x4d_sse2;
vpx_highbd_sad_skip_8x8 = vpx_highbd_sad_skip_8x8_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_8x8 = vpx_highbd_sad_skip_8x8_sse2;
vpx_highbd_sad_skip_8x8x4d = vpx_highbd_sad_skip_8x8x4d_c;
if (flags & HAS_SSE2) vpx_highbd_sad_skip_8x8x4d = vpx_highbd_sad_skip_8x8x4d_sse2;
vpx_highbd_satd = vpx_highbd_satd_c;
if (flags & HAS_AVX2) vpx_highbd_satd = vpx_highbd_satd_avx2;
vpx_highbd_sse = vpx_highbd_sse_c;
if (flags & HAS_SSE4_1) vpx_highbd_sse = vpx_highbd_sse_sse4_1;
if (flags & HAS_AVX2) vpx_highbd_sse = vpx_highbd_sse_avx2;
vpx_highbd_subtract_block = vpx_highbd_subtract_block_c;
if (flags & HAS_AVX2) vpx_highbd_subtract_block = vpx_highbd_subtract_block_avx2;
vpx_highbd_tm_predictor_16x16 = vpx_highbd_tm_predictor_16x16_c;
@@ -2724,13 +3042,16 @@ static void setup_rtcd_internal(void)
if (flags & HAS_SSE2) vpx_idct16x16_1_add = vpx_idct16x16_1_add_sse2;
vpx_idct16x16_256_add = vpx_idct16x16_256_add_c;
if (flags & HAS_SSE2) vpx_idct16x16_256_add = vpx_idct16x16_256_add_sse2;
if (flags & HAS_AVX2) vpx_idct16x16_256_add = vpx_idct16x16_256_add_avx2;
vpx_idct16x16_38_add = vpx_idct16x16_38_add_c;
if (flags & HAS_SSE2) vpx_idct16x16_38_add = vpx_idct16x16_38_add_sse2;
vpx_idct32x32_1024_add = vpx_idct32x32_1024_add_c;
if (flags & HAS_SSE2) vpx_idct32x32_1024_add = vpx_idct32x32_1024_add_sse2;
if (flags & HAS_AVX2) vpx_idct32x32_1024_add = vpx_idct32x32_1024_add_avx2;
vpx_idct32x32_135_add = vpx_idct32x32_135_add_c;
if (flags & HAS_SSE2) vpx_idct32x32_135_add = vpx_idct32x32_135_add_sse2;
if (flags & HAS_SSSE3) vpx_idct32x32_135_add = vpx_idct32x32_135_add_ssse3;
if (flags & HAS_AVX2) vpx_idct32x32_135_add = vpx_idct32x32_135_add_avx2;
vpx_idct32x32_1_add = vpx_idct32x32_1_add_c;
if (flags & HAS_SSE2) vpx_idct32x32_1_add = vpx_idct32x32_1_add_sse2;
vpx_idct32x32_34_add = vpx_idct32x32_34_add_c;
@@ -2899,11 +3220,68 @@ static void setup_rtcd_internal(void)
if (flags & HAS_SSE2) vpx_sad8x8_avg = vpx_sad8x8_avg_sse2;
vpx_sad8x8x4d = vpx_sad8x8x4d_c;
if (flags & HAS_SSE2) vpx_sad8x8x4d = vpx_sad8x8x4d_sse2;
vpx_sad_skip_16x16 = vpx_sad_skip_16x16_c;
if (flags & HAS_SSE2) vpx_sad_skip_16x16 = vpx_sad_skip_16x16_sse2;
vpx_sad_skip_16x16x4d = vpx_sad_skip_16x16x4d_c;
if (flags & HAS_SSE2) vpx_sad_skip_16x16x4d = vpx_sad_skip_16x16x4d_sse2;
vpx_sad_skip_16x32 = vpx_sad_skip_16x32_c;
if (flags & HAS_SSE2) vpx_sad_skip_16x32 = vpx_sad_skip_16x32_sse2;
vpx_sad_skip_16x32x4d = vpx_sad_skip_16x32x4d_c;
if (flags & HAS_SSE2) vpx_sad_skip_16x32x4d = vpx_sad_skip_16x32x4d_sse2;
vpx_sad_skip_16x8 = vpx_sad_skip_16x8_c;
if (flags & HAS_SSE2) vpx_sad_skip_16x8 = vpx_sad_skip_16x8_sse2;
vpx_sad_skip_16x8x4d = vpx_sad_skip_16x8x4d_c;
if (flags & HAS_SSE2) vpx_sad_skip_16x8x4d = vpx_sad_skip_16x8x4d_sse2;
vpx_sad_skip_32x16 = vpx_sad_skip_32x16_c;
if (flags & HAS_SSE2) vpx_sad_skip_32x16 = vpx_sad_skip_32x16_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_32x16 = vpx_sad_skip_32x16_avx2;
vpx_sad_skip_32x16x4d = vpx_sad_skip_32x16x4d_c;
if (flags & HAS_SSE2) vpx_sad_skip_32x16x4d = vpx_sad_skip_32x16x4d_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_32x16x4d = vpx_sad_skip_32x16x4d_avx2;
vpx_sad_skip_32x32 = vpx_sad_skip_32x32_c;
if (flags & HAS_SSE2) vpx_sad_skip_32x32 = vpx_sad_skip_32x32_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_32x32 = vpx_sad_skip_32x32_avx2;
vpx_sad_skip_32x32x4d = vpx_sad_skip_32x32x4d_c;
if (flags & HAS_SSE2) vpx_sad_skip_32x32x4d = vpx_sad_skip_32x32x4d_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_32x32x4d = vpx_sad_skip_32x32x4d_avx2;
vpx_sad_skip_32x64 = vpx_sad_skip_32x64_c;
if (flags & HAS_SSE2) vpx_sad_skip_32x64 = vpx_sad_skip_32x64_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_32x64 = vpx_sad_skip_32x64_avx2;
vpx_sad_skip_32x64x4d = vpx_sad_skip_32x64x4d_c;
if (flags & HAS_SSE2) vpx_sad_skip_32x64x4d = vpx_sad_skip_32x64x4d_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_32x64x4d = vpx_sad_skip_32x64x4d_avx2;
vpx_sad_skip_4x8 = vpx_sad_skip_4x8_c;
if (flags & HAS_SSE2) vpx_sad_skip_4x8 = vpx_sad_skip_4x8_sse2;
vpx_sad_skip_4x8x4d = vpx_sad_skip_4x8x4d_c;
if (flags & HAS_SSE2) vpx_sad_skip_4x8x4d = vpx_sad_skip_4x8x4d_sse2;
vpx_sad_skip_64x32 = vpx_sad_skip_64x32_c;
if (flags & HAS_SSE2) vpx_sad_skip_64x32 = vpx_sad_skip_64x32_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_64x32 = vpx_sad_skip_64x32_avx2;
vpx_sad_skip_64x32x4d = vpx_sad_skip_64x32x4d_c;
if (flags & HAS_SSE2) vpx_sad_skip_64x32x4d = vpx_sad_skip_64x32x4d_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_64x32x4d = vpx_sad_skip_64x32x4d_avx2;
vpx_sad_skip_64x64 = vpx_sad_skip_64x64_c;
if (flags & HAS_SSE2) vpx_sad_skip_64x64 = vpx_sad_skip_64x64_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_64x64 = vpx_sad_skip_64x64_avx2;
vpx_sad_skip_64x64x4d = vpx_sad_skip_64x64x4d_c;
if (flags & HAS_SSE2) vpx_sad_skip_64x64x4d = vpx_sad_skip_64x64x4d_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_64x64x4d = vpx_sad_skip_64x64x4d_avx2;
vpx_sad_skip_8x16 = vpx_sad_skip_8x16_c;
if (flags & HAS_SSE2) vpx_sad_skip_8x16 = vpx_sad_skip_8x16_sse2;
vpx_sad_skip_8x16x4d = vpx_sad_skip_8x16x4d_c;
if (flags & HAS_SSE2) vpx_sad_skip_8x16x4d = vpx_sad_skip_8x16x4d_sse2;
vpx_sad_skip_8x8 = vpx_sad_skip_8x8_c;
if (flags & HAS_SSE2) vpx_sad_skip_8x8 = vpx_sad_skip_8x8_sse2;
vpx_sad_skip_8x8x4d = vpx_sad_skip_8x8x4d_c;
if (flags & HAS_SSE2) vpx_sad_skip_8x8x4d = vpx_sad_skip_8x8x4d_sse2;
vpx_satd = vpx_satd_c;
if (flags & HAS_SSE2) vpx_satd = vpx_satd_sse2;
if (flags & HAS_AVX2) vpx_satd = vpx_satd_avx2;
vpx_scaled_2d = vpx_scaled_2d_c;
if (flags & HAS_SSSE3) vpx_scaled_2d = vpx_scaled_2d_ssse3;
vpx_sse = vpx_sse_c;
if (flags & HAS_SSE4_1) vpx_sse = vpx_sse_sse4_1;
if (flags & HAS_AVX2) vpx_sse = vpx_sse_avx2;
vpx_sub_pixel_avg_variance16x16 = vpx_sub_pixel_avg_variance16x16_c;
if (flags & HAS_SSE2) vpx_sub_pixel_avg_variance16x16 = vpx_sub_pixel_avg_variance16x16_sse2;
if (flags & HAS_SSSE3) vpx_sub_pixel_avg_variance16x16 = vpx_sub_pixel_avg_variance16x16_ssse3;
@@ -3037,10 +3415,13 @@ static void setup_rtcd_internal(void)
if (flags & HAS_AVX2) vpx_variance64x64 = vpx_variance64x64_avx2;
vpx_variance8x16 = vpx_variance8x16_c;
if (flags & HAS_SSE2) vpx_variance8x16 = vpx_variance8x16_sse2;
if (flags & HAS_AVX2) vpx_variance8x16 = vpx_variance8x16_avx2;
vpx_variance8x4 = vpx_variance8x4_c;
if (flags & HAS_SSE2) vpx_variance8x4 = vpx_variance8x4_sse2;
if (flags & HAS_AVX2) vpx_variance8x4 = vpx_variance8x4_avx2;
vpx_variance8x8 = vpx_variance8x8_c;
if (flags & HAS_SSE2) vpx_variance8x8 = vpx_variance8x8_sse2;
if (flags & HAS_AVX2) vpx_variance8x8 = vpx_variance8x8_avx2;
vpx_vector_var = vpx_vector_var_c;
if (flags & HAS_SSE2) vpx_vector_var = vpx_vector_var_sse2;
}
@@ -3050,4 +3431,4 @@ static void setup_rtcd_internal(void)
} // extern "C"
#endif
#endif
#endif // VPX_DSP_RTCD_H_
+11 -1
@@ -1,3 +1,13 @@
/*
* Copyright (c) 2025 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef VPX_SCALE_RTCD_H_
#define VPX_SCALE_RTCD_H_
@@ -70,4 +80,4 @@ static void setup_rtcd_internal(void)
} // extern "C"
#endif
#endif
#endif // VPX_SCALE_RTCD_H_
+11 -10
@@ -1,3 +1,13 @@
/*
* Copyright (c) 2025 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef VP8_RTCD_H_
#define VP8_RTCD_H_
@@ -45,15 +55,6 @@ void vp8_bilinear_predict8x8_sse2(unsigned char *src_ptr, int src_pixels_per_lin
void vp8_bilinear_predict8x8_ssse3(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
RTCD_EXTERN void (*vp8_bilinear_predict8x8)(unsigned char *src_ptr, int src_pixels_per_line, int xoffset, int yoffset, unsigned char *dst_ptr, int dst_pitch);
void vp8_blend_b_c(unsigned char *y, unsigned char *u, unsigned char *v, int y_1, int u_1, int v_1, int alpha, int stride);
#define vp8_blend_b vp8_blend_b_c
void vp8_blend_mb_inner_c(unsigned char *y, unsigned char *u, unsigned char *v, int y_1, int u_1, int v_1, int alpha, int stride);
#define vp8_blend_mb_inner vp8_blend_mb_inner_c
void vp8_blend_mb_outer_c(unsigned char *y, unsigned char *u, unsigned char *v, int y_1, int u_1, int v_1, int alpha, int stride);
#define vp8_blend_mb_outer vp8_blend_mb_outer_c
int vp8_block_error_c(short *coeff, short *dqcoeff);
int vp8_block_error_sse2(short *coeff, short *dqcoeff);
#define vp8_block_error vp8_block_error_sse2
@@ -254,4 +255,4 @@ static void setup_rtcd_internal(void)
} // extern "C"
#endif
#endif
#endif // VP8_RTCD_H_
+31 -22
@@ -1,3 +1,13 @@
/*
* Copyright (c) 2025 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef VP9_RTCD_H_
#define VP9_RTCD_H_
@@ -21,7 +31,9 @@ struct macroblockd;
/* Encoder forward decls */
struct macroblock;
struct vp9_variance_vtable;
struct macroblock_plane;
struct vp9_sad_table;
struct ScanOrder;
struct search_site_config;
struct mv;
union int_mv;
@@ -45,9 +57,8 @@ int64_t vp9_block_error_fp_sse2(const tran_low_t *coeff, const tran_low_t *dqcoe
int64_t vp9_block_error_fp_avx2(const tran_low_t *coeff, const tran_low_t *dqcoeff, int block_size);
RTCD_EXTERN int64_t (*vp9_block_error_fp)(const tran_low_t *coeff, const tran_low_t *dqcoeff, int block_size);
int vp9_diamond_search_sad_c(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv);
int vp9_diamond_search_sad_avx(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv);
RTCD_EXTERN int (*vp9_diamond_search_sad)(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_variance_vtable *fn_ptr, const struct mv *center_mv);
int vp9_diamond_search_sad_c(const struct macroblock *x, const struct search_site_config *cfg, struct mv *ref_mv, uint32_t start_mv_sad, struct mv *best_mv, int search_param, int sad_per_bit, int *num00, const struct vp9_sad_table *sad_fn_ptr, const struct mv *center_mv);
#define vp9_diamond_search_sad vp9_diamond_search_sad_c
void vp9_fht16x16_c(const int16_t *input, tran_low_t *output, int stride, int tx_type);
void vp9_fht16x16_sse2(const int16_t *input, tran_low_t *output, int stride, int tx_type);
@@ -97,13 +108,13 @@ void vp9_highbd_iht8x8_64_add_c(const tran_low_t *input, uint16_t *dest, int str
void vp9_highbd_iht8x8_64_add_sse4_1(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
RTCD_EXTERN void (*vp9_highbd_iht8x8_64_add)(const tran_low_t *input, uint16_t *dest, int stride, int tx_type, int bd);
void vp9_highbd_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_highbd_quantize_fp_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vp9_highbd_quantize_fp)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_highbd_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_quantize_fp_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_highbd_quantize_fp)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_highbd_quantize_fp_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vp9_highbd_quantize_fp_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_highbd_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_quantize_fp_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_highbd_quantize_fp_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_highbd_temporal_filter_apply_c(const uint8_t *frame1, unsigned int stride, const uint8_t *frame2, unsigned int block_width, unsigned int block_height, int strength, int *blk_fw, int use_32x32, uint32_t *accumulator, uint16_t *count);
#define vp9_highbd_temporal_filter_apply vp9_highbd_temporal_filter_apply_c
@@ -120,16 +131,16 @@ void vp9_iht8x8_64_add_c(const tran_low_t *input, uint8_t *dest, int stride, int
void vp9_iht8x8_64_add_sse2(const tran_low_t *input, uint8_t *dest, int stride, int tx_type);
#define vp9_iht8x8_64_add vp9_iht8x8_64_add_sse2
void vp9_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_quantize_fp_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_quantize_fp_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_quantize_fp_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vp9_quantize_fp)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_quantize_fp_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_quantize_fp)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_quantize_fp_32x32_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_quantize_fp_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vp9_quantize_fp_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *round_ptr, const int16_t *quant_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vp9_quantize_fp_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_32x32_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_quantize_fp_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vp9_quantize_fp_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vp9_scale_and_extend_frame_c(const struct yv12_buffer_config *src, struct yv12_buffer_config *dst, INTERP_FILTER filter_type, int phase_scaler);
void vp9_scale_and_extend_frame_ssse3(const struct yv12_buffer_config *src, struct yv12_buffer_config *dst, INTERP_FILTER filter_type, int phase_scaler);
@@ -151,8 +162,6 @@ static void setup_rtcd_internal(void)
if (flags & HAS_AVX2) vp9_block_error = vp9_block_error_avx2;
vp9_block_error_fp = vp9_block_error_fp_sse2;
if (flags & HAS_AVX2) vp9_block_error_fp = vp9_block_error_fp_avx2;
vp9_diamond_search_sad = vp9_diamond_search_sad_c;
if (flags & HAS_AVX) vp9_diamond_search_sad = vp9_diamond_search_sad_avx;
vp9_highbd_apply_temporal_filter = vp9_highbd_apply_temporal_filter_c;
if (flags & HAS_SSE4_1) vp9_highbd_apply_temporal_filter = vp9_highbd_apply_temporal_filter_sse4_1;
vp9_highbd_iht16x16_256_add = vp9_highbd_iht16x16_256_add_c;
@@ -180,4 +189,4 @@ static void setup_rtcd_internal(void)
} // extern "C"
#endif
#endif
#endif // VP9_RTCD_H_
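Note on the `setup_rtcd_internal` hunks above: each dispatch pointer is first assigned a baseline implementation and then overwritten when a stronger CPU feature flag is set, so the last matching `if` wins. A minimal sketch of that pattern follows. Everything prefixed `demo_` or `DEMO_` is hypothetical and for illustration only; the kernels are toy scalar functions, and the flag values are not the real libvpx `HAS_*` constants.

```c
#include <stdio.h>

/* Hypothetical feature bits (assumption: NOT the real libvpx HAS_* values). */
#define DEMO_HAS_SSE2 0x01
#define DEMO_HAS_AVX2 0x02

/* Toy kernels standing in for the _c / _sse2 / _avx2 variants.
 * All three return the same result; only the selected variant differs. */
static int demo_block_error_c(int a, int b) { int d = a - b; return d * d; }
static int demo_block_error_sse2(int a, int b) { int d = a - b; return d * d; }
static int demo_block_error_avx2(int a, int b) { int d = a - b; return d * d; }

/* Dispatch pointer, analogous to an RTCD_EXTERN function pointer. */
static int (*demo_block_error)(int a, int b);

/* Records which variant was chosen: 0 = C, 1 = SSE2, 2 = AVX2. */
static int demo_selected_level;

/* Mirrors the shape of setup_rtcd_internal(): start from the baseline,
 * then override in increasing order of capability. */
static void demo_setup_rtcd(int flags) {
  demo_block_error = demo_block_error_c;
  demo_selected_level = 0;
  if (flags & DEMO_HAS_SSE2) {
    demo_block_error = demo_block_error_sse2;
    demo_selected_level = 1;
  }
  if (flags & DEMO_HAS_AVX2) {
    demo_block_error = demo_block_error_avx2;
    demo_selected_level = 2;
  }
}

/* Prints the selected level and one result for a given flag set. */
static void demo_report(int flags) {
  demo_setup_rtcd(flags);
  printf("flags=0x%02x level=%d result=%d\n", flags, demo_selected_level,
         demo_block_error(5, 3));
}
```

Callers are expected to run the setup routine once before using the pointer, for example `demo_report(DEMO_HAS_SSE2 | DEMO_HAS_AVX2);`. A symbol declared with `#define` in these headers (such as `vpx_highbd_sad4x4x4d`) skips this indirection because only one variant is selectable at build time.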
+426 -99
@@ -1,3 +1,13 @@
/*
* Copyright (c) 2025 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef VPX_DSP_RTCD_H_
#define VPX_DSP_RTCD_H_
@@ -15,6 +25,10 @@
#include "vpx/vpx_integer.h"
#include "vpx_dsp/vpx_dsp_common.h"
#include "vpx_dsp/vpx_filter.h"
#if CONFIG_VP9_ENCODER
struct macroblock_plane;
struct ScanOrder;
#endif
#ifdef __cplusplus
@@ -31,7 +45,8 @@ unsigned int vpx_avg_8x8_sse2(const uint8_t *, int p);
void vpx_comp_avg_pred_c(uint8_t *comp_pred, const uint8_t *pred, int width, int height, const uint8_t *ref, int ref_stride);
void vpx_comp_avg_pred_sse2(uint8_t *comp_pred, const uint8_t *pred, int width, int height, const uint8_t *ref, int ref_stride);
#define vpx_comp_avg_pred vpx_comp_avg_pred_sse2
void vpx_comp_avg_pred_avx2(uint8_t *comp_pred, const uint8_t *pred, int width, int height, const uint8_t *ref, int ref_stride);
RTCD_EXTERN void (*vpx_comp_avg_pred)(uint8_t *comp_pred, const uint8_t *pred, int width, int height, const uint8_t *ref, int ref_stride);
void vpx_convolve8_c(const uint8_t *src, ptrdiff_t src_stride, uint8_t *dst, ptrdiff_t dst_stride, const InterpKernel *filter, int x0_q4, int x_step_q4, int y0_q4, int y_step_q4, int w, int h);
void vpx_convolve8_sse2(const uint8_t *src, ptrdiff_t src_stride, uint8_t *dst, ptrdiff_t dst_stride, const InterpKernel *filter, int x0_q4, int x_step_q4, int y0_q4, int y_step_q4, int w, int h);
@@ -1221,15 +1236,15 @@ void vpx_highbd_lpf_vertical_8_dual_sse2(uint16_t *s, int pitch, const uint8_t *
void vpx_highbd_minmax_8x8_c(const uint8_t *s8, int p, const uint8_t *d8, int dp, int *min, int *max);
#define vpx_highbd_minmax_8x8 vpx_highbd_minmax_8x8_c
void vpx_highbd_quantize_b_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_highbd_quantize_b_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_highbd_quantize_b_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vpx_highbd_quantize_b)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_highbd_quantize_b_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_highbd_quantize_b_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_highbd_quantize_b_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vpx_highbd_quantize_b)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_highbd_quantize_b_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_highbd_quantize_b_32x32_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_highbd_quantize_b_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vpx_highbd_quantize_b_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_highbd_quantize_b_32x32_c(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_highbd_quantize_b_32x32_sse2(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_highbd_quantize_b_32x32_avx2(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vpx_highbd_quantize_b_32x32)(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
unsigned int vpx_highbd_sad16x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad16x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1241,10 +1256,10 @@ unsigned int vpx_highbd_sad16x16_avg_sse2(const uint8_t *src_ptr, int src_stride
unsigned int vpx_highbd_sad16x16_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad16x16_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad16x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x16x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad16x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x16x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad16x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad16x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad16x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1256,10 +1271,10 @@ unsigned int vpx_highbd_sad16x32_avg_sse2(const uint8_t *src_ptr, int src_stride
unsigned int vpx_highbd_sad16x32_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad16x32_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad16x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad16x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad16x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad16x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad16x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1271,10 +1286,10 @@ unsigned int vpx_highbd_sad16x8_avg_sse2(const uint8_t *src_ptr, int src_stride,
unsigned int vpx_highbd_sad16x8_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad16x8_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad16x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x8x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad16x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad16x8x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad16x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad32x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad32x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1286,10 +1301,10 @@ unsigned int vpx_highbd_sad32x16_avg_sse2(const uint8_t *src_ptr, int src_stride
unsigned int vpx_highbd_sad32x16_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad32x16_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad32x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x16x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad32x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x16x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad32x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad32x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad32x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1301,10 +1316,10 @@ unsigned int vpx_highbd_sad32x32_avg_sse2(const uint8_t *src_ptr, int src_stride
unsigned int vpx_highbd_sad32x32_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad32x32_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad32x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad32x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad32x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad32x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad32x64_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1316,10 +1331,10 @@ unsigned int vpx_highbd_sad32x64_avg_sse2(const uint8_t *src_ptr, int src_stride
unsigned int vpx_highbd_sad32x64_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad32x64_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad32x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad32x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad32x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad32x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad4x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_highbd_sad4x4 vpx_highbd_sad4x4_c
@@ -1327,8 +1342,8 @@ unsigned int vpx_highbd_sad4x4_c(const uint8_t *src_ptr, int src_stride, const u
unsigned int vpx_highbd_sad4x4_avg_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
#define vpx_highbd_sad4x4_avg vpx_highbd_sad4x4_avg_c
void vpx_highbd_sad4x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad4x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad4x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad4x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_highbd_sad4x4x4d vpx_highbd_sad4x4x4d_sse2
unsigned int vpx_highbd_sad4x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1337,8 +1352,8 @@ unsigned int vpx_highbd_sad4x8_c(const uint8_t *src_ptr, int src_stride, const u
unsigned int vpx_highbd_sad4x8_avg_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
#define vpx_highbd_sad4x8_avg vpx_highbd_sad4x8_avg_c
void vpx_highbd_sad4x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad4x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad4x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad4x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_highbd_sad4x8x4d vpx_highbd_sad4x8x4d_sse2
unsigned int vpx_highbd_sad64x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1351,10 +1366,10 @@ unsigned int vpx_highbd_sad64x32_avg_sse2(const uint8_t *src_ptr, int src_stride
unsigned int vpx_highbd_sad64x32_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad64x32_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad64x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad64x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad64x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad64x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad64x64_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1366,10 +1381,10 @@ unsigned int vpx_highbd_sad64x64_avg_sse2(const uint8_t *src_ptr, int src_stride
unsigned int vpx_highbd_sad64x64_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_highbd_sad64x64_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_highbd_sad64x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad64x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad64x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad64x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad8x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad8x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1379,8 +1394,8 @@ unsigned int vpx_highbd_sad8x16_avg_c(const uint8_t *src_ptr, int src_stride, co
unsigned int vpx_highbd_sad8x16_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
#define vpx_highbd_sad8x16_avg vpx_highbd_sad8x16_avg_sse2
void vpx_highbd_sad8x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_highbd_sad8x16x4d vpx_highbd_sad8x16x4d_sse2
unsigned int vpx_highbd_sad8x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1391,8 +1406,8 @@ unsigned int vpx_highbd_sad8x4_avg_c(const uint8_t *src_ptr, int src_stride, con
unsigned int vpx_highbd_sad8x4_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
#define vpx_highbd_sad8x4_avg vpx_highbd_sad8x4_avg_sse2
void vpx_highbd_sad8x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_highbd_sad8x4x4d vpx_highbd_sad8x4x4d_sse2
unsigned int vpx_highbd_sad8x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1403,14 +1418,134 @@ unsigned int vpx_highbd_sad8x8_avg_c(const uint8_t *src_ptr, int src_stride, con
unsigned int vpx_highbd_sad8x8_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
#define vpx_highbd_sad8x8_avg vpx_highbd_sad8x8_avg_sse2
void vpx_highbd_sad8x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t* const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad8x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_highbd_sad8x8x4d vpx_highbd_sad8x8x4d_sse2
unsigned int vpx_highbd_sad_skip_16x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_16x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_16x16_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_16x16)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_16x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_16x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_16x16x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_16x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_16x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_16x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_16x32_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_16x32)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_16x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_16x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_16x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_16x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_16x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_16x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_16x8_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_16x8)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_16x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_16x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_16x8x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_16x8x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_32x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_32x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_32x16_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_32x16)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_32x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_32x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_32x16x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_32x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_32x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_32x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_32x32_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_32x32)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_32x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_32x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_32x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_32x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_32x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_32x64_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_32x64_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_32x64)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_32x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_32x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_32x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_32x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_4x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_highbd_sad_skip_4x4 vpx_highbd_sad_skip_4x4_c
void vpx_highbd_sad_skip_4x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_highbd_sad_skip_4x4x4d vpx_highbd_sad_skip_4x4x4d_c
unsigned int vpx_highbd_sad_skip_4x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_highbd_sad_skip_4x8 vpx_highbd_sad_skip_4x8_c
void vpx_highbd_sad_skip_4x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_4x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_highbd_sad_skip_4x8x4d vpx_highbd_sad_skip_4x8x4d_sse2
unsigned int vpx_highbd_sad_skip_64x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_64x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_64x32_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_64x32)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_64x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_64x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_64x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_64x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_64x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_64x64_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_64x64_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_highbd_sad_skip_64x64)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_highbd_sad_skip_64x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_64x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_64x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_highbd_sad_skip_64x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_highbd_sad_skip_8x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_8x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_highbd_sad_skip_8x16 vpx_highbd_sad_skip_8x16_sse2
void vpx_highbd_sad_skip_8x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_8x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_highbd_sad_skip_8x16x4d vpx_highbd_sad_skip_8x16x4d_sse2
unsigned int vpx_highbd_sad_skip_8x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_highbd_sad_skip_8x4 vpx_highbd_sad_skip_8x4_c
void vpx_highbd_sad_skip_8x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_highbd_sad_skip_8x4x4d vpx_highbd_sad_skip_8x4x4d_c
unsigned int vpx_highbd_sad_skip_8x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_highbd_sad_skip_8x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_highbd_sad_skip_8x8 vpx_highbd_sad_skip_8x8_sse2
void vpx_highbd_sad_skip_8x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_highbd_sad_skip_8x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_highbd_sad_skip_8x8x4d vpx_highbd_sad_skip_8x8x4d_sse2
int vpx_highbd_satd_c(const tran_low_t *coeff, int length);
int vpx_highbd_satd_avx2(const tran_low_t *coeff, int length);
RTCD_EXTERN int (*vpx_highbd_satd)(const tran_low_t *coeff, int length);
int64_t vpx_highbd_sse_c(const uint8_t *a8, int a_stride, const uint8_t *b8,int b_stride, int width, int height);
int64_t vpx_highbd_sse_sse4_1(const uint8_t *a8, int a_stride, const uint8_t *b8,int b_stride, int width, int height);
int64_t vpx_highbd_sse_avx2(const uint8_t *a8, int a_stride, const uint8_t *b8,int b_stride, int width, int height);
RTCD_EXTERN int64_t (*vpx_highbd_sse)(const uint8_t *a8, int a_stride, const uint8_t *b8,int b_stride, int width, int height);
void vpx_highbd_subtract_block_c(int rows, int cols, int16_t *diff_ptr, ptrdiff_t diff_stride, const uint8_t *src8_ptr, ptrdiff_t src_stride, const uint8_t *pred8_ptr, ptrdiff_t pred_stride, int bd);
void vpx_highbd_subtract_block_avx2(int rows, int cols, int16_t *diff_ptr, ptrdiff_t diff_stride, const uint8_t *src8_ptr, ptrdiff_t src_stride, const uint8_t *pred8_ptr, ptrdiff_t pred_stride, int bd);
RTCD_EXTERN void (*vpx_highbd_subtract_block)(int rows, int cols, int16_t *diff_ptr, ptrdiff_t diff_stride, const uint8_t *src8_ptr, ptrdiff_t src_stride, const uint8_t *pred8_ptr, ptrdiff_t pred_stride, int bd);
@@ -1457,7 +1592,8 @@ void vpx_idct16x16_1_add_sse2(const tran_low_t *input, uint8_t *dest, int stride
void vpx_idct16x16_256_add_c(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct16x16_256_add_sse2(const tran_low_t *input, uint8_t *dest, int stride);
#define vpx_idct16x16_256_add vpx_idct16x16_256_add_sse2
void vpx_idct16x16_256_add_avx2(const tran_low_t *input, uint8_t *dest, int stride);
RTCD_EXTERN void (*vpx_idct16x16_256_add)(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct16x16_38_add_c(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct16x16_38_add_sse2(const tran_low_t *input, uint8_t *dest, int stride);
@@ -1465,11 +1601,13 @@ void vpx_idct16x16_38_add_sse2(const tran_low_t *input, uint8_t *dest, int strid
void vpx_idct32x32_1024_add_c(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct32x32_1024_add_sse2(const tran_low_t *input, uint8_t *dest, int stride);
#define vpx_idct32x32_1024_add vpx_idct32x32_1024_add_sse2
void vpx_idct32x32_1024_add_avx2(const tran_low_t *input, uint8_t *dest, int stride);
RTCD_EXTERN void (*vpx_idct32x32_1024_add)(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct32x32_135_add_c(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct32x32_135_add_sse2(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct32x32_135_add_ssse3(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct32x32_135_add_avx2(const tran_low_t *input, uint8_t *dest, int stride);
RTCD_EXTERN void (*vpx_idct32x32_135_add)(const tran_low_t *input, uint8_t *dest, int stride);
void vpx_idct32x32_1_add_c(const tran_low_t *input, uint8_t *dest, int stride);
@@ -1605,18 +1743,18 @@ void vpx_post_proc_down_and_across_mb_row_c(unsigned char *src, unsigned char *d
void vpx_post_proc_down_and_across_mb_row_sse2(unsigned char *src, unsigned char *dst, int src_pitch, int dst_pitch, int cols, unsigned char *flimits, int size);
#define vpx_post_proc_down_and_across_mb_row vpx_post_proc_down_and_across_mb_row_sse2
void vpx_quantize_b_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_avx(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vpx_quantize_b)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_sse2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_avx(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vpx_quantize_b)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_32x32_c(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_32x32_ssse3(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_32x32_avx(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_32x32_avx2(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
RTCD_EXTERN void (*vpx_quantize_b_32x32)(const tran_low_t *coeff_ptr, intptr_t n_coeffs, const int16_t *zbin_ptr, const int16_t *round_ptr, const int16_t *quant_ptr, const int16_t *quant_shift_ptr, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const int16_t *scan, const int16_t *iscan);
void vpx_quantize_b_32x32_c(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_32x32_ssse3(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_32x32_avx(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
void vpx_quantize_b_32x32_avx2(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
RTCD_EXTERN void (*vpx_quantize_b_32x32)(const tran_low_t *coeff_ptr, const struct macroblock_plane *const mb_plane, tran_low_t *qcoeff_ptr, tran_low_t *dqcoeff_ptr, const int16_t *dequant_ptr, uint16_t *eob_ptr, const struct ScanOrder *const scan_order);
unsigned int vpx_sad16x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad16x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1626,8 +1764,8 @@ unsigned int vpx_sad16x16_avg_c(const uint8_t *src_ptr, int src_stride, const ui
unsigned int vpx_sad16x16_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
#define vpx_sad16x16_avg vpx_sad16x16_avg_sse2
void vpx_sad16x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad16x16x4d vpx_sad16x16x4d_sse2
unsigned int vpx_sad16x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1638,8 +1776,8 @@ unsigned int vpx_sad16x32_avg_c(const uint8_t *src_ptr, int src_stride, const ui
unsigned int vpx_sad16x32_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
#define vpx_sad16x32_avg vpx_sad16x32_avg_sse2
void vpx_sad16x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad16x32x4d vpx_sad16x32x4d_sse2
unsigned int vpx_sad16x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1650,8 +1788,8 @@ unsigned int vpx_sad16x8_avg_c(const uint8_t *src_ptr, int src_stride, const uin
unsigned int vpx_sad16x8_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
#define vpx_sad16x8_avg vpx_sad16x8_avg_sse2
void vpx_sad16x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad16x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad16x8x4d vpx_sad16x8x4d_sse2
unsigned int vpx_sad32x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1664,8 +1802,8 @@ unsigned int vpx_sad32x16_avg_sse2(const uint8_t *src_ptr, int src_stride, const
unsigned int vpx_sad32x16_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad32x16_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad32x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad32x16x4d vpx_sad32x16x4d_sse2
unsigned int vpx_sad32x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1678,10 +1816,10 @@ unsigned int vpx_sad32x32_avg_sse2(const uint8_t *src_ptr, int src_stride, const
unsigned int vpx_sad32x32_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad32x32_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad32x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad32x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad32x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad32x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad32x64_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1693,8 +1831,8 @@ unsigned int vpx_sad32x64_avg_sse2(const uint8_t *src_ptr, int src_stride, const
unsigned int vpx_sad32x64_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad32x64_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad32x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad32x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad32x64x4d vpx_sad32x64x4d_sse2
unsigned int vpx_sad4x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1705,8 +1843,8 @@ unsigned int vpx_sad4x4_avg_c(const uint8_t *src_ptr, int src_stride, const uint
unsigned int vpx_sad4x4_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
#define vpx_sad4x4_avg vpx_sad4x4_avg_sse2
void vpx_sad4x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad4x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad4x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad4x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad4x4x4d vpx_sad4x4x4d_sse2
unsigned int vpx_sad4x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1717,8 +1855,8 @@ unsigned int vpx_sad4x8_avg_c(const uint8_t *src_ptr, int src_stride, const uint
unsigned int vpx_sad4x8_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
#define vpx_sad4x8_avg vpx_sad4x8_avg_sse2
void vpx_sad4x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad4x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad4x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad4x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad4x8x4d vpx_sad4x8x4d_sse2
unsigned int vpx_sad64x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1731,8 +1869,8 @@ unsigned int vpx_sad64x32_avg_sse2(const uint8_t *src_ptr, int src_stride, const
unsigned int vpx_sad64x32_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad64x32_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad64x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad64x32x4d vpx_sad64x32x4d_sse2
unsigned int vpx_sad64x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1745,11 +1883,11 @@ unsigned int vpx_sad64x64_avg_sse2(const uint8_t *src_ptr, int src_stride, const
unsigned int vpx_sad64x64_avg_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
RTCD_EXTERN unsigned int (*vpx_sad64x64_avg)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
void vpx_sad64x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x64x4d_avx512(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad64x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad64x64x4d_avx512(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad64x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad8x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad8x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1759,8 +1897,8 @@ unsigned int vpx_sad8x16_avg_c(const uint8_t *src_ptr, int src_stride, const uin
unsigned int vpx_sad8x16_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
#define vpx_sad8x16_avg vpx_sad8x16_avg_sse2
void vpx_sad8x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad8x16x4d vpx_sad8x16x4d_sse2
unsigned int vpx_sad8x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1771,8 +1909,8 @@ unsigned int vpx_sad8x4_avg_c(const uint8_t *src_ptr, int src_stride, const uint
unsigned int vpx_sad8x4_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
#define vpx_sad8x4_avg vpx_sad8x4_avg_sse2
void vpx_sad8x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x4x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad8x4x4d vpx_sad8x4x4d_sse2
unsigned int vpx_sad8x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
@@ -1783,10 +1921,120 @@ unsigned int vpx_sad8x8_avg_c(const uint8_t *src_ptr, int src_stride, const uint
unsigned int vpx_sad8x8_avg_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, const uint8_t *second_pred);
#define vpx_sad8x8_avg vpx_sad8x8_avg_sse2
void vpx_sad8x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t * const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad8x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad8x8x4d vpx_sad8x8x4d_sse2
unsigned int vpx_sad_skip_16x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_16x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_sad_skip_16x16 vpx_sad_skip_16x16_sse2
void vpx_sad_skip_16x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_16x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad_skip_16x16x4d vpx_sad_skip_16x16x4d_sse2
unsigned int vpx_sad_skip_16x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_16x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_sad_skip_16x32 vpx_sad_skip_16x32_sse2
void vpx_sad_skip_16x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_16x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad_skip_16x32x4d vpx_sad_skip_16x32x4d_sse2
unsigned int vpx_sad_skip_16x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_16x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_sad_skip_16x8 vpx_sad_skip_16x8_sse2
void vpx_sad_skip_16x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_16x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad_skip_16x8x4d vpx_sad_skip_16x8x4d_sse2
unsigned int vpx_sad_skip_32x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_32x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_32x16_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_32x16)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_32x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_32x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_32x16x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_32x16x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_32x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_32x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_32x32_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_32x32)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_32x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_32x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_32x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_32x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_32x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_32x64_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_32x64_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_32x64)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_32x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_32x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_32x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_32x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_4x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_sad_skip_4x4 vpx_sad_skip_4x4_c
void vpx_sad_skip_4x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad_skip_4x4x4d vpx_sad_skip_4x4x4d_c
unsigned int vpx_sad_skip_4x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_4x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_sad_skip_4x8 vpx_sad_skip_4x8_sse2
void vpx_sad_skip_4x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_4x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad_skip_4x8x4d vpx_sad_skip_4x8x4d_sse2
unsigned int vpx_sad_skip_64x32_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_64x32_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_64x32_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_64x32)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_64x32x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_64x32x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_64x32x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_64x32x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_64x64_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_64x64_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_64x64_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
RTCD_EXTERN unsigned int (*vpx_sad_skip_64x64)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
void vpx_sad_skip_64x64x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_64x64x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_64x64x4d_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
RTCD_EXTERN void (*vpx_sad_skip_64x64x4d)(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
unsigned int vpx_sad_skip_8x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_8x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_sad_skip_8x16 vpx_sad_skip_8x16_sse2
void vpx_sad_skip_8x16x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_8x16x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad_skip_8x16x4d vpx_sad_skip_8x16x4d_sse2
unsigned int vpx_sad_skip_8x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_sad_skip_8x4 vpx_sad_skip_8x4_c
void vpx_sad_skip_8x4x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad_skip_8x4x4d vpx_sad_skip_8x4x4d_c
unsigned int vpx_sad_skip_8x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
unsigned int vpx_sad_skip_8x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride);
#define vpx_sad_skip_8x8 vpx_sad_skip_8x8_sse2
void vpx_sad_skip_8x8x4d_c(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
void vpx_sad_skip_8x8x4d_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *const ref_array[4], int ref_stride, uint32_t sad_array[4]);
#define vpx_sad_skip_8x8x4d vpx_sad_skip_8x8x4d_sse2
int vpx_satd_c(const tran_low_t *coeff, int length);
int vpx_satd_sse2(const tran_low_t *coeff, int length);
int vpx_satd_avx2(const tran_low_t *coeff, int length);
@@ -1811,6 +2059,11 @@ void vpx_scaled_horiz_c(const uint8_t *src, ptrdiff_t src_stride, uint8_t *dst,
void vpx_scaled_vert_c(const uint8_t *src, ptrdiff_t src_stride, uint8_t *dst, ptrdiff_t dst_stride, const InterpKernel *filter, int x0_q4, int x_step_q4, int y0_q4, int y_step_q4, int w, int h);
#define vpx_scaled_vert vpx_scaled_vert_c
int64_t vpx_sse_c(const uint8_t *src, int src_stride, const uint8_t *ref, int ref_stride, int width, int height);
int64_t vpx_sse_sse4_1(const uint8_t *src, int src_stride, const uint8_t *ref, int ref_stride, int width, int height);
int64_t vpx_sse_avx2(const uint8_t *src, int src_stride, const uint8_t *ref, int ref_stride, int width, int height);
RTCD_EXTERN int64_t (*vpx_sse)(const uint8_t *src, int src_stride, const uint8_t *ref, int ref_stride, int width, int height);
uint32_t vpx_sub_pixel_avg_variance16x16_c(const uint8_t *src_ptr, int src_stride, int x_offset, int y_offset, const uint8_t *ref_ptr, int ref_stride, uint32_t *sse, const uint8_t *second_pred);
uint32_t vpx_sub_pixel_avg_variance16x16_sse2(const uint8_t *src_ptr, int src_stride, int x_offset, int y_offset, const uint8_t *ref_ptr, int ref_stride, uint32_t *sse, const uint8_t *second_pred);
uint32_t vpx_sub_pixel_avg_variance16x16_ssse3(const uint8_t *src_ptr, int src_stride, int x_offset, int y_offset, const uint8_t *ref_ptr, int ref_stride, uint32_t *sse, const uint8_t *second_pred);
@@ -2036,15 +2289,18 @@ RTCD_EXTERN unsigned int (*vpx_variance64x64)(const uint8_t *src_ptr, int src_st
unsigned int vpx_variance8x16_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
unsigned int vpx_variance8x16_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
#define vpx_variance8x16 vpx_variance8x16_sse2
unsigned int vpx_variance8x16_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
RTCD_EXTERN unsigned int (*vpx_variance8x16)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
unsigned int vpx_variance8x4_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
unsigned int vpx_variance8x4_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
#define vpx_variance8x4 vpx_variance8x4_sse2
unsigned int vpx_variance8x4_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
RTCD_EXTERN unsigned int (*vpx_variance8x4)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
unsigned int vpx_variance8x8_c(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
unsigned int vpx_variance8x8_sse2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
#define vpx_variance8x8 vpx_variance8x8_sse2
unsigned int vpx_variance8x8_avx2(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
RTCD_EXTERN unsigned int (*vpx_variance8x8)(const uint8_t *src_ptr, int src_stride, const uint8_t *ref_ptr, int ref_stride, unsigned int *sse);
void vpx_ve_predictor_4x4_c(uint8_t *dst, ptrdiff_t stride, const uint8_t *above, const uint8_t *left);
#define vpx_ve_predictor_4x4 vpx_ve_predictor_4x4_c
@@ -2063,6 +2319,8 @@ static void setup_rtcd_internal(void)
(void)flags;
vpx_comp_avg_pred = vpx_comp_avg_pred_sse2;
if (flags & HAS_AVX2) vpx_comp_avg_pred = vpx_comp_avg_pred_avx2;
vpx_convolve8 = vpx_convolve8_sse2;
if (flags & HAS_SSSE3) vpx_convolve8 = vpx_convolve8_ssse3;
if (flags & HAS_AVX2) vpx_convolve8 = vpx_convolve8_avx2;
@@ -2245,12 +2503,52 @@ static void setup_rtcd_internal(void)
if (flags & HAS_AVX2) vpx_highbd_sad64x64_avg = vpx_highbd_sad64x64_avg_avx2;
vpx_highbd_sad64x64x4d = vpx_highbd_sad64x64x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad64x64x4d = vpx_highbd_sad64x64x4d_avx2;
vpx_highbd_sad_skip_16x16 = vpx_highbd_sad_skip_16x16_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_16x16 = vpx_highbd_sad_skip_16x16_avx2;
vpx_highbd_sad_skip_16x16x4d = vpx_highbd_sad_skip_16x16x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_16x16x4d = vpx_highbd_sad_skip_16x16x4d_avx2;
vpx_highbd_sad_skip_16x32 = vpx_highbd_sad_skip_16x32_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_16x32 = vpx_highbd_sad_skip_16x32_avx2;
vpx_highbd_sad_skip_16x32x4d = vpx_highbd_sad_skip_16x32x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_16x32x4d = vpx_highbd_sad_skip_16x32x4d_avx2;
vpx_highbd_sad_skip_16x8 = vpx_highbd_sad_skip_16x8_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_16x8 = vpx_highbd_sad_skip_16x8_avx2;
vpx_highbd_sad_skip_16x8x4d = vpx_highbd_sad_skip_16x8x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_16x8x4d = vpx_highbd_sad_skip_16x8x4d_avx2;
vpx_highbd_sad_skip_32x16 = vpx_highbd_sad_skip_32x16_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_32x16 = vpx_highbd_sad_skip_32x16_avx2;
vpx_highbd_sad_skip_32x16x4d = vpx_highbd_sad_skip_32x16x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_32x16x4d = vpx_highbd_sad_skip_32x16x4d_avx2;
vpx_highbd_sad_skip_32x32 = vpx_highbd_sad_skip_32x32_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_32x32 = vpx_highbd_sad_skip_32x32_avx2;
vpx_highbd_sad_skip_32x32x4d = vpx_highbd_sad_skip_32x32x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_32x32x4d = vpx_highbd_sad_skip_32x32x4d_avx2;
vpx_highbd_sad_skip_32x64 = vpx_highbd_sad_skip_32x64_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_32x64 = vpx_highbd_sad_skip_32x64_avx2;
vpx_highbd_sad_skip_32x64x4d = vpx_highbd_sad_skip_32x64x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_32x64x4d = vpx_highbd_sad_skip_32x64x4d_avx2;
vpx_highbd_sad_skip_64x32 = vpx_highbd_sad_skip_64x32_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_64x32 = vpx_highbd_sad_skip_64x32_avx2;
vpx_highbd_sad_skip_64x32x4d = vpx_highbd_sad_skip_64x32x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_64x32x4d = vpx_highbd_sad_skip_64x32x4d_avx2;
vpx_highbd_sad_skip_64x64 = vpx_highbd_sad_skip_64x64_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_64x64 = vpx_highbd_sad_skip_64x64_avx2;
vpx_highbd_sad_skip_64x64x4d = vpx_highbd_sad_skip_64x64x4d_sse2;
if (flags & HAS_AVX2) vpx_highbd_sad_skip_64x64x4d = vpx_highbd_sad_skip_64x64x4d_avx2;
vpx_highbd_satd = vpx_highbd_satd_c;
if (flags & HAS_AVX2) vpx_highbd_satd = vpx_highbd_satd_avx2;
vpx_highbd_sse = vpx_highbd_sse_c;
if (flags & HAS_SSE4_1) vpx_highbd_sse = vpx_highbd_sse_sse4_1;
if (flags & HAS_AVX2) vpx_highbd_sse = vpx_highbd_sse_avx2;
vpx_highbd_subtract_block = vpx_highbd_subtract_block_c;
if (flags & HAS_AVX2) vpx_highbd_subtract_block = vpx_highbd_subtract_block_avx2;
vpx_idct16x16_256_add = vpx_idct16x16_256_add_sse2;
if (flags & HAS_AVX2) vpx_idct16x16_256_add = vpx_idct16x16_256_add_avx2;
vpx_idct32x32_1024_add = vpx_idct32x32_1024_add_sse2;
if (flags & HAS_AVX2) vpx_idct32x32_1024_add = vpx_idct32x32_1024_add_avx2;
vpx_idct32x32_135_add = vpx_idct32x32_135_add_sse2;
if (flags & HAS_SSSE3) vpx_idct32x32_135_add = vpx_idct32x32_135_add_ssse3;
if (flags & HAS_AVX2) vpx_idct32x32_135_add = vpx_idct32x32_135_add_avx2;
vpx_idct32x32_34_add = vpx_idct32x32_34_add_sse2;
if (flags & HAS_SSSE3) vpx_idct32x32_34_add = vpx_idct32x32_34_add_ssse3;
vpx_idct8x8_12_add = vpx_idct8x8_12_add_sse2;
@@ -2296,10 +2594,33 @@ static void setup_rtcd_internal(void)
vpx_sad64x64x4d = vpx_sad64x64x4d_sse2;
if (flags & HAS_AVX2) vpx_sad64x64x4d = vpx_sad64x64x4d_avx2;
if (flags & HAS_AVX512) vpx_sad64x64x4d = vpx_sad64x64x4d_avx512;
vpx_sad_skip_32x16 = vpx_sad_skip_32x16_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_32x16 = vpx_sad_skip_32x16_avx2;
vpx_sad_skip_32x16x4d = vpx_sad_skip_32x16x4d_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_32x16x4d = vpx_sad_skip_32x16x4d_avx2;
vpx_sad_skip_32x32 = vpx_sad_skip_32x32_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_32x32 = vpx_sad_skip_32x32_avx2;
vpx_sad_skip_32x32x4d = vpx_sad_skip_32x32x4d_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_32x32x4d = vpx_sad_skip_32x32x4d_avx2;
vpx_sad_skip_32x64 = vpx_sad_skip_32x64_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_32x64 = vpx_sad_skip_32x64_avx2;
vpx_sad_skip_32x64x4d = vpx_sad_skip_32x64x4d_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_32x64x4d = vpx_sad_skip_32x64x4d_avx2;
vpx_sad_skip_64x32 = vpx_sad_skip_64x32_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_64x32 = vpx_sad_skip_64x32_avx2;
vpx_sad_skip_64x32x4d = vpx_sad_skip_64x32x4d_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_64x32x4d = vpx_sad_skip_64x32x4d_avx2;
vpx_sad_skip_64x64 = vpx_sad_skip_64x64_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_64x64 = vpx_sad_skip_64x64_avx2;
vpx_sad_skip_64x64x4d = vpx_sad_skip_64x64x4d_sse2;
if (flags & HAS_AVX2) vpx_sad_skip_64x64x4d = vpx_sad_skip_64x64x4d_avx2;
vpx_satd = vpx_satd_sse2;
if (flags & HAS_AVX2) vpx_satd = vpx_satd_avx2;
vpx_scaled_2d = vpx_scaled_2d_c;
if (flags & HAS_SSSE3) vpx_scaled_2d = vpx_scaled_2d_ssse3;
vpx_sse = vpx_sse_c;
if (flags & HAS_SSE4_1) vpx_sse = vpx_sse_sse4_1;
if (flags & HAS_AVX2) vpx_sse = vpx_sse_avx2;
vpx_sub_pixel_avg_variance16x16 = vpx_sub_pixel_avg_variance16x16_sse2;
if (flags & HAS_SSSE3) vpx_sub_pixel_avg_variance16x16 = vpx_sub_pixel_avg_variance16x16_ssse3;
vpx_sub_pixel_avg_variance16x32 = vpx_sub_pixel_avg_variance16x32_sse2;
@@ -2374,6 +2695,12 @@ static void setup_rtcd_internal(void)
if (flags & HAS_AVX2) vpx_variance64x32 = vpx_variance64x32_avx2;
vpx_variance64x64 = vpx_variance64x64_sse2;
if (flags & HAS_AVX2) vpx_variance64x64 = vpx_variance64x64_avx2;
vpx_variance8x16 = vpx_variance8x16_sse2;
if (flags & HAS_AVX2) vpx_variance8x16 = vpx_variance8x16_avx2;
vpx_variance8x4 = vpx_variance8x4_sse2;
if (flags & HAS_AVX2) vpx_variance8x4 = vpx_variance8x4_avx2;
vpx_variance8x8 = vpx_variance8x8_sse2;
if (flags & HAS_AVX2) vpx_variance8x8 = vpx_variance8x8_avx2;
}
#endif
@@ -2381,4 +2708,4 @@ static void setup_rtcd_internal(void)
} // extern "C"
#endif
#endif
#endif // VPX_DSP_RTCD_H_
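The `setup_rtcd_internal` hunks above all follow one pattern: each function pointer is first assigned a baseline implementation, then overwritten for every CPU capability bit found in `flags`, so the last matching (most capable) variant wins. A minimal sketch of that pattern follows. All `sketch_*` and `SKETCH_*` names are hypothetical, and the "SIMD" variants are plain stand-ins that forward to the C version; none of this is the generated header itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical capability bits; the real HAS_* values live in libvpx. */
#define SKETCH_HAS_SSE2 0x1
#define SKETCH_HAS_AVX2 0x2

typedef unsigned int (*sketch_sad_fn)(const uint8_t *src_ptr, int src_stride,
                                      const uint8_t *ref_ptr, int ref_stride);

/* Portable reference: sum of absolute differences over an 8x8 block. */
static unsigned int sketch_sad8x8_c(const uint8_t *src_ptr, int src_stride,
                                    const uint8_t *ref_ptr, int ref_stride) {
  unsigned int sad = 0;
  for (int y = 0; y < 8; ++y) {
    for (int x = 0; x < 8; ++x) {
      sad += (unsigned int)abs(src_ptr[x] - ref_ptr[x]);
    }
    src_ptr += src_stride;
    ref_ptr += ref_stride;
  }
  return sad;
}

/* Stand-ins for optimized variants (assumption: same result as the C code). */
static unsigned int sketch_sad8x8_sse2(const uint8_t *src_ptr, int src_stride,
                                       const uint8_t *ref_ptr, int ref_stride) {
  return sketch_sad8x8_c(src_ptr, src_stride, ref_ptr, ref_stride);
}

static unsigned int sketch_sad8x8_avx2(const uint8_t *src_ptr, int src_stride,
                                       const uint8_t *ref_ptr, int ref_stride) {
  return sketch_sad8x8_c(src_ptr, src_stride, ref_ptr, ref_stride);
}

/* The dispatch pointer that callers use after setup. */
static sketch_sad_fn sketch_sad8x8;

/* Baseline first, then upgrade in order of increasing capability. */
static void sketch_setup_rtcd(int flags) {
  sketch_sad8x8 = sketch_sad8x8_c;
  if (flags & SKETCH_HAS_SSE2) sketch_sad8x8 = sketch_sad8x8_sse2;
  if (flags & SKETCH_HAS_AVX2) sketch_sad8x8 = sketch_sad8x8_avx2;
  assert(sketch_sad8x8 != NULL);
}
```

Functions that have a single usable variant on the target skip the pointer and are bound at compile time with a `#define`, as the `vpx_sad_skip_4x4` entry above does.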
+11 -1
@@ -1,3 +1,13 @@
/*
* Copyright (c) 2025 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef VPX_SCALE_RTCD_H_
#define VPX_SCALE_RTCD_H_
@@ -70,4 +80,4 @@ static void setup_rtcd_internal(void)
} // extern "C"
#endif
#endif
#endif // VPX_SCALE_RTCD_H_
+1 -4
@@ -8,13 +8,13 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include "args.h"
#include "vpx/vpx_integer.h"
#include "vpx_ports/msvc.h"
#if defined(__GNUC__)
__attribute__((noreturn)) extern void die(const char *fmt, ...);
@@ -135,7 +135,6 @@ unsigned int arg_parse_uint(const struct arg *arg) {
}
die("Option %s: Invalid character '%c'\n", arg->name, *endptr);
return 0;
}
int arg_parse_int(const struct arg *arg) {
@@ -152,7 +151,6 @@ int arg_parse_int(const struct arg *arg) {
}
die("Option %s: Invalid character '%c'\n", arg->name, *endptr);
return 0;
}
struct vpx_rational {
@@ -209,7 +207,6 @@ int arg_parse_enum(const struct arg *arg) {
if (!strcmp(arg->val, listptr->name)) return listptr->val;
die("Option %s: Invalid value '%s'\n", arg->name, arg->val);
return 0;
}
int arg_parse_enum_or_int(const struct arg *arg) {
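The three small hunks above each touch a `return 0;` that sits directly after a `die()` call. Since `die()` is declared `noreturn` under GCC-compatible compilers, control never reaches such a statement. The sketch below illustrates the idea under stated assumptions: `sketch_die` and `sketch_parse_uint` are hypothetical names, and the parsing logic is a simplified illustration rather than the code in args.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

#if defined(__GNUC__)
/* Tells the compiler that this function never returns to its caller. */
__attribute__((noreturn)) static void sketch_die(const char *fmt, ...);
#endif

/* Print a formatted message to stderr and terminate the process. */
static void sketch_die(const char *fmt, ...) {
  va_list ap;
  va_start(ap, fmt);
  vfprintf(stderr, fmt, ap);
  va_end(ap);
  exit(EXIT_FAILURE);
}

/* Parse a base-10 unsigned value; any failure ends in sketch_die(). */
static unsigned int sketch_parse_uint(const char *name, const char *val) {
  char *endptr;
  unsigned long raw;
  assert(name != NULL && val != NULL);
  raw = strtoul(val, &endptr, 10);
  if (val[0] != '\0' && endptr[0] == '\0') {
    if (raw <= UINT_MAX) return (unsigned int)raw;
    sketch_die("Option %s: Value %lu out of range\n", name, raw);
  }
  sketch_die("Option %s: Invalid character '%c'\n", name, *endptr);
  /* No trailing return: with the noreturn attribute the compiler knows
   * this point is unreachable. */
}
```

On compilers that do not understand the attribute, the function still behaves the same at run time, but the compiler may warn about a missing return value.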
+6 -7
@@ -15,13 +15,9 @@ ifdef NDK_ROOT
# In an Android project place a libvpx checkout in the jni directory.
# Run the configure script from the jni directory. Base libvpx
# encoder/decoder configuration will look similar to:
# ./libvpx/configure --target=armv7-android-gcc --disable-examples \
# ./libvpx/configure --target=arm64-android-gcc --disable-examples \
# --enable-external-build
#
# When targeting Android, realtime-only is enabled by default. This can
# be overridden by adding the command line flag:
# --disable-realtime-only
#
# This will create .mk files that contain variables that contain the
# source files to compile.
#
@@ -38,11 +34,14 @@ ifdef NDK_ROOT
# but the resulting library *must* be run on devices supporting all of the
# enabled extensions. They can be disabled individually with
# --disable-{sse2, sse3, ssse3, sse4_1, avx, avx2, avx512}
# --disable-neon[-asm]
# --disable-neon{, -asm, -neon-dotprod, -neon-i8mm}
# --disable-sve
# --disable-{dspr2, msa}
#
# Running ndk-build will build libvpx and include it in your project.
# Running ndk-build will build libvpx and include it in your project. Set
# APP_ABI to match the --target passed to configure:
# https://developer.android.com/ndk/guides/application_mk#app_abi.
#
CONFIG_DIR := $(LOCAL_PATH)/
+25 -1
@@ -143,6 +143,16 @@ $(BUILD_PFX)%_avx2.c.o: CFLAGS += -mavx2
$(BUILD_PFX)%_avx512.c.d: CFLAGS += -mavx512f -mavx512cd -mavx512bw -mavx512dq -mavx512vl
$(BUILD_PFX)%_avx512.c.o: CFLAGS += -mavx512f -mavx512cd -mavx512bw -mavx512dq -mavx512vl
# AARCH64
$(BUILD_PFX)%_neon_dotprod.c.d: CFLAGS += -march=armv8.2-a+dotprod
$(BUILD_PFX)%_neon_dotprod.c.o: CFLAGS += -march=armv8.2-a+dotprod
$(BUILD_PFX)%_neon_i8mm.c.d: CFLAGS += -march=armv8.2-a+dotprod+i8mm
$(BUILD_PFX)%_neon_i8mm.c.o: CFLAGS += -march=armv8.2-a+dotprod+i8mm
$(BUILD_PFX)%_sve.c.d: CFLAGS += -march=armv8.2-a+dotprod+i8mm+sve
$(BUILD_PFX)%_sve.c.o: CFLAGS += -march=armv8.2-a+dotprod+i8mm+sve
$(BUILD_PFX)%_sve2.c.d: CFLAGS += -march=armv9-a+sve2
$(BUILD_PFX)%_sve2.c.o: CFLAGS += -march=armv9-a+sve2
# POWER
$(BUILD_PFX)%_vsx.c.d: CFLAGS += -maltivec -mvsx
$(BUILD_PFX)%_vsx.c.o: CFLAGS += -maltivec -mvsx
@@ -304,6 +314,19 @@ $(1):
$(qexec)$$(AR) $$(ARFLAGS) $$@ $$^
endef
# Don't use -Wl,-z,defs with Clang's sanitizers.
#
# Clang's AddressSanitizer documentation says "When linking shared libraries,
# the AddressSanitizer run-time is not linked, so -Wl,-z,defs may cause link
# errors (don't use it with AddressSanitizer)." See
# https://clang.llvm.org/docs/AddressSanitizer.html#usage.
NO_UNDEFINED := -Wl,-z,defs
ifeq ($(findstring clang,$(CC)),clang)
ifneq ($(filter -fsanitize=%,$(LDFLAGS)),)
NO_UNDEFINED :=
endif
endif
define so_template
# Not using a pattern rule here because we don't want to generate empty
# archives when they are listed as a dependency in files not responsible
@@ -313,7 +336,8 @@ define so_template
$(1):
$(if $(quiet),@echo " [LD] $$@")
$(qexec)$$(LD) -shared $$(LDFLAGS) \
-Wl,--no-undefined -Wl,-soname,$$(SONAME) \
$(NO_UNDEFINED) \
-Wl,-soname,$$(SONAME) \
-Wl,--version-script,$$(EXPORTS_FILE) -o $$@ \
$$(filter %.o,$$^) $$(extralibs)
endef
+129 -24
@@ -74,6 +74,8 @@ Build options:
--cpu=CPU optimize for a specific cpu rather than a family
--extra-cflags=ECFLAGS add ECFLAGS to CFLAGS [$CFLAGS]
--extra-cxxflags=ECXXFLAGS add ECXXFLAGS to CXXFLAGS [$CXXFLAGS]
--use-profile=PROFILE_FILE
Use PROFILE_FILE for PGO
${toggle_extra_warnings} emit harmless warnings (always non-fatal)
${toggle_werror} treat warnings as errors, if possible
(not available with all compilers)
@@ -81,6 +83,7 @@ Build options:
${toggle_pic} turn on/off Position Independent Code
${toggle_ccache} turn on/off compiler cache
${toggle_debug} enable/disable debug mode
${toggle_profile} enable/disable profiling
${toggle_gprof} enable/disable gprof profiling instrumentation
${toggle_gcov} enable/disable gcov coverage instrumentation
${toggle_thumb} enable/disable building arm assembly in thumb mode
@@ -429,6 +432,42 @@ check_gcc_machine_options() {
fi
}
check_neon_sve_bridge_compiles() {
if enabled sve; then
check_cc -march=armv8.2-a+dotprod+i8mm+sve <<EOF
#ifndef __ARM_NEON_SVE_BRIDGE
#error 1
#endif
#include <arm_sve.h>
#include <arm_neon_sve_bridge.h>
EOF
compile_result=$?
if [ ${compile_result} -eq 0 ]; then
# Check whether the compiler can compile SVE functions that require
# backup/restore of SVE registers according to AAPCS. Clang for Windows
# used to fail this, see
# https://github.com/llvm/llvm-project/issues/80009.
check_cc -march=armv8.2-a+dotprod+i8mm+sve <<EOF
#include <arm_sve.h>
void other(void);
svfloat32_t func(svfloat32_t a) {
other();
return a;
}
EOF
compile_result=$?
fi
if [ ${compile_result} -ne 0 ]; then
log_echo " disabling sve: arm_neon_sve_bridge.h not supported by compiler"
log_echo " disabling sve2: arm_neon_sve_bridge.h not supported by compiler"
disable_feature sve
disable_feature sve2
RTCD_OPTIONS="${RTCD_OPTIONS}--disable-sve --disable-sve2 "
fi
fi
}
check_gcc_avx512_compiles() {
if disabled gcc; then
return
@@ -509,7 +548,6 @@ AR=${AR}
LD=${LD}
AS=${AS}
STRIP=${STRIP}
NM=${NM}
CFLAGS = ${CFLAGS}
CXXFLAGS = ${CXXFLAGS}
@@ -521,6 +559,7 @@ AS_SFX = ${AS_SFX:-.asm}
EXE_SFX = ${EXE_SFX}
VCPROJ_SFX = ${VCPROJ_SFX}
RTCD_OPTIONS = ${RTCD_OPTIONS}
LIBWEBM_CXXFLAGS = ${LIBWEBM_CXXFLAGS}
LIBYUV_CXXFLAGS = ${LIBYUV_CXXFLAGS}
EOF
@@ -610,6 +649,9 @@ process_common_cmdline() {
--extra-cxxflags=*)
extra_cxxflags="${optval}"
;;
--use-profile=*)
pgo_file=${optval}
;;
--enable-?*|--disable-?*)
eval `echo "$opt" | sed 's/--/action=/;s/-/ option=/;s/-/_/g'`
if is_in ${option} ${ARCH_EXT_LIST}; then
@@ -706,7 +748,6 @@ setup_gnu_toolchain() {
LD=${LD:-${CROSS}${link_with_cc:-ld}}
AS=${AS:-${CROSS}as}
STRIP=${STRIP:-${CROSS}strip}
NM=${NM:-${CROSS}nm}
AS_SFX=.S
EXE_SFX=
}
@@ -791,7 +832,7 @@ process_common_toolchain() {
tgt_isa=x86_64
tgt_os=`echo $gcctarget | sed 's/.*\(darwin1[0-9]\).*/\1/'`
;;
*darwin2[0-2]*)
*darwin2[0-4]*)
tgt_isa=`uname -m`
tgt_os=`echo $gcctarget | sed 's/.*\(darwin2[0-9]\).*/\1/'`
;;
@@ -842,6 +883,10 @@ process_common_toolchain() {
# Enable the architecture family
case ${tgt_isa} in
arm64 | armv8)
enable_feature arm
enable_feature aarch64
;;
arm*)
enable_feature arm
;;
@@ -858,8 +903,14 @@ process_common_toolchain() {
;;
esac
# PIC is probably what we want when building shared libs
# Position independent code (PIC) is probably what we want when building
# shared libs or position independent executable (PIE) targets.
enabled shared && soft_enable pic
check_cpp << EOF || soft_enable pic
#if !(__pie__ || __PIE__)
#error Neither __pie__ or __PIE__ are set
#endif
EOF
# Minimum iOS version for all target platforms (darwin and iphonesimulator).
# Shared library framework builds are only possible on iOS 8 and later.
@@ -940,7 +991,7 @@ process_common_toolchain() {
add_cflags "-mmacosx-version-min=10.15"
add_ldflags "-mmacosx-version-min=10.15"
;;
*-darwin2[0-2]-*)
*-darwin2[0-4]-*)
add_cflags "-arch ${toolchain%%-*}"
add_ldflags "-arch ${toolchain%%-*}"
;;
@@ -965,27 +1016,30 @@ process_common_toolchain() {
;;
esac
# Process ARM architecture variants
# Process architecture variants
case ${toolchain} in
arm*)
# on arm, isa versions are supersets
case ${tgt_isa} in
arm64|armv8)
soft_enable neon
case ${toolchain} in
armv7*-darwin*)
# Runtime cpu detection is not defined for these targets.
enabled runtime_cpu_detect && disable_feature runtime_cpu_detect
;;
armv7|armv7s)
soft_enable neon
# Only enable neon_asm when neon is also enabled.
enabled neon && soft_enable neon_asm
# If someone tries to force it through, die.
if disabled neon && enabled neon_asm; then
die "Disabling neon while keeping neon-asm is not supported"
fi
*)
soft_enable runtime_cpu_detect
;;
esac
asm_conversion_cmd="cat"
if [ ${tgt_isa} = "armv7" ] || [ ${tgt_isa} = "armv7s" ]; then
soft_enable neon
# Only enable neon_asm when neon is also enabled.
enabled neon && soft_enable neon_asm
# If someone tries to force it through, die.
if disabled neon && enabled neon_asm; then
die "Disabling neon while keeping neon-asm is not supported"
fi
fi
asm_conversion_cmd="cat"
case ${tgt_cc} in
gcc)
link_with_cc=gcc
@@ -1066,8 +1120,11 @@ EOF
enable_feature win_arm64_neon_h_workaround
else
# If a probe is not possible, assume this is the pure Windows
# SDK and so the workaround is necessary.
enable_feature win_arm64_neon_h_workaround
# SDK and so the workaround is necessary when using Visual
# Studio < 2019.
if [ ${tgt_cc##vs} -lt 16 ]; then
enable_feature win_arm64_neon_h_workaround
fi
fi
fi
fi
@@ -1078,7 +1135,6 @@ EOF
AS=armasm
LD="${source_path}/build/make/armlink_adapter.sh"
STRIP=arm-none-linux-gnueabi-strip
NM=arm-none-linux-gnueabi-nm
tune_cflags="--cpu="
tune_asflags="--cpu="
if [ -z "${tune_cpu}" ]; then
@@ -1115,6 +1171,14 @@ EOF
echo "See build/make/Android.mk for details."
check_add_ldflags -static
soft_enable unit_tests
case "$AS" in
*clang)
# The GNU Assembler was removed in the r24 version of the NDK.
# clang's internal assembler works, but `-c` is necessary to
# avoid linking.
add_asflags -c
;;
esac
;;
darwin)
@@ -1125,8 +1189,6 @@ EOF
AR="$(${XCRUN_FIND} ar)"
AS="$(${XCRUN_FIND} as)"
STRIP="$(${XCRUN_FIND} strip)"
NM="$(${XCRUN_FIND} nm)"
RANLIB="$(${XCRUN_FIND} ranlib)"
AS_SFX=.S
LD="${CXX:-$(${XCRUN_FIND} ld)}"
@@ -1201,6 +1263,38 @@ EOF
fi
;;
esac
# AArch64 ISA extensions are treated as supersets.
if [ ${tgt_isa} = "arm64" ] || [ ${tgt_isa} = "armv8" ]; then
aarch64_arch_flag_neon="arch=armv8-a"
aarch64_arch_flag_neon_dotprod="arch=armv8.2-a+dotprod"
aarch64_arch_flag_neon_i8mm="arch=armv8.2-a+dotprod+i8mm"
aarch64_arch_flag_sve="arch=armv8.2-a+dotprod+i8mm+sve"
aarch64_arch_flag_sve2="arch=armv9-a+sve2"
for ext in ${ARCH_EXT_LIST_AARCH64}; do
if [ "$disable_exts" = "yes" ]; then
RTCD_OPTIONS="${RTCD_OPTIONS}--disable-${ext} "
soft_disable $ext
else
# Check the compiler supports the -march flag for the extension.
# This needs to happen after toolchain/OS inspection so we handle
# $CROSS etc correctly when checking for flags, else these will
# always fail.
flag="$(eval echo \$"aarch64_arch_flag_${ext}")"
check_gcc_machine_option "${flag}" "${ext}"
if ! enabled $ext; then
# Disable higher order extensions to simplify dependencies.
disable_exts="yes"
RTCD_OPTIONS="${RTCD_OPTIONS}--disable-${ext} "
soft_disable $ext
fi
fi
done
if enabled sve; then
check_neon_sve_bridge_compiles
fi
fi
;;
mips*)
link_with_cc=gcc
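Note on the AArch64 extension loop in the hunk above: once one extension fails its compiler probe, every later entry in ARCH_EXT_LIST_AARCH64 is disabled as well. A minimal C sketch of that cascade follows. It is an illustration only: the function name and the array-of-flags representation are assumptions and are not part of the configure script.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the configure loop: enabled[i] is nonzero on entry
 * when the compiler probe for extension i succeeded. Extensions are ordered
 * from lowest to highest (neon, neon_dotprod, neon_i8mm, sve, sve2). After
 * the first unsupported entry, all later entries are forced off. */
static void disable_after_first_unsupported(int *enabled, size_t n) {
  assert(enabled != NULL || n == 0);
  int disable_rest = 0;
  for (size_t i = 0; i < n; ++i) {
    if (disable_rest) {
      enabled[i] = 0; /* higher-order extension: disabled unconditionally */
    } else if (!enabled[i]) {
      disable_rest = 1; /* first failure: everything after it is disabled */
    }
  }
}
```

With probe results {1, 1, 0, 1, 1} the array ends up as {1, 1, 0, 0, 0}, which mirrors how a failed neon_i8mm probe would also switch off sve and sve2.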
@@ -1457,6 +1551,14 @@ EOF
;;
esac
# Enable PGO
if [ -n "${pgo_file}" ]; then
check_add_cflags -fprofile-use=${pgo_file} || \
die "-fprofile-use is not supported by compiler"
check_add_ldflags -fprofile-use=${pgo_file} || \
die "-fprofile-use is not supported by linker"
fi
# Try to enable CPU specific tuning
if [ -n "${tune_cpu}" ]; then
if [ -n "${tune_cflags}" ]; then
@@ -1477,6 +1579,9 @@ EOF
else
check_add_cflags -DNDEBUG
fi
enabled profile &&
check_add_cflags -fprofile-generate &&
check_add_ldflags -fprofile-generate
enabled gprof && check_add_cflags -pg && check_add_ldflags -pg
enabled gcov &&
+30 -12
@@ -141,7 +141,17 @@ for opt in "$@"; do
case "$opt" in
--help|-h) show_help
;;
--target=*) target="${optval}"
--target=*)
target="${optval}"
platform_toolset=$(echo ${target} | awk 'BEGIN{FS="-"}{print $4}')
case "$platform_toolset" in
clangcl) platform_toolset="ClangCl"
;;
"")
;;
*) die Unrecognized Visual Studio Platform Toolset in $opt
;;
esac
;;
--out=*) outfile="$optval"
;;
@@ -259,6 +269,10 @@ case "$target" in
;;
arm64*)
platforms[0]="ARM64"
# As of Visual Studio 2022 17.5.5, clang-cl does not support ARM64EC.
if [ "$vs_ver" -ge 17 -a "$platform_toolset" != "ClangCl" ]; then
platforms[1]="ARM64EC"
fi
asm_Debug_cmdline="armasm64 -nologo -oldit &quot;%(FullPath)&quot;"
asm_Release_cmdline="armasm64 -nologo -oldit &quot;%(FullPath)&quot;"
;;
@@ -335,17 +349,21 @@ generate_vcxproj() {
else
tag_content ConfigurationType StaticLibrary
fi
if [ "$vs_ver" = "14" ]; then
tag_content PlatformToolset v140
fi
if [ "$vs_ver" = "15" ]; then
tag_content PlatformToolset v141
fi
if [ "$vs_ver" = "16" ]; then
tag_content PlatformToolset v142
fi
if [ "$vs_ver" = "17" ]; then
tag_content PlatformToolset v143
if [ -n "$platform_toolset" ]; then
tag_content PlatformToolset "$platform_toolset"
else
if [ "$vs_ver" = "14" ]; then
tag_content PlatformToolset v140
fi
if [ "$vs_ver" = "15" ]; then
tag_content PlatformToolset v141
fi
if [ "$vs_ver" = "16" ]; then
tag_content PlatformToolset v142
fi
if [ "$vs_ver" = "17" ]; then
tag_content PlatformToolset v143
fi
fi
tag_content CharacterSet Unicode
if [ "$config" = "Release" ]; then
+19 -2
@@ -73,6 +73,10 @@ sub vpx_config($) {
}
sub specialize {
if (@_ <= 1) {
die "'specialize' must be called with a function name and at least one ",
"architecture ('C' is implied): \n@_\n";
}
my $fn=$_[0];
shift;
foreach my $opt (@_) {
@@ -208,7 +212,19 @@ sub filter {
#
sub common_top() {
my $include_guard = uc($opts{sym})."_H_";
my @time = localtime;
my $year = $time[5] + 1900;
print <<EOF;
/*
* Copyright (c) ${year} The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
// This file is generated. Do not edit.
#ifndef ${include_guard}
#define ${include_guard}
@@ -238,13 +254,14 @@ EOF
}
sub common_bottom() {
my $include_guard = uc($opts{sym})."_H_";
print <<EOF;
#ifdef __cplusplus
} // extern "C"
#endif
#endif
#endif // ${include_guard}
EOF
}
@@ -487,7 +504,7 @@ if ($opts{arch} eq 'x86') {
@ALL_ARCHS = filter(qw/neon_asm neon/);
arm;
} elsif ($opts{arch} eq 'armv8' || $opts{arch} eq 'arm64' ) {
@ALL_ARCHS = filter(qw/neon/);
@ALL_ARCHS = filter(qw/neon neon_dotprod neon_i8mm sve sve2/);
@REQUIRES = filter(qw/neon/);
&require(@REQUIRES);
arm;
+3
@@ -61,6 +61,8 @@ if [ ${bare} ]; then
else
cat<<EOF>$$.tmp
// This file is generated. Do not edit.
#ifndef VPX_VERSION_H_
#define VPX_VERSION_H_
#define VERSION_MAJOR $major_version
#define VERSION_MINOR $minor_version
#define VERSION_PATCH $patch_version
@@ -68,6 +70,7 @@ else
#define VERSION_PACKED ((VERSION_MAJOR<<16)|(VERSION_MINOR<<8)|(VERSION_PATCH))
#define ${id}_NOSP "${version_str}"
#define ${id} " ${version_str}"
#endif // VPX_VERSION_H_
EOF
fi
if [ -n "$out_file" ]; then
+33 -9
@@ -102,11 +102,15 @@ all_platforms="${all_platforms} arm64-darwin-gcc"
all_platforms="${all_platforms} arm64-darwin20-gcc"
all_platforms="${all_platforms} arm64-darwin21-gcc"
all_platforms="${all_platforms} arm64-darwin22-gcc"
all_platforms="${all_platforms} arm64-darwin23-gcc"
all_platforms="${all_platforms} arm64-darwin24-gcc"
all_platforms="${all_platforms} arm64-linux-gcc"
all_platforms="${all_platforms} arm64-win64-gcc"
all_platforms="${all_platforms} arm64-win64-vs15"
all_platforms="${all_platforms} arm64-win64-vs16"
all_platforms="${all_platforms} arm64-win64-vs16-clangcl"
all_platforms="${all_platforms} arm64-win64-vs17"
all_platforms="${all_platforms} arm64-win64-vs17-clangcl"
all_platforms="${all_platforms} armv7-android-gcc" #neon Cortex-A8
all_platforms="${all_platforms} armv7-darwin-gcc" #neon Cortex-A8
all_platforms="${all_platforms} armv7-linux-rvct" #neon Cortex-A8
@@ -163,6 +167,8 @@ all_platforms="${all_platforms} x86_64-darwin19-gcc"
all_platforms="${all_platforms} x86_64-darwin20-gcc"
all_platforms="${all_platforms} x86_64-darwin21-gcc"
all_platforms="${all_platforms} x86_64-darwin22-gcc"
all_platforms="${all_platforms} x86_64-darwin23-gcc"
all_platforms="${all_platforms} x86_64-darwin24-gcc"
all_platforms="${all_platforms} x86_64-iphonesimulator-gcc"
all_platforms="${all_platforms} x86_64-linux-gcc"
all_platforms="${all_platforms} x86_64-linux-icc"
@@ -243,12 +249,22 @@ CODEC_FAMILIES="
ARCH_LIST="
arm
aarch64
mips
x86
x86_64
ppc
loongarch
"
ARCH_EXT_LIST_AARCH64="
neon
neon_dotprod
neon_i8mm
sve
sve2
"
ARCH_EXT_LIST_X86="
mmx
sse
@@ -268,8 +284,8 @@ ARCH_EXT_LIST_LOONGSON="
"
ARCH_EXT_LIST="
neon
neon_asm
${ARCH_EXT_LIST_AARCH64}
mips32
dspr2
@@ -293,6 +309,7 @@ EXPERIMENT_LIST="
emulate_hardware
non_greedy_mv
rate_ctrl
collect_component_timing
"
CONFIG_LIST="
dependency_tracking
@@ -342,7 +359,6 @@ CONFIG_LIST="
multi_res_encoding
temporal_denoising
vp9_temporal_denoising
consistent_recode
coefficient_range_checking
vp9_highbitdepth
better_hw_compatibility
@@ -363,6 +379,7 @@ CMDLINE_SELECT="
install_libs
install_srcs
debug
profile
gprof
gcov
pic
@@ -406,7 +423,6 @@ CMDLINE_SELECT="
multi_res_encoding
temporal_denoising
vp9_temporal_denoising
consistent_recode
coefficient_range_checking
better_hw_compatibility
vp9_highbitdepth
@@ -633,7 +649,6 @@ process_toolchain() {
if enabled gcc; then
enabled werror && check_add_cflags -Werror
check_add_cflags -Wall
check_add_cflags -Wdeclaration-after-statement
check_add_cflags -Wdisabled-optimization
check_add_cflags -Wextra-semi
check_add_cflags -Wextra-semi-stmt
@@ -647,8 +662,10 @@ process_toolchain() {
check_add_cflags -Wimplicit-function-declaration
check_add_cflags -Wmissing-declarations
check_add_cflags -Wmissing-prototypes
check_add_cflags -Wshadow
check_add_cflags -Wstrict-prototypes
check_add_cflags -Wuninitialized
check_add_cflags -Wunreachable-code-loop-increment
check_add_cflags -Wunreachable-code-aggressive
check_add_cflags -Wunused
check_add_cflags -Wextra
# check_add_cflags also adds to cxxflags. gtest does not do well with
@@ -659,13 +676,16 @@ process_toolchain() {
if enabled mips || [ -z "${INLINE}" ]; then
enabled extra_warnings || check_add_cflags -Wno-unused-function
fi
# Enforce c89 for c files. Don't be too strict about it though. Allow
# gnu extensions like "//" for comments.
check_cflags -std=gnu89 && add_cflags_only -std=gnu89
# Enforce C99 for C files. Allow GNU extensions.
check_cflags -std=gnu99 && add_cflags_only -std=gnu99
# Avoid this warning for third_party C++ sources. Some reorganization
# would be needed to apply this only to test/*.cc.
check_cflags -Wshorten-64-to-32 && add_cflags_only -Wshorten-64-to-32
# Do not allow implicit vector type conversions on Clang builds (this
# is already the default on GCC builds).
check_add_cflags -flax-vector-conversions=none
# Quiet gcc 6 vs 7 abi warnings:
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77728
if enabled arm; then
@@ -676,14 +696,18 @@ process_toolchain() {
check_add_cxxflags -Wc++14-extensions
check_add_cxxflags -Wc++17-extensions
check_add_cxxflags -Wc++20-extensions
check_add_cxxflags -Wnon-virtual-dtor
# disable some warnings specific to libyuv.
# disable some warnings specific to libyuv / libwebm.
check_cxxflags -Wno-missing-declarations \
&& LIBYUV_CXXFLAGS="${LIBYUV_CXXFLAGS} -Wno-missing-declarations"
check_cxxflags -Wno-missing-prototypes \
&& LIBYUV_CXXFLAGS="${LIBYUV_CXXFLAGS} -Wno-missing-prototypes"
check_cxxflags -Wno-pass-failed \
&& LIBYUV_CXXFLAGS="${LIBYUV_CXXFLAGS} -Wno-pass-failed"
check_cxxflags -Wno-shadow \
&& LIBWEBM_CXXFLAGS="${LIBWEBM_CXXFLAGS} -Wno-shadow" \
&& LIBYUV_CXXFLAGS="${LIBYUV_CXXFLAGS} -Wno-shadow"
check_cxxflags -Wno-unused-parameter \
&& LIBYUV_CXXFLAGS="${LIBYUV_CXXFLAGS} -Wno-unused-parameter"
fi
+1 -20
@@ -57,6 +57,7 @@ LIBWEBM_PARSER_SRCS = third_party/libwebm/mkvparser/mkvparser.cc \
# Add compile flags and include path for libwebm sources.
ifeq ($(CONFIG_WEBM_IO),yes)
CXXFLAGS += -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS
$(BUILD_PFX)third_party/libwebm/%.cc.o: CXXFLAGS += $(LIBWEBM_CXXFLAGS)
INC_PATH-yes += $(SRC_PATH_BARE)/third_party/libwebm
endif
@@ -68,7 +69,6 @@ vpxdec.SRCS += md5_utils.c md5_utils.h
vpxdec.SRCS += vpx_ports/compiler_attributes.h
vpxdec.SRCS += vpx_ports/mem_ops.h
vpxdec.SRCS += vpx_ports/mem_ops_aligned.h
vpxdec.SRCS += vpx_ports/msvc.h
vpxdec.SRCS += vpx_ports/vpx_timer.h
vpxdec.SRCS += vpx/vpx_integer.h
vpxdec.SRCS += args.c args.h
@@ -81,8 +81,6 @@ ifeq ($(CONFIG_LIBYUV),yes)
$(BUILD_PFX)third_party/libyuv/%.cc.o: CXXFLAGS += ${LIBYUV_CXXFLAGS}
endif
ifeq ($(CONFIG_WEBM_IO),yes)
vpxdec.SRCS += $(LIBWEBM_COMMON_SRCS)
vpxdec.SRCS += $(LIBWEBM_MUXER_SRCS)
vpxdec.SRCS += $(LIBWEBM_PARSER_SRCS)
vpxdec.SRCS += webmdec.cc webmdec.h
endif
@@ -97,7 +95,6 @@ vpxenc.SRCS += tools_common.c tools_common.h
vpxenc.SRCS += warnings.c warnings.h
vpxenc.SRCS += vpx_ports/mem_ops.h
vpxenc.SRCS += vpx_ports/mem_ops_aligned.h
vpxenc.SRCS += vpx_ports/msvc.h
vpxenc.SRCS += vpx_ports/vpx_timer.h
vpxenc.SRCS += vpxstats.c vpxstats.h
ifeq ($(CONFIG_LIBYUV),yes)
@@ -119,24 +116,18 @@ vp9_spatial_svc_encoder.SRCS += y4minput.c y4minput.h
vp9_spatial_svc_encoder.SRCS += tools_common.c tools_common.h
vp9_spatial_svc_encoder.SRCS += video_common.h
vp9_spatial_svc_encoder.SRCS += video_writer.h video_writer.c
vp9_spatial_svc_encoder.SRCS += vpx_ports/msvc.h
vp9_spatial_svc_encoder.SRCS += vpxstats.c vpxstats.h
vp9_spatial_svc_encoder.SRCS += examples/svc_encodeframe.c
vp9_spatial_svc_encoder.SRCS += examples/svc_context.h
vp9_spatial_svc_encoder.GUID = 4A38598D-627D-4505-9C7B-D4020C84100D
vp9_spatial_svc_encoder.DESCRIPTION = VP9 Spatial SVC Encoder
ifneq ($(CONFIG_SHARED),yes)
EXAMPLES-$(CONFIG_VP9_ENCODER) += resize_util.c
endif
EXAMPLES-$(CONFIG_ENCODERS) += vpx_temporal_svc_encoder.c
vpx_temporal_svc_encoder.SRCS += ivfenc.c ivfenc.h
vpx_temporal_svc_encoder.SRCS += y4minput.c y4minput.h
vpx_temporal_svc_encoder.SRCS += tools_common.c tools_common.h
vpx_temporal_svc_encoder.SRCS += video_common.h
vpx_temporal_svc_encoder.SRCS += video_writer.h video_writer.c
vpx_temporal_svc_encoder.SRCS += vpx_ports/msvc.h
vpx_temporal_svc_encoder.GUID = B18C08F2-A439-4502-A78E-849BE3D60947
vpx_temporal_svc_encoder.DESCRIPTION = Temporal SVC Encoder
EXAMPLES-$(CONFIG_DECODERS) += simple_decoder.c
@@ -148,7 +139,6 @@ simple_decoder.SRCS += video_common.h
simple_decoder.SRCS += video_reader.h video_reader.c
simple_decoder.SRCS += vpx_ports/mem_ops.h
simple_decoder.SRCS += vpx_ports/mem_ops_aligned.h
simple_decoder.SRCS += vpx_ports/msvc.h
simple_decoder.DESCRIPTION = Simplified decoder loop
EXAMPLES-$(CONFIG_DECODERS) += postproc.c
postproc.SRCS += ivfdec.h ivfdec.c
@@ -158,7 +148,6 @@ postproc.SRCS += video_common.h
postproc.SRCS += video_reader.h video_reader.c
postproc.SRCS += vpx_ports/mem_ops.h
postproc.SRCS += vpx_ports/mem_ops_aligned.h
postproc.SRCS += vpx_ports/msvc.h
postproc.GUID = 65E33355-F35E-4088-884D-3FD4905881D7
postproc.DESCRIPTION = Decoder postprocessor control
EXAMPLES-$(CONFIG_DECODERS) += decode_to_md5.c
@@ -171,7 +160,6 @@ decode_to_md5.SRCS += video_reader.h video_reader.c
decode_to_md5.SRCS += vpx_ports/compiler_attributes.h
decode_to_md5.SRCS += vpx_ports/mem_ops.h
decode_to_md5.SRCS += vpx_ports/mem_ops_aligned.h
decode_to_md5.SRCS += vpx_ports/msvc.h
decode_to_md5.GUID = 59120B9B-2735-4BFE-B022-146CA340FE42
decode_to_md5.DESCRIPTION = Frame by frame MD5 checksum
EXAMPLES-$(CONFIG_ENCODERS) += simple_encoder.c
@@ -180,7 +168,6 @@ simple_encoder.SRCS += y4minput.c y4minput.h
simple_encoder.SRCS += tools_common.h tools_common.c
simple_encoder.SRCS += video_common.h
simple_encoder.SRCS += video_writer.h video_writer.c
simple_encoder.SRCS += vpx_ports/msvc.h
simple_encoder.GUID = 4607D299-8A71-4D2C-9B1D-071899B6FBFD
simple_encoder.DESCRIPTION = Simplified encoder loop
EXAMPLES-$(CONFIG_VP9_ENCODER) += vp9_lossless_encoder.c
@@ -189,7 +176,6 @@ vp9_lossless_encoder.SRCS += y4minput.c y4minput.h
vp9_lossless_encoder.SRCS += tools_common.h tools_common.c
vp9_lossless_encoder.SRCS += video_common.h
vp9_lossless_encoder.SRCS += video_writer.h video_writer.c
vp9_lossless_encoder.SRCS += vpx_ports/msvc.h
vp9_lossless_encoder.GUID = B63C7C88-5348-46DC-A5A6-CC151EF93366
vp9_lossless_encoder.DESCRIPTION = Simplified lossless VP9 encoder
EXAMPLES-$(CONFIG_ENCODERS) += twopass_encoder.c
@@ -198,7 +184,6 @@ twopass_encoder.SRCS += y4minput.c y4minput.h
twopass_encoder.SRCS += tools_common.h tools_common.c
twopass_encoder.SRCS += video_common.h
twopass_encoder.SRCS += video_writer.h video_writer.c
twopass_encoder.SRCS += vpx_ports/msvc.h
twopass_encoder.GUID = 73494FA6-4AF9-4763-8FBB-265C92402FD8
twopass_encoder.DESCRIPTION = Two-pass encoder loop
EXAMPLES-$(CONFIG_DECODERS) += decode_with_drops.c
@@ -209,7 +194,6 @@ decode_with_drops.SRCS += video_common.h
decode_with_drops.SRCS += video_reader.h video_reader.c
decode_with_drops.SRCS += vpx_ports/mem_ops.h
decode_with_drops.SRCS += vpx_ports/mem_ops_aligned.h
decode_with_drops.SRCS += vpx_ports/msvc.h
decode_with_drops.GUID = CE5C53C4-8DDA-438A-86ED-0DDD3CDB8D26
decode_with_drops.DESCRIPTION = Drops frames while decoding
EXAMPLES-$(CONFIG_ENCODERS) += set_maps.c
@@ -218,7 +202,6 @@ set_maps.SRCS += y4minput.c y4minput.h
set_maps.SRCS += tools_common.h tools_common.c
set_maps.SRCS += video_common.h
set_maps.SRCS += video_writer.h video_writer.c
set_maps.SRCS += vpx_ports/msvc.h
set_maps.GUID = ECB2D24D-98B8-4015-A465-A4AF3DCC145F
set_maps.DESCRIPTION = Set active and ROI maps
EXAMPLES-$(CONFIG_VP8_ENCODER) += vp8cx_set_ref.c
@@ -227,7 +210,6 @@ vp8cx_set_ref.SRCS += y4minput.c y4minput.h
vp8cx_set_ref.SRCS += tools_common.h tools_common.c
vp8cx_set_ref.SRCS += video_common.h
vp8cx_set_ref.SRCS += video_writer.h video_writer.c
vp8cx_set_ref.SRCS += vpx_ports/msvc.h
vp8cx_set_ref.GUID = C5E31F7F-96F6-48BD-BD3E-10EBF6E8057A
vp8cx_set_ref.DESCRIPTION = VP8 set encoder reference frame
@@ -251,7 +233,6 @@ vp8_multi_resolution_encoder.SRCS += ivfenc.h ivfenc.c
vp8_multi_resolution_encoder.SRCS += y4minput.c y4minput.h
vp8_multi_resolution_encoder.SRCS += tools_common.h tools_common.c
vp8_multi_resolution_encoder.SRCS += video_writer.h video_writer.c
vp8_multi_resolution_encoder.SRCS += vpx_ports/msvc.h
vp8_multi_resolution_encoder.SRCS += $(LIBYUV_SRCS)
vp8_multi_resolution_encoder.GUID = 04f8738e-63c8-423b-90fa-7c2703a374de
vp8_multi_resolution_encoder.DESCRIPTION = VP8 Multiple-resolution Encoding
-123
@@ -1,123 +0,0 @@
/*
* Copyright (c) 2014 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "../tools_common.h"
#include "../vp9/encoder/vp9_resize.h"
static const char *exec_name = NULL;
static void usage() {
printf("Usage:\n");
printf("%s <input_yuv> <width>x<height> <target_width>x<target_height> ",
exec_name);
printf("<output_yuv> [<frames>]\n");
}
void usage_exit(void) {
usage();
exit(EXIT_FAILURE);
}
static int parse_dim(char *v, int *width, int *height) {
char *x = strchr(v, 'x');
if (x == NULL) x = strchr(v, 'X');
if (x == NULL) return 0;
*width = atoi(v);
*height = atoi(&x[1]);
if (*width <= 0 || *height <= 0)
return 0;
else
return 1;
}
int main(int argc, char *argv[]) {
char *fin, *fout;
FILE *fpin, *fpout;
uint8_t *inbuf, *outbuf;
uint8_t *inbuf_u, *outbuf_u;
uint8_t *inbuf_v, *outbuf_v;
int f, frames;
int width, height, target_width, target_height;
exec_name = argv[0];
if (argc < 5) {
printf("Incorrect parameters:\n");
usage();
return 1;
}
fin = argv[1];
fout = argv[4];
if (!parse_dim(argv[2], &width, &height)) {
printf("Incorrect parameters: %s\n", argv[2]);
usage();
return 1;
}
if (!parse_dim(argv[3], &target_width, &target_height)) {
printf("Incorrect parameters: %s\n", argv[3]);
usage();
return 1;
}
fpin = fopen(fin, "rb");
if (fpin == NULL) {
printf("Can't open file %s to read\n", fin);
usage();
return 1;
}
fpout = fopen(fout, "wb");
if (fpout == NULL) {
printf("Can't open file %s to write\n", fout);
usage();
return 1;
}
if (argc >= 6)
frames = atoi(argv[5]);
else
frames = INT_MAX;
printf("Input size: %dx%d\n", width, height);
printf("Target size: %dx%d, Frames: ", target_width, target_height);
if (frames == INT_MAX)
printf("All\n");
else
printf("%d\n", frames);
inbuf = (uint8_t *)malloc(width * height * 3 / 2);
outbuf = (uint8_t *)malloc(target_width * target_height * 3 / 2);
inbuf_u = inbuf + width * height;
inbuf_v = inbuf_u + width * height / 4;
outbuf_u = outbuf + target_width * target_height;
outbuf_v = outbuf_u + target_width * target_height / 4;
f = 0;
while (f < frames) {
if (fread(inbuf, width * height * 3 / 2, 1, fpin) != 1) break;
vp9_resize_frame420(inbuf, width, inbuf_u, inbuf_v, width / 2, height,
width, outbuf, target_width, outbuf_u, outbuf_v,
target_width / 2, target_height, target_width);
fwrite(outbuf, target_width * target_height * 3 / 2, 1, fpout);
f++;
}
printf("%d frames processed\n", f);
fclose(fpin);
fclose(fpout);
free(inbuf);
free(outbuf);
return 0;
}
+4 -4
@@ -279,7 +279,7 @@ vpx_codec_err_t vpx_svc_set_options(SvcContext *svc_ctx, const char *options) {
if (svc_ctx == NULL || options == NULL || si == NULL) {
return VPX_CODEC_INVALID_PARAM;
}
strncpy(si->options, options, sizeof(si->options));
strncpy(si->options, options, sizeof(si->options) - 1);
si->options[sizeof(si->options) - 1] = '\0';
return VPX_CODEC_OK;
}
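Note on the strncpy change in the hunk above: strncpy does not write a terminating NUL when the source is at least as long as the count, so the copy is limited to size - 1 bytes and the last byte is set explicitly. A minimal sketch of the same pattern, assuming a hypothetical helper name that does not exist in libvpx:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a libvpx function): copy src into a fixed-size
 * buffer, truncating if necessary, and always leave dst NUL-terminated. */
static void copy_truncated(char *dst, size_t dst_size, const char *src) {
  assert(dst != NULL && src != NULL);
  if (dst_size == 0) return;       /* no room even for the terminator */
  strncpy(dst, src, dst_size - 1); /* copy at most dst_size - 1 bytes */
  dst[dst_size - 1] = '\0';        /* terminate even when truncated */
}
```

Copying "abcdef" into a 4-byte buffer leaves "abc", and shorter inputs are copied unchanged.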
@@ -381,7 +381,7 @@ vpx_codec_err_t vpx_svc_init(SvcContext *svc_ctx, vpx_codec_ctx_t *codec_ctx,
vpx_codec_iface_t *iface,
vpx_codec_enc_cfg_t *enc_cfg) {
vpx_codec_err_t res;
int i, sl, tl;
int sl, tl;
SvcInternal_t *const si = get_svc_internal(svc_ctx);
if (svc_ctx == NULL || codec_ctx == NULL || iface == NULL ||
enc_cfg == NULL) {
@@ -433,7 +433,7 @@ vpx_codec_err_t vpx_svc_init(SvcContext *svc_ctx, vpx_codec_ctx_t *codec_ctx,
}
for (tl = 0; tl < svc_ctx->temporal_layers; ++tl) {
for (sl = 0; sl < svc_ctx->spatial_layers; ++sl) {
i = sl * svc_ctx->temporal_layers + tl;
const int i = sl * svc_ctx->temporal_layers + tl;
si->svc_params.max_quantizers[i] = MAX_QUANTIZER;
si->svc_params.min_quantizers[i] = 0;
if (enc_cfg->rc_end_usage == VPX_CBR &&
@@ -503,7 +503,7 @@ vpx_codec_err_t vpx_svc_init(SvcContext *svc_ctx, vpx_codec_ctx_t *codec_ctx,
for (tl = 0; tl < svc_ctx->temporal_layers; ++tl) {
for (sl = 0; sl < svc_ctx->spatial_layers; ++sl) {
i = sl * svc_ctx->temporal_layers + tl;
const int i = sl * svc_ctx->temporal_layers + tl;
if (enc_cfg->rc_end_usage == VPX_CBR &&
enc_cfg->g_pass == VPX_RC_ONE_PASS) {
si->svc_params.max_quantizers[i] = enc_cfg->rc_max_quantizer;
+12 -8
@@ -16,6 +16,7 @@
#include <math.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
@@ -32,6 +33,7 @@
#include "vp9/encoder/vp9_encoder.h"
#include "./y4minput.h"
#define OUTPUT_FRAME_STATS 0
#define OUTPUT_RC_STATS 1
#define SIMULCAST_MODE 0
@@ -315,7 +317,6 @@ static void parse_command_line(int argc, const char **argv_,
break;
default:
die("Error: Invalid bit depth selected (%d)\n", enc_cfg->g_bit_depth);
break;
}
#endif // CONFIG_VP9_HIGHBITDEPTH
} else if (arg_match(&arg, &dropframe_thresh_arg, argi)) {
@@ -880,7 +881,9 @@ int main(int argc, const char **argv) {
int pts = 0; /* PTS starts at 0 */
int frame_duration = 1; /* 1 timebase tick per frame */
int end_of_stream = 0;
#if OUTPUT_FRAME_STATS
int frames_received = 0;
#endif
#if OUTPUT_RC_STATS
VpxVideoWriter *outfile[VPX_SS_MAX_LAYERS] = { NULL };
struct RateControlStats rc;
@@ -1126,14 +1129,14 @@ int main(int argc, const char **argv) {
}
#endif
}
/*
#if OUTPUT_FRAME_STATS
printf("SVC frame: %d, kf: %d, size: %d, pts: %d\n", frames_received,
!!(cx_pkt->data.frame.flags & VPX_FRAME_IS_KEY),
(int)cx_pkt->data.frame.sz, (int)cx_pkt->data.frame.pts);
*/
++frames_received;
#endif
if (enc_cfg.ss_number_layers == 1 && enc_cfg.ts_number_layers == 1)
si->bytes_sum[0] += (int)cx_pkt->data.frame.sz;
++frames_received;
#if CONFIG_VP9_DECODER && !SIMULCAST_MODE
if (vpx_codec_decode(&decoder, cx_pkt->data.frame.buf,
(unsigned int)cx_pkt->data.frame.sz, NULL, 0))
@@ -1154,12 +1157,13 @@ int main(int argc, const char **argv) {
#if CONFIG_VP9_DECODER && !SIMULCAST_MODE
vpx_codec_control(&encoder, VP9E_GET_SVC_LAYER_ID, &layer_id);
// Don't look for mismatch on top spatial and top temporal layers as they
// are non reference frames.
// are non reference frames. Don't look at frames whose top spatial layer
// is dropped.
if ((enc_cfg.ss_number_layers > 1 || enc_cfg.ts_number_layers > 1) &&
cx_pkt->data.frame
.spatial_layer_encoded[enc_cfg.ss_number_layers - 1] &&
!(layer_id.temporal_layer_id > 0 &&
layer_id.temporal_layer_id == (int)enc_cfg.ts_number_layers - 1 &&
cx_pkt->data.frame
.spatial_layer_encoded[enc_cfg.ss_number_layers - 1])) {
layer_id.temporal_layer_id == (int)enc_cfg.ts_number_layers - 1)) {
test_decode(&encoder, &decoder, frame_cnt, &mismatch_seen);
}
#endif
+1 -1
@@ -60,7 +60,7 @@
static const char *exec_name;
void usage_exit() {
void usage_exit(void) {
fprintf(stderr,
"Usage: %s <width> <height> <infile> <outfile> "
"<frame> <limit(optional)>\n",
+7 -2
@@ -110,8 +110,13 @@ extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
data += IVF_FRAME_HDR_SZ;
frame_size = std::min(size, frame_size);
const vpx_codec_err_t err =
vpx_codec_decode(&codec, data, frame_size, nullptr, 0);
vpx_codec_stream_info_t stream_info;
stream_info.sz = sizeof(stream_info);
vpx_codec_err_t err = vpx_codec_peek_stream_info(VPXD_INTERFACE(DECODER),
data, size, &stream_info);
static_cast<void>(err);
err = vpx_codec_decode(&codec, data, frame_size, nullptr, 0);
static_cast<void>(err);
vpx_codec_iter_t iter = nullptr;
vpx_image_t *img = nullptr;
-8
@@ -1223,14 +1223,6 @@ DOT_GRAPH_MAX_NODES = 50
MAX_DOT_GRAPH_DEPTH = 0
# Set the DOT_TRANSPARENT tag to YES to generate images with a transparent
# background. This is disabled by default, which results in a white background.
# Warning: Depending on the platform used, enabling this option may lead to
# badly anti-aliased labels on the edges of a graph (i.e. they become hard to
# read).
DOT_TRANSPARENT = YES
# Set the DOT_MULTI_TARGETS tag to YES allow dot to generate multiple output
# files in one run (i.e. multiple -o and -T options on the command line). This
# makes dot run faster, but since only newer versions of dot (>1.8.10)
+10 -9
@@ -178,6 +178,7 @@ INSTALL-LIBS-yes += include/vpx/vpx_image.h
INSTALL-LIBS-yes += include/vpx/vpx_integer.h
INSTALL-LIBS-$(CONFIG_DECODERS) += include/vpx/vpx_decoder.h
INSTALL-LIBS-$(CONFIG_ENCODERS) += include/vpx/vpx_encoder.h
INSTALL-LIBS-$(CONFIG_ENCODERS) += include/vpx/vpx_tpl.h
ifeq ($(CONFIG_EXTERNAL_BUILD),yes)
ifeq ($(CONFIG_MSVS),yes)
INSTALL-LIBS-yes += $(foreach p,$(VS_PLATFORMS),$(LIBSUBDIR)/$(p)/$(CODEC_LIB).lib)
@@ -312,7 +313,7 @@ $(BUILD_PFX)libvpx_g.a: $(LIBVPX_OBJS)
# To determine SO_VERSION_{MAJOR,MINOR,PATCH}, calculate c,a,r with current
# SO_VERSION_* then follow the rules in the link to determine the new version
# (c1, a1, r1) and set MAJOR to [c1-a1], MINOR to a1 and PATCH to r1
SO_VERSION_MAJOR := 8
SO_VERSION_MAJOR := 11
SO_VERSION_MINOR := 0
SO_VERSION_PATCH := 0
ifeq ($(filter darwin%,$(TGT_OS)),$(TGT_OS))
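Note on the SO_VERSION comment in the hunk above: it maps the libtool-style triple (c, a, r) to MAJOR = c - a, MINOR = a, PATCH = r. A small sketch of that arithmetic follows. The struct and function names are illustrative assumptions, not identifiers from the build system.

```c
#include <assert.h>

/* Hypothetical container for the three SO_VERSION_* values. */
typedef struct {
  int major;
  int minor;
  int patch;
} SoVersion;

/* Map libtool-style current/age/revision to the shared-object version:
 * MAJOR = c - a, MINOR = a, PATCH = r. */
static SoVersion so_version_from_car(int c, int a, int r) {
  assert(c >= a); /* age cannot exceed current */
  SoVersion v = { c - a, a, r };
  return v;
}
```

For c=11, a=0, r=0 this yields 11.0.0, matching the new SO_VERSION_MAJOR value in the hunk.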
@@ -545,7 +546,7 @@ testdata: $(LIBVPX_TEST_DATA)
echo "Checking test data:";\
for f in $(call enabled,LIBVPX_TEST_DATA); do\
grep $$f $(SRC_PATH_BARE)/test/test-data.sha1 |\
(cd $(LIBVPX_TEST_DATA_PATH); $${sha1sum} -c);\
(cd "$(LIBVPX_TEST_DATA_PATH)"; $${sha1sum} -c);\
done; \
else\
echo "Skipping test data integrity check, sha1sum not found.";\
@@ -631,8 +632,8 @@ test_rc_interface.$(VCPROJ_SFX): $(RC_INTERFACE_TEST_SRCS) vpx.$(VCPROJ_SFX) \
-I. -I"$(SRC_PATH_BARE)/third_party/googletest/src/include" \
-L. -l$(CODEC_LIB) -l$(RC_RTC_LIB) -l$(GTEST_LIB) $^
endif # RC_INTERFACE_TEST
endif # CONFIG_VP9_ENCODER
endif
endif # CONFIG_ENCODERS
endif # CONFIG_MSVS
else
include $(SRC_PATH_BARE)/third_party/googletest/gtest.mk
@@ -699,7 +700,7 @@ $(eval $(call linkerxx_template,$(SIMPLE_ENCODE_TEST_BIN), \
-L. -lsimple_encode -lvpx -lgtest $(extralibs) -lm))
endif # SIMPLE_ENCODE_TEST
endif # CONFIG_UNIT_TESTS
endif # CONFIG_EXTERNAL_BUILD
# Install test sources only if codec source is included
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(patsubst $(SRC_PATH_BARE)/%,%,\
@@ -724,7 +725,7 @@ NUM_SHARDS := 10
SHARDS := 0 1 2 3 4 5 6 7 8 9
$(foreach s,$(SHARDS),$(eval $(call test_shard_template,$(s),$(NUM_SHARDS))))
endif
endif # CONFIG_UNIT_TESTS
##
## documentation directives
@@ -764,10 +765,10 @@ TEST_BIN_PATH := $(addsuffix /$(TGT_OS:win64=x64)/Release, $(TEST_BIN_PATH))
endif
utiltest utiltest-no-data-check:
$(qexec)$(SRC_PATH_BARE)/test/vpxdec.sh \
--test-data-path $(LIBVPX_TEST_DATA_PATH) \
--test-data-path "$(LIBVPX_TEST_DATA_PATH)" \
--bin-path $(TEST_BIN_PATH)
$(qexec)$(SRC_PATH_BARE)/test/vpxenc.sh \
--test-data-path $(LIBVPX_TEST_DATA_PATH) \
--test-data-path "$(LIBVPX_TEST_DATA_PATH)" \
--bin-path $(TEST_BIN_PATH)
utiltest: testdata
else
@@ -791,7 +792,7 @@ EXAMPLES_BIN_PATH := $(TGT_OS:win64=x64)/Release
endif
exampletest exampletest-no-data-check: examples
$(qexec)$(SRC_PATH_BARE)/test/examples.sh \
--test-data-path $(LIBVPX_TEST_DATA_PATH) \
--test-data-path "$(LIBVPX_TEST_DATA_PATH)" \
--bin-path $(EXAMPLES_BIN_PATH)
exampletest: testdata
else
+5 -3
@@ -9,10 +9,11 @@
*/
#include <assert.h>
#include <stdlib.h>
#include <limits.h>
#include <stdio.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include "./rate_hist.h"
@@ -48,7 +49,8 @@ struct rate_hist *init_rate_histogram(const vpx_codec_enc_cfg_t *cfg,
// Determine the number of samples in the buffer. Use the file's framerate
// to determine the number of frames in rc_buf_sz milliseconds, with an
// adjustment (5/4) to account for alt-refs
hist->samples = cfg->rc_buf_sz * 5 / 4 * fps->num / fps->den / 1000;
hist->samples =
(int)((int64_t)cfg->rc_buf_sz * 5 / 4 * fps->num / fps->den / 1000);
// prevent division by zero
if (hist->samples == 0) hist->samples = 1;
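Note on the hist->samples change in the hunk above: the product of the buffer size, the 5/4 adjustment, and the frame-rate numerator can exceed the 32-bit range before the divisions bring it back down, so the first operand is widened to int64_t. A minimal sketch under the assumption of a free-standing helper (the function name and parameter names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the sample-count computation: number of frames
 * in buf_ms milliseconds at fps_num / fps_den frames per second, scaled by
 * 5/4. The cast to int64_t makes every intermediate product 64-bit. */
static int buffer_samples(unsigned int buf_ms, int fps_num, int fps_den) {
  assert(fps_den != 0);
  int samples = (int)((int64_t)buf_ms * 5 / 4 * fps_num / fps_den / 1000);
  return samples == 0 ? 1 : samples; /* avoid a later division by zero */
}
```

With buf_ms = 600000 and a frame rate of 30000/1001, the intermediate value reaches 22,500,000,000, which does not fit in 32 bits, yet the final result is a small frame count.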
+6 -11
@@ -15,7 +15,7 @@
#include <limits>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "vpx/vpx_integer.h"
@@ -45,16 +45,11 @@ class ACMRandom {
return static_cast<int16_t>(random_.Generate(65536));
}
int16_t Rand13Signed() {
// Use 13 bits: values between 4095 and -4096.
const uint32_t value = random_.Generate(8192);
return static_cast<int16_t>(value) - 4096;
}
int16_t Rand9Signed() {
// Use 9 bits: values between 255 (0x0FF) and -256 (0x100).
const uint32_t value = random_.Generate(512);
return static_cast<int16_t>(value) - 256;
uint16_t Rand12() {
const uint32_t value =
random_.Generate(testing::internal::Random::kMaxRange);
// There's a bit more entropy in the upper bits of this implementation.
return (value >> 19) & 0xfff;
}
uint8_t Rand8() {
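Note on Rand12 in the hunk above: it keeps the upper bits of the generator output rather than the lower ones. The sketch below shows the bit extraction in plain C, assuming the generator returns values in [0, 2^31), which is what kMaxRange is believed to be in googletest. The helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reduce a value drawn from [0, 2^31) to 12 bits by
 * keeping its top 12 bits (bits 19..30), as Rand12 does with its shift
 * and mask. */
static uint16_t top12_of_31(uint32_t value) {
  assert(value < (UINT32_C(1) << 31));
  return (uint16_t)((value >> 19) & 0xfff);
}
```

The largest input, 0x7fffffff, maps to 0xfff, and any input below 1 << 19 maps to 0.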
+5 -5
@@ -8,7 +8,7 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#include <algorithm>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/util.h"
@@ -62,16 +62,16 @@ class ActiveMapRefreshTest
public ::libvpx_test::CodecTestWith2Params<libvpx_test::TestMode, int> {
protected:
ActiveMapRefreshTest() : EncoderTest(GET_PARAM(0)) {}
virtual ~ActiveMapRefreshTest() {}
~ActiveMapRefreshTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(GET_PARAM(1));
cpu_used_ = GET_PARAM(2);
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
::libvpx_test::Y4mVideoSource *y4m_video =
static_cast<libvpx_test::Y4mVideoSource *>(video);
if (video->frame() == 0) {
+5 -5
@@ -9,7 +9,7 @@
*/
#include <climits>
#include <vector>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/i420_video_source.h"
@@ -26,16 +26,16 @@ class ActiveMapTest
static const int kHeight = 144;
ActiveMapTest() : EncoderTest(GET_PARAM(0)) {}
virtual ~ActiveMapTest() {}
~ActiveMapTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(GET_PARAM(1));
cpu_used_ = GET_PARAM(2);
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
encoder->Control(VP8E_SET_CPUUSED, cpu_used_);
encoder->Control(VP9E_SET_AQ_MODE, GET_PARAM(3));
+4 -3
@@ -10,12 +10,13 @@
#include <math.h>
#include <tuple>
#include "gtest/gtest.h"
#include "test/clear_system_state.h"
#include "test/register_state_check.h"
#include "test/util.h"
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vpx_dsp_rtcd.h"
#include "vpx/vpx_integer.h"
#include "vpx_config.h"
#include "vpx_dsp/postproc.h"
#include "vpx_mem/vpx_mem.h"
@@ -32,8 +33,8 @@ typedef std::tuple<double, AddNoiseFunc> AddNoiseTestFPParam;
class AddNoiseTest : public ::testing::Test,
public ::testing::WithParamInterface<AddNoiseTestFPParam> {
public:
virtual void TearDown() { libvpx_test::ClearSystemState(); }
virtual ~AddNoiseTest() {}
void TearDown() override { libvpx_test::ClearSystemState(); }
~AddNoiseTest() override = default;
};
double stddev6(char a, char b, char c, char d, char e, char f) {
+5 -5
@@ -7,7 +7,7 @@
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/i420_video_source.h"
@@ -20,9 +20,9 @@ class AltRefAqSegmentTest
public ::libvpx_test::CodecTestWith2Params<libvpx_test::TestMode, int> {
protected:
AltRefAqSegmentTest() : EncoderTest(GET_PARAM(0)) {}
virtual ~AltRefAqSegmentTest() {}
~AltRefAqSegmentTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(GET_PARAM(1));
set_cpu_used_ = GET_PARAM(2);
@@ -30,8 +30,8 @@ class AltRefAqSegmentTest
alt_ref_aq_mode_ = 0;
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
encoder->Control(VP8E_SET_CPUUSED, set_cpu_used_);
encoder->Control(VP9E_SET_ALT_REF_AQ, alt_ref_aq_mode_);
+13 -12
@@ -7,11 +7,12 @@
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/i420_video_source.h"
#include "test/util.h"
#include "vpx_config.h"
namespace {
#if CONFIG_VP8_ENCODER
@@ -24,24 +25,24 @@ class AltRefTest : public ::libvpx_test::EncoderTest,
public ::libvpx_test::CodecTestWithParam<int> {
protected:
AltRefTest() : EncoderTest(GET_PARAM(0)), altref_count_(0) {}
virtual ~AltRefTest() {}
~AltRefTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(libvpx_test::kTwoPassGood);
}
virtual void BeginPassHook(unsigned int /*pass*/) { altref_count_ = 0; }
void BeginPassHook(unsigned int /*pass*/) override { altref_count_ = 0; }
virtual void PreEncodeFrameHook(libvpx_test::VideoSource *video,
libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(libvpx_test::VideoSource *video,
libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
encoder->Control(VP8E_SET_ENABLEAUTOALTREF, 1);
encoder->Control(VP8E_SET_CPUUSED, 3);
}
}
virtual void FramePktHook(const vpx_codec_cx_pkt_t *pkt) {
void FramePktHook(const vpx_codec_cx_pkt_t *pkt) override {
if (pkt->data.frame.flags & VPX_FRAME_IS_INVISIBLE) ++altref_count_;
}
@@ -75,17 +76,17 @@ class AltRefForcedKeyTestLarge
AltRefForcedKeyTestLarge()
: EncoderTest(GET_PARAM(0)), encoding_mode_(GET_PARAM(1)),
cpu_used_(GET_PARAM(2)), forced_kf_frame_num_(1), frame_num_(0) {}
virtual ~AltRefForcedKeyTestLarge() {}
~AltRefForcedKeyTestLarge() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(encoding_mode_);
cfg_.rc_end_usage = VPX_VBR;
cfg_.g_threads = 0;
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
encoder->Control(VP8E_SET_CPUUSED, cpu_used_);
encoder->Control(VP8E_SET_ENABLEAUTOALTREF, 1);
@@ -100,7 +101,7 @@ class AltRefForcedKeyTestLarge
(video->frame() == forced_kf_frame_num_) ? VPX_EFLAG_FORCE_KF : 0;
}
virtual void FramePktHook(const vpx_codec_cx_pkt_t *pkt) {
void FramePktHook(const vpx_codec_cx_pkt_t *pkt) override {
if (frame_num_ == forced_kf_frame_num_) {
ASSERT_TRUE(!!(pkt->data.frame.flags & VPX_FRAME_IS_KEY))
<< "Frame #" << frame_num_ << " isn't a keyframe!";
+9 -8
@@ -38,7 +38,7 @@ def get_file_sha(filename):
buf = file.read(HASH_CHUNK)
return sha_hash.hexdigest()
except IOError:
print "Error reading " + filename
print("Error reading " + filename)
# Downloads a file from a url, and then checks the sha against the passed
# in sha
@@ -67,7 +67,7 @@ try:
getopt.getopt(sys.argv[1:], \
"u:i:o:", ["url=", "input_csv=", "output_dir="])
except:
print 'get_files.py -u <url> -i <input_csv> -o <output_dir>'
print('get_files.py -u <url> -i <input_csv> -o <output_dir>')
sys.exit(2)
for opt, arg in opts:
@@ -79,7 +79,7 @@ for opt, arg in opts:
local_resource_path = os.path.join(arg)
if len(sys.argv) != 7:
print "Expects two paths and a url!"
print("Expects two paths and a url!")
exit(1)
if not os.path.isdir(local_resource_path):
@@ -89,7 +89,7 @@ file_list_csv = open(file_list_path, "rb")
# Our 'csv' file uses multiple spaces as a delimiter, python's
# csv class only uses single character delimiters, so we convert them below
file_list_reader = csv.reader((re.sub(' +', ' ', line) \
file_list_reader = csv.reader((re.sub(' +', ' ', line.decode('utf-8')) \
for line in file_list_csv), delimiter = ' ')
file_shas = []
@@ -104,15 +104,16 @@ for row in file_list_reader:
file_list_csv.close()
# Download files, only if they don't already exist and have correct shas
for filename, sha in itertools.izip(file_names, file_shas):
for filename, sha in zip(file_names, file_shas):
filename = filename.lstrip('*')
path = os.path.join(local_resource_path, filename)
if os.path.isfile(path) \
and get_file_sha(path) == sha:
print path + ' exists, skipping'
print(path + ' exists, skipping')
continue
for retry in range(0, ftp_retries):
print "Downloading " + path
print("Downloading " + path)
if not download_and_check_sha(url, filename, sha):
print "Sha does not match, retrying..."
print("Sha does not match, retrying...")
else:
break
+5 -5
@@ -7,7 +7,7 @@
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/i420_video_source.h"
@@ -20,17 +20,17 @@ class AqSegmentTest
public ::libvpx_test::CodecTestWith2Params<libvpx_test::TestMode, int> {
protected:
AqSegmentTest() : EncoderTest(GET_PARAM(0)) {}
virtual ~AqSegmentTest() {}
~AqSegmentTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(GET_PARAM(1));
set_cpu_used_ = GET_PARAM(2);
aq_mode_ = 0;
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
encoder->Control(VP8E_SET_CPUUSED, set_cpu_used_);
encoder->Control(VP9E_SET_AQ_MODE, aq_mode_);
+39 -17
@@ -13,7 +13,7 @@
#include <string.h>
#include <tuple>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vp9_rtcd.h"
#include "./vpx_config.h"
@@ -38,7 +38,7 @@ class AverageTestBase : public ::testing::Test {
: width_(width), height_(height), source_data_(nullptr),
source_stride_(0), bit_depth_(8) {}
virtual void TearDown() {
void TearDown() override {
vpx_free(source_data_);
source_data_ = nullptr;
libvpx_test::ClearSystemState();
@@ -49,7 +49,7 @@ class AverageTestBase : public ::testing::Test {
static const int kDataAlignment = 16;
static const int kDataBlockSize = 64 * 128;
virtual void SetUp() {
void SetUp() override {
source_data_ = reinterpret_cast<Pixel *>(
vpx_memalign(kDataAlignment, kDataBlockSize * sizeof(source_data_[0])));
ASSERT_NE(source_data_, nullptr);
@@ -169,7 +169,7 @@ class IntProRowTest : public AverageTestBase<uint8_t>,
}
protected:
virtual void SetUp() {
void SetUp() override {
source_data_ = reinterpret_cast<uint8_t *>(
vpx_memalign(kDataAlignment, kDataBlockSize * sizeof(source_data_[0])));
ASSERT_NE(source_data_, nullptr);
@@ -180,7 +180,7 @@ class IntProRowTest : public AverageTestBase<uint8_t>,
vpx_memalign(kDataAlignment, sizeof(*hbuf_c_) * 16));
}
virtual void TearDown() {
void TearDown() override {
vpx_free(source_data_);
source_data_ = nullptr;
vpx_free(hbuf_c_);
@@ -190,8 +190,9 @@ class IntProRowTest : public AverageTestBase<uint8_t>,
}
void RunComparison() {
ASM_REGISTER_STATE_CHECK(c_func_(hbuf_c_, source_data_, 0, height_));
ASM_REGISTER_STATE_CHECK(asm_func_(hbuf_asm_, source_data_, 0, height_));
ASM_REGISTER_STATE_CHECK(c_func_(hbuf_c_, source_data_, width_, height_));
ASM_REGISTER_STATE_CHECK(
asm_func_(hbuf_asm_, source_data_, width_, height_));
EXPECT_EQ(0, memcmp(hbuf_c_, hbuf_asm_, sizeof(*hbuf_c_) * 16))
<< "Output mismatch";
}
@@ -238,7 +239,7 @@ typedef std::tuple<int, SatdFunc> SatdTestParam;
class SatdTest : public ::testing::Test,
public ::testing::WithParamInterface<SatdTestParam> {
protected:
virtual void SetUp() {
void SetUp() override {
satd_size_ = GET_PARAM(0);
satd_func_ = GET_PARAM(1);
rnd_.Reset(ACMRandom::DeterministicSeed());
@@ -247,7 +248,7 @@ class SatdTest : public ::testing::Test,
ASSERT_NE(src_, nullptr);
}
virtual void TearDown() {
void TearDown() override {
libvpx_test::ClearSystemState();
vpx_free(src_);
}
@@ -276,7 +277,7 @@ class SatdTest : public ::testing::Test,
class SatdLowbdTest : public SatdTest {
protected:
virtual void FillRandom() {
void FillRandom() override {
for (int i = 0; i < satd_size_; ++i) {
const int16_t tmp = rnd_.Rand16Signed();
src_[i] = (tran_low_t)tmp;
@@ -292,7 +293,7 @@ class BlockErrorTestFP
: public ::testing::Test,
public ::testing::WithParamInterface<BlockErrorTestFPParam> {
protected:
virtual void SetUp() {
void SetUp() override {
txfm_size_ = GET_PARAM(0);
block_error_func_ = GET_PARAM(1);
rnd_.Reset(ACMRandom::DeterministicSeed());
@@ -304,7 +305,7 @@ class BlockErrorTestFP
ASSERT_NE(dqcoeff_, nullptr);
}
virtual void TearDown() {
void TearDown() override {
libvpx_test::ClearSystemState();
vpx_free(coeff_);
vpx_free(dqcoeff_);
@@ -463,7 +464,7 @@ TEST_P(SatdLowbdTest, DISABLED_Speed) {
#if CONFIG_VP9_HIGHBITDEPTH
class SatdHighbdTest : public SatdTest {
protected:
virtual void FillRandom() {
void FillRandom() override {
for (int i = 0; i < satd_size_; ++i) {
src_[i] = rnd_.Rand20Signed();
}
@@ -582,6 +583,13 @@ INSTANTIATE_TEST_SUITE_P(
make_tuple(16, 16, 1, 4, &vpx_highbd_avg_4x4_sse2)));
#endif // HAVE_SSE2
#if HAVE_NEON
INSTANTIATE_TEST_SUITE_P(
NEON, AverageTestHBD,
::testing::Values(make_tuple(16, 16, 1, 8, &vpx_highbd_avg_8x8_neon),
make_tuple(16, 16, 1, 4, &vpx_highbd_avg_4x4_neon)));
#endif // HAVE_NEON
INSTANTIATE_TEST_SUITE_P(C, SatdHighbdTest,
::testing::Values(make_tuple(16, &vpx_satd_c),
make_tuple(64, &vpx_satd_c),
@@ -694,18 +702,32 @@ INSTANTIATE_TEST_SUITE_P(NEON, SatdLowbdTest,
make_tuple(256, &vpx_satd_neon),
make_tuple(1024, &vpx_satd_neon)));
// TODO(jianj): Remove the highbitdepth flag once the SIMD functions are
// in place.
#if !CONFIG_VP9_HIGHBITDEPTH
#if CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_SUITE_P(
NEON, SatdHighbdTest,
::testing::Values(make_tuple(16, &vpx_highbd_satd_neon),
make_tuple(64, &vpx_highbd_satd_neon),
make_tuple(256, &vpx_highbd_satd_neon),
make_tuple(1024, &vpx_highbd_satd_neon)));
#endif // CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_SUITE_P(
NEON, BlockErrorTestFP,
::testing::Values(make_tuple(16, &vp9_block_error_fp_neon),
make_tuple(64, &vp9_block_error_fp_neon),
make_tuple(256, &vp9_block_error_fp_neon),
make_tuple(1024, &vp9_block_error_fp_neon)));
#endif // !CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_NEON
#if HAVE_SVE
INSTANTIATE_TEST_SUITE_P(
SVE, BlockErrorTestFP,
::testing::Values(make_tuple(16, &vp9_block_error_fp_sve),
make_tuple(64, &vp9_block_error_fp_sve),
make_tuple(256, &vp9_block_error_fp_sve),
make_tuple(1024, &vp9_block_error_fp_sve)));
#endif // HAVE_SVE
#if HAVE_MSA
INSTANTIATE_TEST_SUITE_P(
MSA, AverageTest,
+1
@@ -10,6 +10,7 @@
#include <stdio.h>
#include <algorithm>
#include <cstdlib>
#include "test/bench.h"
#include "vpx_ports/vpx_timer.h"
+2
@@ -16,6 +16,8 @@
class AbstractBench {
public:
virtual ~AbstractBench() = default;
void RunNTimes(int n);
void PrintMedian(const char *title);
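A minimal sketch of what a virtual destructor on a polymorphic base class provides. The class and function names below are hypothetical stand-ins, not the real `AbstractBench`, and the commit's motivation is assumed rather than stated in the diff: a derived class can only declare `~Derived() override` when the base destructor is virtual, and deleting a derived object through a base pointer is only well defined in that case.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins; not the real AbstractBench from test/bench.h.
static int g_derived_destroyed = 0;

class BenchBase {
 public:
  // Virtual destructor: deleting through a BenchBase* is well defined.
  virtual ~BenchBase() = default;
  virtual void Run() = 0;
};

class CountingBench : public BenchBase {
 public:
  // `override` compiles only because the base destructor is virtual.
  ~CountingBench() override { ++g_derived_destroyed; }
  void Run() override {}
};

// Destroys a CountingBench through a base-class pointer and reports how
// many derived destructors ran.
int DestroyThroughBase() {
  g_derived_destroyed = 0;
  std::unique_ptr<BenchBase> bench(new CountingBench());
  assert(bench != nullptr);
  bench->Run();
  bench.reset();  // Dispatches to ~CountingBench() via the virtual destructor.
  return g_derived_destroyed;
}
```

Calling `DestroyThroughBase()` from a test or from `main` returns the number of derived destructors that ran while the object was destroyed through the base pointer.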
+3 -3
@@ -13,7 +13,7 @@
#include <string.h>
#include <tuple>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vpx_config.h"
#if CONFIG_VP9_ENCODER
@@ -49,14 +49,14 @@ class BlockinessTestBase : public ::testing::Test {
reference_data_ = nullptr;
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
// Handle frames up to 640x480
static const int kDataAlignment = 16;
static const int kDataBufferSize = 640 * 480;
virtual void SetUp() {
void SetUp() override {
source_stride_ = (width_ + 31) & ~31;
reference_stride_ = width_ * 2;
rnd_.Reset(ACMRandom::DeterministicSeed());
+12 -6
@@ -9,11 +9,12 @@
*/
#include <climits>
#include <vector>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/i420_video_source.h"
#include "test/util.h"
#include "vpx_config.h"
namespace {
@@ -22,15 +23,15 @@ class BordersTest
public ::libvpx_test::CodecTestWithParam<libvpx_test::TestMode> {
protected:
BordersTest() : EncoderTest(GET_PARAM(0)) {}
virtual ~BordersTest() {}
~BordersTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(GET_PARAM(1));
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
encoder->Control(VP8E_SET_CPUUSED, 1);
encoder->Control(VP8E_SET_ENABLEAUTOALTREF, 1);
@@ -40,7 +41,7 @@ class BordersTest
}
}
virtual void FramePktHook(const vpx_codec_cx_pkt_t *pkt) {
void FramePktHook(const vpx_codec_cx_pkt_t *pkt) override {
if (pkt->data.frame.flags & VPX_FRAME_IS_KEY) {
}
}
@@ -79,6 +80,11 @@ TEST_P(BordersTest, TestLowBitrate) {
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
}
#if CONFIG_REALTIME_ONLY
VP9_INSTANTIATE_TEST_SUITE(BordersTest,
::testing::Values(::libvpx_test::kRealTime));
#else
VP9_INSTANTIATE_TEST_SUITE(BordersTest,
::testing::Values(::libvpx_test::kTwoPassGood));
#endif
} // namespace
+1 -1
@@ -15,7 +15,7 @@
#include <limits>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/acm_random.h"
#include "vpx/vpx_integer.h"
+2 -2
@@ -58,7 +58,7 @@ class ByteAlignmentTest
ByteAlignmentTest()
: video_(nullptr), decoder_(nullptr), md5_file_(nullptr) {}
virtual void SetUp() {
void SetUp() override {
video_ = new libvpx_test::WebMVideoSource(kVP9TestFile);
ASSERT_NE(video_, nullptr);
video_->Init();
@@ -71,7 +71,7 @@ class ByteAlignmentTest
OpenMd5File(kVP9Md5File);
}
virtual void TearDown() {
void TearDown() override {
if (md5_file_ != nullptr) fclose(md5_file_);
delete decoder_;
+29 -27
@@ -40,7 +40,7 @@ class CodecFactory {
const vpx_codec_flags_t flags) const = 0;
virtual Encoder *CreateEncoder(vpx_codec_enc_cfg_t cfg,
unsigned long deadline,
vpx_enc_deadline_t deadline,
const unsigned long init_flags,
TwopassStatsStore *stats) const = 0;
@@ -84,7 +84,7 @@ class VP8Decoder : public Decoder {
: Decoder(cfg, flag) {}
protected:
virtual vpx_codec_iface_t *CodecInterface() const {
vpx_codec_iface_t *CodecInterface() const override {
#if CONFIG_VP8_DECODER
return &vpx_codec_vp8_dx_algo;
#else
@@ -95,12 +95,12 @@ class VP8Decoder : public Decoder {
class VP8Encoder : public Encoder {
public:
VP8Encoder(vpx_codec_enc_cfg_t cfg, unsigned long deadline,
VP8Encoder(vpx_codec_enc_cfg_t cfg, vpx_enc_deadline_t deadline,
const unsigned long init_flags, TwopassStatsStore *stats)
: Encoder(cfg, deadline, init_flags, stats) {}
protected:
virtual vpx_codec_iface_t *CodecInterface() const {
vpx_codec_iface_t *CodecInterface() const override {
#if CONFIG_VP8_ENCODER
return &vpx_codec_vp8_cx_algo;
#else
@@ -113,12 +113,12 @@ class VP8CodecFactory : public CodecFactory {
public:
VP8CodecFactory() : CodecFactory() {}
virtual Decoder *CreateDecoder(vpx_codec_dec_cfg_t cfg) const {
Decoder *CreateDecoder(vpx_codec_dec_cfg_t cfg) const override {
return CreateDecoder(cfg, 0);
}
virtual Decoder *CreateDecoder(vpx_codec_dec_cfg_t cfg,
const vpx_codec_flags_t flags) const {
Decoder *CreateDecoder(vpx_codec_dec_cfg_t cfg,
const vpx_codec_flags_t flags) const override {
#if CONFIG_VP8_DECODER
return new VP8Decoder(cfg, flags);
#else
@@ -128,10 +128,9 @@ class VP8CodecFactory : public CodecFactory {
#endif
}
virtual Encoder *CreateEncoder(vpx_codec_enc_cfg_t cfg,
unsigned long deadline,
const unsigned long init_flags,
TwopassStatsStore *stats) const {
Encoder *CreateEncoder(vpx_codec_enc_cfg_t cfg, vpx_enc_deadline_t deadline,
const unsigned long init_flags,
TwopassStatsStore *stats) const override {
#if CONFIG_VP8_ENCODER
return new VP8Encoder(cfg, deadline, init_flags, stats);
#else
@@ -143,8 +142,8 @@ class VP8CodecFactory : public CodecFactory {
#endif
}
virtual vpx_codec_err_t DefaultEncoderConfig(vpx_codec_enc_cfg_t *cfg,
int usage) const {
vpx_codec_err_t DefaultEncoderConfig(vpx_codec_enc_cfg_t *cfg,
int usage) const override {
#if CONFIG_VP8_ENCODER
return vpx_codec_enc_config_default(&vpx_codec_vp8_cx_algo, cfg, usage);
#else
@@ -165,7 +164,9 @@ const libvpx_test::VP8CodecFactory kVP8;
&libvpx_test::kVP8)), \
__VA_ARGS__))
#else
#define VP8_INSTANTIATE_TEST_SUITE(test, ...)
// static_assert() is used to avoid warnings about an extra ';' outside of a
// function.
#define VP8_INSTANTIATE_TEST_SUITE(test, ...) static_assert(CONFIG_VP8 == 0, "")
#endif // CONFIG_VP8
/*
@@ -180,7 +181,7 @@ class VP9Decoder : public Decoder {
: Decoder(cfg, flag) {}
protected:
virtual vpx_codec_iface_t *CodecInterface() const {
vpx_codec_iface_t *CodecInterface() const override {
#if CONFIG_VP9_DECODER
return &vpx_codec_vp9_dx_algo;
#else
@@ -191,12 +192,12 @@ class VP9Decoder : public Decoder {
class VP9Encoder : public Encoder {
public:
VP9Encoder(vpx_codec_enc_cfg_t cfg, unsigned long deadline,
VP9Encoder(vpx_codec_enc_cfg_t cfg, vpx_enc_deadline_t deadline,
const unsigned long init_flags, TwopassStatsStore *stats)
: Encoder(cfg, deadline, init_flags, stats) {}
protected:
virtual vpx_codec_iface_t *CodecInterface() const {
vpx_codec_iface_t *CodecInterface() const override {
#if CONFIG_VP9_ENCODER
return &vpx_codec_vp9_cx_algo;
#else
@@ -209,12 +210,12 @@ class VP9CodecFactory : public CodecFactory {
public:
VP9CodecFactory() : CodecFactory() {}
virtual Decoder *CreateDecoder(vpx_codec_dec_cfg_t cfg) const {
Decoder *CreateDecoder(vpx_codec_dec_cfg_t cfg) const override {
return CreateDecoder(cfg, 0);
}
virtual Decoder *CreateDecoder(vpx_codec_dec_cfg_t cfg,
const vpx_codec_flags_t flags) const {
Decoder *CreateDecoder(vpx_codec_dec_cfg_t cfg,
const vpx_codec_flags_t flags) const override {
#if CONFIG_VP9_DECODER
return new VP9Decoder(cfg, flags);
#else
@@ -224,10 +225,9 @@ class VP9CodecFactory : public CodecFactory {
#endif
}
virtual Encoder *CreateEncoder(vpx_codec_enc_cfg_t cfg,
unsigned long deadline,
const unsigned long init_flags,
TwopassStatsStore *stats) const {
Encoder *CreateEncoder(vpx_codec_enc_cfg_t cfg, vpx_enc_deadline_t deadline,
const unsigned long init_flags,
TwopassStatsStore *stats) const override {
#if CONFIG_VP9_ENCODER
return new VP9Encoder(cfg, deadline, init_flags, stats);
#else
@@ -239,8 +239,8 @@ class VP9CodecFactory : public CodecFactory {
#endif
}
virtual vpx_codec_err_t DefaultEncoderConfig(vpx_codec_enc_cfg_t *cfg,
int usage) const {
vpx_codec_err_t DefaultEncoderConfig(vpx_codec_enc_cfg_t *cfg,
int usage) const override {
#if CONFIG_VP9_ENCODER
return vpx_codec_enc_config_default(&vpx_codec_vp9_cx_algo, cfg, usage);
#else
@@ -261,7 +261,9 @@ const libvpx_test::VP9CodecFactory kVP9;
&libvpx_test::kVP9)), \
__VA_ARGS__))
#else
#define VP9_INSTANTIATE_TEST_SUITE(test, ...)
// static_assert() is used to avoid warnings about an extra ';' outside of a
// function.
#define VP9_INSTANTIATE_TEST_SUITE(test, ...) static_assert(CONFIG_VP9 == 0, "")
#endif // CONFIG_VP9
} // namespace libvpx_test
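The `static_assert` fallback in the macros above can be sketched in isolation. The flag and macro names here are hypothetical; they only mirror the shape of the `VP8_INSTANTIATE_TEST_SUITE` / `VP9_INSTANTIATE_TEST_SUITE` fallback. The idea is that a macro expanding to nothing leaves a bare `;` at namespace scope after each use, which some compilers flag as an extra semicolon, whereas a `static_assert` declaration consumes the `;` and generates no code.

```cpp
// Hypothetical feature flag standing in for CONFIG_VP8 / CONFIG_VP9.
#define DEMO_CODEC_ENABLED 0

#if DEMO_CODEC_ENABLED
// Enabled case: expand to some real declaration (placeholder here).
#define DEMO_INSTANTIATE(test, ...) int test##_registered = 1
#else
// Disabled case: a static_assert declaration that is always true in this
// branch. It swallows the trailing ';' written at the call site.
#define DEMO_INSTANTIATE(test, ...) \
  static_assert(DEMO_CODEC_ENABLED == 0, "")
#endif

namespace {
// Each use ends in ';' exactly as the real test files write it.
DEMO_INSTANTIATE(BordersTest, 1, 2);
DEMO_INSTANTIATE(ConfigTest, 3);
}  // namespace

// Exposes the flag so a caller can confirm which branch was compiled.
int EnabledFlag() { return DEMO_CODEC_ENABLED; }
```

Repeating the same `static_assert` declaration is allowed, so the macro can be used any number of times in one translation unit.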
+22 -10
@@ -8,13 +8,14 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vpx_dsp_rtcd.h"
#include "test/acm_random.h"
#include "test/buffer.h"
#include "test/register_state_check.h"
#include "vpx_config.h"
#include "vpx_ports/vpx_timer.h"
namespace {
@@ -49,7 +50,7 @@ using AvgPredFunc = void (*)(uint8_t *a, const uint8_t *b, int w, int h,
template <int bitdepth, typename Pixel>
class AvgPredTest : public ::testing::TestWithParam<AvgPredFunc> {
public:
virtual void SetUp() {
void SetUp() override {
avg_pred_func_ = GetParam();
rnd_.Reset(ACMRandom::DeterministicSeed());
}
@@ -81,11 +82,11 @@ void AvgPredTest<bitdepth, Pixel>::TestSizeCombinations() {
// Only the reference buffer may have a stride not equal to width.
Buffer<Pixel> ref = Buffer<Pixel>(width, height, ref_padding ? 8 : 0);
ASSERT_TRUE(ref.Init());
Buffer<Pixel> pred = Buffer<Pixel>(width, height, 0, 16);
Buffer<Pixel> pred = Buffer<Pixel>(width, height, 0, 32);
ASSERT_TRUE(pred.Init());
Buffer<Pixel> avg_ref = Buffer<Pixel>(width, height, 0, 16);
Buffer<Pixel> avg_ref = Buffer<Pixel>(width, height, 0, 32);
ASSERT_TRUE(avg_ref.Init());
Buffer<Pixel> avg_chk = Buffer<Pixel>(width, height, 0, 16);
Buffer<Pixel> avg_chk = Buffer<Pixel>(width, height, 0, 32);
ASSERT_TRUE(avg_chk.Init());
const int bitdepth_mask = (1 << bitdepth) - 1;
for (int h = 0; h < height; ++h) {
@@ -121,11 +122,11 @@ void AvgPredTest<bitdepth, Pixel>::TestCompareReferenceRandom() {
const int height = 32;
Buffer<Pixel> ref = Buffer<Pixel>(width, height, 8);
ASSERT_TRUE(ref.Init());
Buffer<Pixel> pred = Buffer<Pixel>(width, height, 0, 16);
Buffer<Pixel> pred = Buffer<Pixel>(width, height, 0, 32);
ASSERT_TRUE(pred.Init());
Buffer<Pixel> avg_ref = Buffer<Pixel>(width, height, 0, 16);
Buffer<Pixel> avg_ref = Buffer<Pixel>(width, height, 0, 32);
ASSERT_TRUE(avg_ref.Init());
Buffer<Pixel> avg_chk = Buffer<Pixel>(width, height, 0, 16);
Buffer<Pixel> avg_chk = Buffer<Pixel>(width, height, 0, 32);
ASSERT_TRUE(avg_chk.Init());
for (int i = 0; i < 500; ++i) {
@@ -167,9 +168,9 @@ void AvgPredTest<bitdepth, Pixel>::TestSpeed() {
const int height = 1 << height_pow;
Buffer<Pixel> ref = Buffer<Pixel>(width, height, ref_padding ? 8 : 0);
ASSERT_TRUE(ref.Init());
Buffer<Pixel> pred = Buffer<Pixel>(width, height, 0, 16);
Buffer<Pixel> pred = Buffer<Pixel>(width, height, 0, 32);
ASSERT_TRUE(pred.Init());
Buffer<Pixel> avg = Buffer<Pixel>(width, height, 0, 16);
Buffer<Pixel> avg = Buffer<Pixel>(width, height, 0, 32);
ASSERT_TRUE(avg.Init());
const int bitdepth_mask = (1 << bitdepth) - 1;
for (int h = 0; h < height; ++h) {
@@ -217,6 +218,11 @@ INSTANTIATE_TEST_SUITE_P(SSE2, AvgPredTestLBD,
::testing::Values(&vpx_comp_avg_pred_sse2));
#endif // HAVE_SSE2
#if HAVE_AVX2
INSTANTIATE_TEST_SUITE_P(AVX2, AvgPredTestLBD,
::testing::Values(&vpx_comp_avg_pred_avx2));
#endif // HAVE_AVX2
#if HAVE_NEON
INSTANTIATE_TEST_SUITE_P(NEON, AvgPredTestLBD,
::testing::Values(&vpx_comp_avg_pred_neon));
@@ -260,5 +266,11 @@ INSTANTIATE_TEST_SUITE_P(
::testing::Values(&highbd_wrapper<vpx_highbd_comp_avg_pred_sse2>));
#endif // HAVE_SSE2
#if HAVE_NEON
INSTANTIATE_TEST_SUITE_P(
NEON, AvgPredTestHBD,
::testing::Values(&highbd_wrapper<vpx_highbd_comp_avg_pred_neon>));
#endif // HAVE_NEON
#endif // CONFIG_VP9_HIGHBITDEPTH
} // namespace
+6 -6
@@ -7,7 +7,7 @@
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/util.h"
@@ -22,24 +22,24 @@ class ConfigTest
ConfigTest()
: EncoderTest(GET_PARAM(0)), frame_count_in_(0), frame_count_out_(0),
frame_count_max_(0) {}
virtual ~ConfigTest() {}
~ConfigTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(GET_PARAM(1));
}
virtual void BeginPassHook(unsigned int /*pass*/) {
void BeginPassHook(unsigned int /*pass*/) override {
frame_count_in_ = 0;
frame_count_out_ = 0;
}
virtual void PreEncodeFrameHook(libvpx_test::VideoSource * /*video*/) {
void PreEncodeFrameHook(libvpx_test::VideoSource * /*video*/) override {
++frame_count_in_;
abort_ |= (frame_count_in_ >= frame_count_max_);
}
virtual void FramePktHook(const vpx_codec_cx_pkt_t * /*pkt*/) {
void FramePktHook(const vpx_codec_cx_pkt_t * /*pkt*/) override {
++frame_count_out_;
}
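The recurring change in these test files, replacing `virtual` on overriding members with `override` and empty destructor bodies with `= default`, can be illustrated with a small sketch. The classes below are hypothetical and only imitate the hook pattern; they are not the libvpx `EncoderTest` interface.

```cpp
#include <string>

// Hypothetical classes; they only imitate the hook pattern used by the tests.
class HookBase {
 public:
  virtual ~HookBase() = default;
  virtual std::string FramePktHook(const int * /*pkt*/) { return "base"; }
};

class CountingTest : public HookBase {
 public:
  ~CountingTest() override = default;
  // `override` makes the compiler reject this declaration if its signature
  // stops matching a virtual function in the base class (for example, if
  // `const` were dropped from the parameter).
  std::string FramePktHook(const int * /*pkt*/) override { return "derived"; }
};

// Calls the hook through a base-class reference to show virtual dispatch.
std::string CallThroughBase() {
  CountingTest test;
  HookBase &base = test;
  return base.FramePktHook(nullptr);
}
```

With plain `virtual` on the derived member, a signature mismatch would silently declare a new, unrelated virtual function; with `override` the same mistake becomes a compile error.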
+3 -3
@@ -13,7 +13,7 @@
#include <string.h>
#include <tuple>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vpx_config.h"
#if CONFIG_VP9_ENCODER
@@ -65,14 +65,14 @@ class ConsistencyTestBase : public ::testing::Test {
delete[] ssim_array_;
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
// Handle frames up to 640x480
static const int kDataAlignment = 16;
static const int kDataBufferSize = 640 * 480;
virtual void SetUp() {
void SetUp() override {
source_stride_ = (width_ + 31) & ~31;
reference_stride_ = width_ * 2;
rnd_.Reset(ACMRandom::DeterministicSeed());
+126 -4
@@ -11,7 +11,7 @@
#include <string.h>
#include <tuple>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vp9_rtcd.h"
#include "./vpx_config.h"
@@ -244,7 +244,7 @@ void highbd_filter_block2d_8_c(const uint16_t *src_ptr,
// Vertical pass (transposed intermediate -> dst).
{
uint16_t *src_ptr = intermediate_buffer;
src_ptr = intermediate_buffer;
const int dst_next_row_stride = dst_stride - output_width;
unsigned int i, j;
for (i = 0; i < output_height; ++i) {
@@ -361,7 +361,7 @@ class ConvolveTest : public ::testing::TestWithParam<ConvolveParam> {
#endif
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
static void TearDownTestSuite() {
vpx_free(input_ - 1);
@@ -403,7 +403,7 @@ class ConvolveTest : public ::testing::TestWithParam<ConvolveParam> {
i % kOuterBlockSize >= (BorderLeft() + Width()));
}
virtual void SetUp() {
void SetUp() override {
UUT_ = GET_PARAM(2);
#if CONFIG_VP9_HIGHBITDEPTH
if (UUT_->use_highbd_ != 0) {
@@ -1218,6 +1218,30 @@ WRAP(convolve8_neon, 12)
WRAP(convolve8_avg_neon, 12)
#endif // HAVE_NEON
#if HAVE_SVE
WRAP(convolve8_horiz_sve, 8)
WRAP(convolve8_avg_horiz_sve, 8)
WRAP(convolve8_horiz_sve, 10)
WRAP(convolve8_avg_horiz_sve, 10)
WRAP(convolve8_horiz_sve, 12)
WRAP(convolve8_avg_horiz_sve, 12)
#endif // HAVE_SVE
#if HAVE_SVE2
WRAP(convolve8_sve2, 8)
WRAP(convolve8_avg_sve2, 8)
WRAP(convolve8_vert_sve2, 8)
WRAP(convolve8_avg_vert_sve2, 8)
WRAP(convolve8_sve2, 10)
WRAP(convolve8_avg_sve2, 10)
WRAP(convolve8_vert_sve2, 10)
WRAP(convolve8_avg_vert_sve2, 10)
WRAP(convolve8_sve2, 12)
WRAP(convolve8_avg_sve2, 12)
WRAP(convolve8_vert_sve2, 12)
WRAP(convolve8_avg_vert_sve2, 12)
#endif // HAVE_SVE2
WRAP(convolve_copy_c, 8)
WRAP(convolve_avg_c, 8)
WRAP(convolve8_horiz_c, 8)
@@ -1423,6 +1447,104 @@ INSTANTIATE_TEST_SUITE_P(NEON, ConvolveTest,
::testing::ValuesIn(kArrayConvolve_neon));
#endif // HAVE_NEON
#if HAVE_NEON_DOTPROD
const ConvolveFunctions convolve8_neon_dotprod(
vpx_convolve_copy_c, vpx_convolve_avg_c, vpx_convolve8_horiz_neon_dotprod,
vpx_convolve8_avg_horiz_neon_dotprod, vpx_convolve8_vert_neon_dotprod,
vpx_convolve8_avg_vert_neon_dotprod, vpx_convolve8_neon_dotprod,
vpx_convolve8_avg_neon_dotprod, vpx_scaled_horiz_c, vpx_scaled_avg_horiz_c,
vpx_scaled_vert_c, vpx_scaled_avg_vert_c, vpx_scaled_2d_c,
vpx_scaled_avg_2d_c, 0);
const ConvolveParam kArrayConvolve_neon_dotprod[] = { ALL_SIZES(
convolve8_neon_dotprod) };
INSTANTIATE_TEST_SUITE_P(NEON_DOTPROD, ConvolveTest,
::testing::ValuesIn(kArrayConvolve_neon_dotprod));
#endif // HAVE_NEON_DOTPROD
#if HAVE_SVE
#if CONFIG_VP9_HIGHBITDEPTH
const ConvolveFunctions convolve8_sve(
wrap_convolve_copy_c_8, wrap_convolve_avg_c_8, wrap_convolve8_horiz_sve_8,
wrap_convolve8_avg_horiz_sve_8, wrap_convolve8_vert_c_8,
wrap_convolve8_avg_vert_c_8, wrap_convolve8_c_8, wrap_convolve8_avg_c_8,
wrap_convolve8_horiz_c_8, wrap_convolve8_avg_horiz_c_8,
wrap_convolve8_vert_c_8, wrap_convolve8_avg_vert_c_8, wrap_convolve8_c_8,
wrap_convolve8_avg_c_8, 8);
const ConvolveFunctions convolve10_sve(
wrap_convolve_copy_c_10, wrap_convolve_avg_c_10,
wrap_convolve8_horiz_sve_10, wrap_convolve8_avg_horiz_sve_10,
wrap_convolve8_vert_c_10, wrap_convolve8_avg_vert_c_10, wrap_convolve8_c_10,
wrap_convolve8_avg_c_10, wrap_convolve8_horiz_c_10,
wrap_convolve8_avg_horiz_c_10, wrap_convolve8_vert_c_10,
wrap_convolve8_avg_vert_c_10, wrap_convolve8_c_10, wrap_convolve8_avg_c_10,
10);
const ConvolveFunctions convolve12_sve(
wrap_convolve_copy_c_12, wrap_convolve_avg_c_12,
wrap_convolve8_horiz_sve_12, wrap_convolve8_avg_horiz_sve_12,
wrap_convolve8_vert_c_12, wrap_convolve8_avg_vert_c_12, wrap_convolve8_c_12,
wrap_convolve8_avg_c_12, wrap_convolve8_horiz_c_12,
wrap_convolve8_avg_horiz_c_12, wrap_convolve8_vert_c_12,
wrap_convolve8_avg_vert_c_12, wrap_convolve8_c_12, wrap_convolve8_avg_c_12,
12);
const ConvolveParam kArrayConvolve_sve[] = { ALL_SIZES(convolve8_sve),
ALL_SIZES(convolve10_sve),
ALL_SIZES(convolve12_sve) };
INSTANTIATE_TEST_SUITE_P(SVE, ConvolveTest,
::testing::ValuesIn(kArrayConvolve_sve));
#endif // CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_SVE
#if HAVE_SVE2
#if CONFIG_VP9_HIGHBITDEPTH
const ConvolveFunctions convolve8_sve2(
wrap_convolve_copy_c_8, wrap_convolve_avg_c_8, wrap_convolve8_horiz_c_8,
wrap_convolve8_avg_horiz_c_8, wrap_convolve8_vert_sve2_8,
wrap_convolve8_avg_vert_sve2_8, wrap_convolve8_sve2_8,
wrap_convolve8_avg_sve2_8, wrap_convolve8_horiz_c_8,
wrap_convolve8_avg_horiz_c_8, wrap_convolve8_vert_c_8,
wrap_convolve8_avg_vert_c_8, wrap_convolve8_c_8, wrap_convolve8_avg_c_8, 8);
const ConvolveFunctions convolve10_sve2(
wrap_convolve_copy_c_10, wrap_convolve_avg_c_10, wrap_convolve8_horiz_c_10,
wrap_convolve8_avg_horiz_c_10, wrap_convolve8_vert_sve2_10,
wrap_convolve8_avg_vert_sve2_10, wrap_convolve8_sve2_10,
wrap_convolve8_avg_sve2_10, wrap_convolve8_horiz_c_10,
wrap_convolve8_avg_horiz_c_10, wrap_convolve8_vert_c_10,
wrap_convolve8_avg_vert_c_10, wrap_convolve8_c_10, wrap_convolve8_avg_c_10,
10);
const ConvolveFunctions convolve12_sve2(
wrap_convolve_copy_c_12, wrap_convolve_avg_c_12, wrap_convolve8_horiz_c_12,
wrap_convolve8_avg_horiz_c_12, wrap_convolve8_vert_sve2_12,
wrap_convolve8_avg_vert_sve2_12, wrap_convolve8_sve2_12,
wrap_convolve8_avg_sve2_12, wrap_convolve8_horiz_c_12,
wrap_convolve8_avg_horiz_c_12, wrap_convolve8_vert_c_12,
wrap_convolve8_avg_vert_c_12, wrap_convolve8_c_12, wrap_convolve8_avg_c_12,
12);
const ConvolveParam kArrayConvolve_sve2[] = { ALL_SIZES(convolve8_sve2),
ALL_SIZES(convolve10_sve2),
ALL_SIZES(convolve12_sve2) };
INSTANTIATE_TEST_SUITE_P(SVE2, ConvolveTest,
::testing::ValuesIn(kArrayConvolve_sve2));
#endif // CONFIG_VP9_HIGHBITDEPTH
#endif // HAVE_SVE2
#if HAVE_NEON_I8MM
const ConvolveFunctions convolve8_neon_i8mm(
vpx_convolve_copy_c, vpx_convolve_avg_c, vpx_convolve8_horiz_neon_i8mm,
vpx_convolve8_avg_horiz_neon_i8mm, vpx_convolve8_vert_neon_i8mm,
vpx_convolve8_avg_vert_neon_i8mm, vpx_convolve8_neon_i8mm,
vpx_convolve8_avg_neon_i8mm, vpx_scaled_horiz_c, vpx_scaled_avg_horiz_c,
vpx_scaled_vert_c, vpx_scaled_avg_vert_c, vpx_scaled_2d_c,
vpx_scaled_avg_2d_c, 0);
const ConvolveParam kArrayConvolve_neon_i8mm[] = { ALL_SIZES(
convolve8_neon_i8mm) };
INSTANTIATE_TEST_SUITE_P(NEON_I8MM, ConvolveTest,
::testing::ValuesIn(kArrayConvolve_neon_i8mm));
#endif // HAVE_NEON_I8MM
#if HAVE_DSPR2
const ConvolveFunctions convolve8_dspr2(
vpx_convolve_copy_dspr2, vpx_convolve_avg_dspr2, vpx_convolve8_horiz_dspr2,
+9 -12
@@ -7,7 +7,7 @@
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/i420_video_source.h"
@@ -26,9 +26,9 @@ class CpuSpeedTest
: EncoderTest(GET_PARAM(0)), encoding_mode_(GET_PARAM(1)),
set_cpu_used_(GET_PARAM(2)), min_psnr_(kMaxPSNR),
tune_content_(VP9E_CONTENT_DEFAULT) {}
virtual ~CpuSpeedTest() {}
~CpuSpeedTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(encoding_mode_);
if (encoding_mode_ != ::libvpx_test::kRealTime) {
@@ -40,10 +40,10 @@ class CpuSpeedTest
}
}
virtual void BeginPassHook(unsigned int /*pass*/) { min_psnr_ = kMaxPSNR; }
void BeginPassHook(unsigned int /*pass*/) override { min_psnr_ = kMaxPSNR; }
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
encoder->Control(VP8E_SET_CPUUSED, set_cpu_used_);
encoder->Control(VP9E_SET_TUNE_CONTENT, tune_content_);
@@ -56,7 +56,7 @@ class CpuSpeedTest
}
}
virtual void PSNRPktHook(const vpx_codec_cx_pkt_t *pkt) {
void PSNRPktHook(const vpx_codec_cx_pkt_t *pkt) override {
if (pkt->data.psnr.psnr[0] < min_psnr_) min_psnr_ = pkt->data.psnr.psnr[0];
}
@@ -105,7 +105,7 @@ TEST_P(CpuSpeedTest, TestTuneScreen) {
::libvpx_test::Y4mVideoSource video("screendata.y4m", 0, 25);
cfg_.g_timebase = video.timebase();
cfg_.rc_2pass_vbr_minsection_pct = 5;
cfg_.rc_2pass_vbr_minsection_pct = 2000;
cfg_.rc_2pass_vbr_maxsection_pct = 2000;
cfg_.rc_target_bitrate = 2000;
cfg_.rc_max_quantizer = 63;
cfg_.rc_min_quantizer = 0;
@@ -148,9 +148,6 @@ TEST_P(CpuSpeedTest, TestLowBitrate) {
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
}
VP9_INSTANTIATE_TEST_SUITE(CpuSpeedTest,
::testing::Values(::libvpx_test::kTwoPassGood,
::libvpx_test::kOnePassGood,
::libvpx_test::kRealTime),
VP9_INSTANTIATE_TEST_SUITE(CpuSpeedTest, ONE_PASS_TEST_MODES,
::testing::Range(0, 10));
} // namespace
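The hunks above swap repeated `virtual` declarations for `override` on the test hooks. As a minimal standalone sketch of what that buys (the class names `HookBase` and `SpeedHook` and the helper `CallThroughBase` are hypothetical and not part of libvpx), `override` asks the compiler to reject a derived declaration whose signature no longer matches a virtual function in the base:

```cpp
#include <cassert>

// Hypothetical stand-ins for a test fixture base and a derived test;
// these names are illustrative and do not come from libvpx.
class HookBase {
 public:
  virtual ~HookBase() = default;
  virtual int BeginPassHook(unsigned int /*pass*/) { return 0; }
};

class SpeedHook : public HookBase {
 public:
  ~SpeedHook() override = default;
  // `override` makes the compiler reject this declaration if the signature
  // ever stops matching a virtual function in HookBase.
  int BeginPassHook(unsigned int /*pass*/) override { return 1; }
};

// Calls the hook through the base interface, as a test driver would.
int CallThroughBase(HookBase &hook) { return hook.BeginPassHook(0); }
```

With a plain `virtual` re-declaration, changing the parameter type in the base would silently turn the derived function into an unrelated overload; with `override` the same change is a compile error.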
+14 -8
@@ -9,11 +9,12 @@
*/
#include <cmath>
#include <map>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/i420_video_source.h"
#include "test/util.h"
#include "vpx_config.h"
namespace {
@@ -50,21 +51,21 @@ class CQTest : public ::libvpx_test::EncoderTest,
init_flags_ = VPX_CODEC_USE_PSNR;
}
virtual ~CQTest() {}
~CQTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(libvpx_test::kTwoPassGood);
}
virtual void BeginPassHook(unsigned int /*pass*/) {
void BeginPassHook(unsigned int /*pass*/) override {
file_size_ = 0;
psnr_ = 0.0;
n_frames_ = 0;
}
virtual void PreEncodeFrameHook(libvpx_test::VideoSource *video,
libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(libvpx_test::VideoSource *video,
libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
if (cfg_.rc_end_usage == VPX_CQ) {
encoder->Control(VP8E_SET_CQ_LEVEL, cq_level_);
@@ -73,12 +74,12 @@ class CQTest : public ::libvpx_test::EncoderTest,
}
}
virtual void PSNRPktHook(const vpx_codec_cx_pkt_t *pkt) {
void PSNRPktHook(const vpx_codec_cx_pkt_t *pkt) override {
psnr_ += pow(10.0, pkt->data.psnr.psnr[0] / 10.0);
n_frames_++;
}
virtual void FramePktHook(const vpx_codec_cx_pkt_t *pkt) {
void FramePktHook(const vpx_codec_cx_pkt_t *pkt) override {
file_size_ += pkt->data.frame.sz;
}
@@ -104,6 +105,10 @@ CQTest::BitrateMap CQTest::bitrates_;
TEST_P(CQTest, LinearPSNRIsHigherForCQLevel) {
const vpx_rational timebase = { 33333333, 1000000000 };
#if CONFIG_REALTIME_ONLY
GTEST_SKIP()
<< "Non-zero g_lag_in_frames is unsupported with CONFIG_REALTIME_ONLY";
#else
cfg_.g_timebase = timebase;
cfg_.rc_target_bitrate = kCQTargetBitrate;
cfg_.g_lag_in_frames = 25;
@@ -124,6 +129,7 @@ TEST_P(CQTest, LinearPSNRIsHigherForCQLevel) {
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
const double vbr_psnr_lin = GetLinearPSNROverBitrate();
EXPECT_GE(cq_psnr_lin, vbr_psnr_lin);
#endif // CONFIG_REALTIME_ONLY
}
VP8_INSTANTIATE_TEST_SUITE(CQTest, ::testing::Range(kCQLevelMin, kCQLevelMax,
+162 -21
@@ -13,7 +13,7 @@
#include <string.h>
#include <tuple>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vp9_rtcd.h"
#include "./vpx_dsp_rtcd.h"
@@ -25,8 +25,9 @@
#include "vp9/common/vp9_scan.h"
#include "vpx/vpx_codec.h"
#include "vpx/vpx_integer.h"
#include "vpx_config.h"
#include "vpx_ports/mem.h"
#include "vpx_ports/msvc.h" // for round()
#include "vpx_ports/vpx_timer.h"
using libvpx_test::ACMRandom;
@@ -309,7 +310,7 @@ void idct16x16_10_add_12_sse2(const tran_low_t *in, uint8_t *out, int stride) {
class Trans16x16TestBase {
public:
virtual ~Trans16x16TestBase() {}
virtual ~Trans16x16TestBase() = default;
protected:
virtual void RunFwdTxfm(int16_t *in, tran_low_t *out, int stride) = 0;
@@ -548,12 +549,50 @@ class Trans16x16TestBase {
}
}
void RunSpeedTest() {
ACMRandom rnd(ACMRandom::DeterministicSeed());
const int count_test_block = 10000;
int c_sum_time = 0;
int simd_sum_time = 0;
DECLARE_ALIGNED(32, int16_t, input_block[kNumCoeffs]);
DECLARE_ALIGNED(32, tran_low_t, output_ref_block[kNumCoeffs]);
DECLARE_ALIGNED(32, tran_low_t, output_block[kNumCoeffs]);
// Initialize a test block with input range [-mask_, mask_].
for (int j = 0; j < kNumCoeffs; ++j) {
input_block[j] = (rnd.Rand16() & mask_) - (rnd.Rand16() & mask_);
}
vpx_usec_timer timer_c;
vpx_usec_timer_start(&timer_c);
for (int i = 0; i < count_test_block; ++i) {
vpx_fdct16x16_c(input_block, output_ref_block, pitch_);
}
vpx_usec_timer_mark(&timer_c);
c_sum_time += static_cast<int>(vpx_usec_timer_elapsed(&timer_c));
vpx_usec_timer timer_mod;
vpx_usec_timer_start(&timer_mod);
for (int i = 0; i < count_test_block; ++i) {
RunFwdTxfm(input_block, output_block, pitch_);
}
vpx_usec_timer_mark(&timer_mod);
simd_sum_time += static_cast<int>(vpx_usec_timer_elapsed(&timer_mod));
printf(
"c_time = %d \t simd_time = %d \t Gain = %4.2f \n", c_sum_time,
simd_sum_time,
(static_cast<float>(c_sum_time) / static_cast<float>(simd_sum_time)));
}
void CompareInvReference(IdctFunc ref_txfm, int thresh) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
const int count_test_block = 10000;
const int eob = 10;
const int16_t *scan = vp9_default_scan_orders[TX_16X16].scan;
DECLARE_ALIGNED(16, tran_low_t, coeff[kNumCoeffs]);
DECLARE_ALIGNED(32, tran_low_t, coeff[kNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, dst[kNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, ref[kNumCoeffs]);
#if CONFIG_VP9_HIGHBITDEPTH
@@ -604,6 +643,80 @@ class Trans16x16TestBase {
}
}
void RunInvTrans16x16SpeedTest(IdctFunc ref_txfm, int thresh) {
ACMRandom rnd(ACMRandom::DeterministicSeed());
const int count_test_block = 10000;
const int eob = 10;
const int16_t *scan = vp9_default_scan_orders[TX_16X16].scan;
int64_t c_sum_time = 0;
int64_t simd_sum_time = 0;
DECLARE_ALIGNED(32, tran_low_t, coeff[kNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, dst[kNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, ref[kNumCoeffs]);
#if CONFIG_VP9_HIGHBITDEPTH
DECLARE_ALIGNED(16, uint16_t, dst16[kNumCoeffs]);
DECLARE_ALIGNED(16, uint16_t, ref16[kNumCoeffs]);
#endif // CONFIG_VP9_HIGHBITDEPTH
for (int j = 0; j < kNumCoeffs; ++j) {
if (j < eob) {
// Random values less than the threshold, either positive or negative
coeff[scan[j]] = rnd(thresh);
} else {
coeff[scan[j]] = 0;
}
if (bit_depth_ == VPX_BITS_8) {
dst[j] = 0;
ref[j] = 0;
#if CONFIG_VP9_HIGHBITDEPTH
} else {
dst16[j] = 0;
ref16[j] = 0;
#endif // CONFIG_VP9_HIGHBITDEPTH
}
}
if (bit_depth_ == VPX_BITS_8) {
vpx_usec_timer timer_c;
vpx_usec_timer_start(&timer_c);
for (int i = 0; i < count_test_block; ++i) {
ref_txfm(coeff, ref, pitch_);
}
vpx_usec_timer_mark(&timer_c);
c_sum_time += vpx_usec_timer_elapsed(&timer_c);
vpx_usec_timer timer_mod;
vpx_usec_timer_start(&timer_mod);
for (int i = 0; i < count_test_block; ++i) {
RunInvTxfm(coeff, dst, pitch_);
}
vpx_usec_timer_mark(&timer_mod);
simd_sum_time += vpx_usec_timer_elapsed(&timer_mod);
} else {
#if CONFIG_VP9_HIGHBITDEPTH
vpx_usec_timer timer_c;
vpx_usec_timer_start(&timer_c);
for (int i = 0; i < count_test_block; ++i) {
ref_txfm(coeff, CAST_TO_BYTEPTR(ref16), pitch_);
}
vpx_usec_timer_mark(&timer_c);
c_sum_time += vpx_usec_timer_elapsed(&timer_c);
vpx_usec_timer timer_mod;
vpx_usec_timer_start(&timer_mod);
for (int i = 0; i < count_test_block; ++i) {
RunInvTxfm(coeff, CAST_TO_BYTEPTR(dst16), pitch_);
}
vpx_usec_timer_mark(&timer_mod);
simd_sum_time += vpx_usec_timer_elapsed(&timer_mod);
#endif // CONFIG_VP9_HIGHBITDEPTH
}
printf(
"c_time = %" PRId64 " \t simd_time = %" PRId64 " \t Gain = %4.2f \n",
c_sum_time, simd_sum_time,
(static_cast<float>(c_sum_time) / static_cast<float>(simd_sum_time)));
}
int pitch_;
int tx_type_;
vpx_bit_depth_t bit_depth_;
@@ -615,9 +728,9 @@ class Trans16x16TestBase {
class Trans16x16DCT : public Trans16x16TestBase,
public ::testing::TestWithParam<Dct16x16Param> {
public:
virtual ~Trans16x16DCT() {}
~Trans16x16DCT() override = default;
virtual void SetUp() {
void SetUp() override {
fwd_txfm_ = GET_PARAM(0);
inv_txfm_ = GET_PARAM(1);
tx_type_ = GET_PARAM(2);
@@ -636,13 +749,13 @@ class Trans16x16DCT : public Trans16x16TestBase,
inv_txfm_ref = idct16x16_ref;
#endif
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
void RunFwdTxfm(int16_t *in, tran_low_t *out, int stride) {
void RunFwdTxfm(int16_t *in, tran_low_t *out, int stride) override {
fwd_txfm_(in, out, stride);
}
void RunInvTxfm(tran_low_t *out, uint8_t *dst, int stride) {
void RunInvTxfm(tran_low_t *out, uint8_t *dst, int stride) override {
inv_txfm_(out, dst, stride);
}
@@ -664,12 +777,14 @@ TEST_P(Trans16x16DCT, QuantCheck) {
TEST_P(Trans16x16DCT, InvAccuracyCheck) { RunInvAccuracyCheck(); }
TEST_P(Trans16x16DCT, DISABLED_Speed) { RunSpeedTest(); }
class Trans16x16HT : public Trans16x16TestBase,
public ::testing::TestWithParam<Ht16x16Param> {
public:
virtual ~Trans16x16HT() {}
~Trans16x16HT() override = default;
virtual void SetUp() {
void SetUp() override {
fwd_txfm_ = GET_PARAM(0);
inv_txfm_ = GET_PARAM(1);
tx_type_ = GET_PARAM(2);
@@ -688,13 +803,13 @@ class Trans16x16HT : public Trans16x16TestBase,
inv_txfm_ref = iht16x16_ref;
#endif
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
void RunFwdTxfm(int16_t *in, tran_low_t *out, int stride) {
void RunFwdTxfm(int16_t *in, tran_low_t *out, int stride) override {
fwd_txfm_(in, out, stride, tx_type_);
}
void RunInvTxfm(tran_low_t *out, uint8_t *dst, int stride) {
void RunInvTxfm(tran_low_t *out, uint8_t *dst, int stride) override {
inv_txfm_(out, dst, stride, tx_type_);
}
@@ -714,13 +829,12 @@ TEST_P(Trans16x16HT, QuantCheck) {
RunQuantCheck(429, 729);
}
#if HAVE_SSE2 && CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
class InvTrans16x16DCT : public Trans16x16TestBase,
public ::testing::TestWithParam<Idct16x16Param> {
public:
virtual ~InvTrans16x16DCT() {}
~InvTrans16x16DCT() override = default;
virtual void SetUp() {
void SetUp() override {
ref_txfm_ = GET_PARAM(0);
inv_txfm_ = GET_PARAM(1);
thresh_ = GET_PARAM(2);
@@ -728,11 +842,12 @@ class InvTrans16x16DCT : public Trans16x16TestBase,
pitch_ = 16;
mask_ = (1 << bit_depth_) - 1;
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
void RunFwdTxfm(int16_t * /*in*/, tran_low_t * /*out*/, int /*stride*/) {}
void RunInvTxfm(tran_low_t *out, uint8_t *dst, int stride) {
void RunFwdTxfm(int16_t * /*in*/, tran_low_t * /*out*/,
int /*stride*/) override {}
void RunInvTxfm(tran_low_t *out, uint8_t *dst, int stride) override {
inv_txfm_(out, dst, stride);
}
@@ -745,7 +860,10 @@ GTEST_ALLOW_UNINSTANTIATED_PARAMETERIZED_TEST(InvTrans16x16DCT);
TEST_P(InvTrans16x16DCT, CompareReference) {
CompareInvReference(ref_txfm_, thresh_);
}
#endif // HAVE_SSE2 && CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
TEST_P(InvTrans16x16DCT, DISABLED_Speed) {
RunInvTrans16x16SpeedTest(ref_txfm_, thresh_);
}
using std::make_tuple;
@@ -787,6 +905,12 @@ INSTANTIATE_TEST_SUITE_P(
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 1, VPX_BITS_8),
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 2, VPX_BITS_8),
make_tuple(&vp9_fht16x16_c, &vp9_iht16x16_256_add_c, 3, VPX_BITS_8)));
INSTANTIATE_TEST_SUITE_P(C, InvTrans16x16DCT,
::testing::Values(make_tuple(&vpx_idct16x16_256_add_c,
&vpx_idct16x16_256_add_c,
6225, VPX_BITS_8)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#if HAVE_NEON && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
@@ -821,8 +945,25 @@ INSTANTIATE_TEST_SUITE_P(
2, VPX_BITS_8),
make_tuple(&vp9_fht16x16_sse2, &vp9_iht16x16_256_add_sse2,
3, VPX_BITS_8)));
INSTANTIATE_TEST_SUITE_P(SSE2, InvTrans16x16DCT,
::testing::Values(make_tuple(
&vpx_idct16x16_256_add_c,
&vpx_idct16x16_256_add_sse2, 6225, VPX_BITS_8)));
#endif // HAVE_SSE2 && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#if HAVE_AVX2 && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_SUITE_P(
AVX2, Trans16x16DCT,
::testing::Values(make_tuple(&vpx_fdct16x16_avx2,
&vpx_idct16x16_256_add_sse2, 0, VPX_BITS_8)));
INSTANTIATE_TEST_SUITE_P(AVX2, InvTrans16x16DCT,
::testing::Values(make_tuple(
&vpx_idct16x16_256_add_c,
&vpx_idct16x16_256_add_avx2, 6225, VPX_BITS_8)));
#endif // HAVE_AVX2 && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#if HAVE_SSE2 && CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
INSTANTIATE_TEST_SUITE_P(
SSE2, Trans16x16DCT,
+202 -6
@@ -13,7 +13,7 @@
#include <string.h>
#include <tuple>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vp9_rtcd.h"
#include "./vpx_config.h"
@@ -24,10 +24,11 @@
#include "test/register_state_check.h"
#include "test/util.h"
#include "vp9/common/vp9_entropy.h"
#include "vp9/common/vp9_scan.h"
#include "vpx/vpx_codec.h"
#include "vpx/vpx_integer.h"
#include "vpx_ports/mem.h"
#include "vpx_ports/msvc.h" // for round()
#include "vpx_ports/vpx_timer.h"
using libvpx_test::ACMRandom;
@@ -71,6 +72,9 @@ typedef void (*InvTxfmFunc)(const tran_low_t *in, uint8_t *out, int stride);
typedef std::tuple<FwdTxfmFunc, InvTxfmFunc, int, vpx_bit_depth_t>
Trans32x32Param;
typedef std::tuple<InvTxfmFunc, InvTxfmFunc, int, vpx_bit_depth_t, int, int>
InvTrans32x32Param;
#if CONFIG_VP9_HIGHBITDEPTH
void idct32x32_10(const tran_low_t *in, uint8_t *out, int stride) {
vpx_highbd_idct32x32_1024_add_c(in, CAST_TO_SHORTPTR(out), stride, 10);
@@ -84,8 +88,8 @@ void idct32x32_12(const tran_low_t *in, uint8_t *out, int stride) {
class Trans32x32Test : public AbstractBench,
public ::testing::TestWithParam<Trans32x32Param> {
public:
virtual ~Trans32x32Test() {}
virtual void SetUp() {
~Trans32x32Test() override = default;
void SetUp() override {
fwd_txfm_ = GET_PARAM(0);
inv_txfm_ = GET_PARAM(1);
version_ = GET_PARAM(2); // 0: high precision forward transform
@@ -94,7 +98,7 @@ class Trans32x32Test : public AbstractBench,
mask_ = (1 << bit_depth_) - 1;
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
int version_;
@@ -105,7 +109,7 @@ class Trans32x32Test : public AbstractBench,
int16_t *bench_in_;
tran_low_t *bench_out_;
virtual void Run();
void Run() override;
};
void Trans32x32Test::Run() { fwd_txfm_(bench_in_, bench_out_, 32); }
@@ -314,6 +318,174 @@ TEST_P(Trans32x32Test, InverseAccuracy) {
}
}
class InvTrans32x32Test : public ::testing::TestWithParam<InvTrans32x32Param> {
public:
~InvTrans32x32Test() override = default;
void SetUp() override {
ref_txfm_ = GET_PARAM(0);
inv_txfm_ = GET_PARAM(1);
version_ = GET_PARAM(2); // 0: high precision forward transform
// 1: low precision version for rd loop
bit_depth_ = GET_PARAM(3);
eob_ = GET_PARAM(4);
thresh_ = GET_PARAM(5);
mask_ = (1 << bit_depth_) - 1;
pitch_ = 32;
}
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
void RunRefTxfm(tran_low_t *out, uint8_t *dst, int stride) {
ref_txfm_(out, dst, stride);
}
void RunInvTxfm(tran_low_t *out, uint8_t *dst, int stride) {
inv_txfm_(out, dst, stride);
}
int version_;
vpx_bit_depth_t bit_depth_;
int mask_;
int eob_;
int thresh_;
InvTxfmFunc ref_txfm_;
InvTxfmFunc inv_txfm_;
int pitch_;
void RunInvTrans32x32SpeedTest() {
ACMRandom rnd(ACMRandom::DeterministicSeed());
const int count_test_block = 10000;
int64_t c_sum_time = 0;
int64_t simd_sum_time = 0;
const int16_t *scan = vp9_default_scan_orders[TX_32X32].scan;
DECLARE_ALIGNED(32, tran_low_t, coeff[kNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, dst[kNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, ref[kNumCoeffs]);
#if CONFIG_VP9_HIGHBITDEPTH
DECLARE_ALIGNED(16, uint16_t, dst16[kNumCoeffs]);
DECLARE_ALIGNED(16, uint16_t, ref16[kNumCoeffs]);
#endif // CONFIG_VP9_HIGHBITDEPTH
for (int j = 0; j < kNumCoeffs; ++j) {
if (j < eob_) {
// Random values less than the threshold, either positive or negative
coeff[scan[j]] = rnd(thresh_);
} else {
coeff[scan[j]] = 0;
}
if (bit_depth_ == VPX_BITS_8) {
dst[j] = 0;
ref[j] = 0;
#if CONFIG_VP9_HIGHBITDEPTH
} else {
dst16[j] = 0;
ref16[j] = 0;
#endif // CONFIG_VP9_HIGHBITDEPTH
}
}
if (bit_depth_ == VPX_BITS_8) {
vpx_usec_timer timer_c;
vpx_usec_timer_start(&timer_c);
for (int i = 0; i < count_test_block; ++i) {
RunRefTxfm(coeff, ref, pitch_);
}
vpx_usec_timer_mark(&timer_c);
c_sum_time += vpx_usec_timer_elapsed(&timer_c);
vpx_usec_timer timer_mod;
vpx_usec_timer_start(&timer_mod);
for (int i = 0; i < count_test_block; ++i) {
RunInvTxfm(coeff, dst, pitch_);
}
vpx_usec_timer_mark(&timer_mod);
simd_sum_time += vpx_usec_timer_elapsed(&timer_mod);
} else {
#if CONFIG_VP9_HIGHBITDEPTH
vpx_usec_timer timer_c;
vpx_usec_timer_start(&timer_c);
for (int i = 0; i < count_test_block; ++i) {
RunRefTxfm(coeff, CAST_TO_BYTEPTR(ref16), pitch_);
}
vpx_usec_timer_mark(&timer_c);
c_sum_time += vpx_usec_timer_elapsed(&timer_c);
vpx_usec_timer timer_mod;
vpx_usec_timer_start(&timer_mod);
for (int i = 0; i < count_test_block; ++i) {
RunInvTxfm(coeff, CAST_TO_BYTEPTR(dst16), pitch_);
}
vpx_usec_timer_mark(&timer_mod);
simd_sum_time += vpx_usec_timer_elapsed(&timer_mod);
#endif // CONFIG_VP9_HIGHBITDEPTH
}
printf(
"c_time = %" PRId64 " \t simd_time = %" PRId64 " \t Gain = %4.2f \n",
c_sum_time, simd_sum_time,
(static_cast<float>(c_sum_time) / static_cast<float>(simd_sum_time)));
}
void CompareInvReference32x32() {
ACMRandom rnd(ACMRandom::DeterministicSeed());
const int count_test_block = 10000;
const int eob = 31;
const int16_t *scan = vp9_default_scan_orders[TX_32X32].scan;
DECLARE_ALIGNED(32, tran_low_t, coeff[kNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, dst[kNumCoeffs]);
DECLARE_ALIGNED(16, uint8_t, ref[kNumCoeffs]);
#if CONFIG_VP9_HIGHBITDEPTH
DECLARE_ALIGNED(16, uint16_t, dst16[kNumCoeffs]);
DECLARE_ALIGNED(16, uint16_t, ref16[kNumCoeffs]);
#endif // CONFIG_VP9_HIGHBITDEPTH
for (int i = 0; i < count_test_block; ++i) {
for (int j = 0; j < kNumCoeffs; ++j) {
if (j < eob) {
coeff[scan[j]] = rnd.Rand8Extremes();
} else {
coeff[scan[j]] = 0;
}
if (bit_depth_ == VPX_BITS_8) {
dst[j] = 0;
ref[j] = 0;
#if CONFIG_VP9_HIGHBITDEPTH
} else {
dst16[j] = 0;
ref16[j] = 0;
#endif // CONFIG_VP9_HIGHBITDEPTH
}
}
if (bit_depth_ == VPX_BITS_8) {
RunRefTxfm(coeff, ref, pitch_);
RunInvTxfm(coeff, dst, pitch_);
} else {
#if CONFIG_VP9_HIGHBITDEPTH
RunRefTxfm(coeff, CAST_TO_BYTEPTR(ref16), pitch_);
ASM_REGISTER_STATE_CHECK(
RunInvTxfm(coeff, CAST_TO_BYTEPTR(dst16), pitch_));
#endif // CONFIG_VP9_HIGHBITDEPTH
}
for (int j = 0; j < kNumCoeffs; ++j) {
#if CONFIG_VP9_HIGHBITDEPTH
const uint32_t diff =
bit_depth_ == VPX_BITS_8 ? dst[j] - ref[j] : dst16[j] - ref16[j];
#else
const uint32_t diff = dst[j] - ref[j];
#endif // CONFIG_VP9_HIGHBITDEPTH
const uint32_t error = diff * diff;
EXPECT_EQ(0u, error) << "Error: 32x32 IDCT Comparison has error "
<< error << " at index " << j;
}
}
}
};
GTEST_ALLOW_UNINSTANTIATED_PARAMETERIZED_TEST(InvTrans32x32Test);
TEST_P(InvTrans32x32Test, DISABLED_Speed) { RunInvTrans32x32SpeedTest(); }
TEST_P(InvTrans32x32Test, CompareReference) { CompareInvReference32x32(); }
using std::make_tuple;
#if CONFIG_VP9_HIGHBITDEPTH
@@ -334,6 +506,14 @@ INSTANTIATE_TEST_SUITE_P(
VPX_BITS_8),
make_tuple(&vpx_fdct32x32_rd_c, &vpx_idct32x32_1024_add_c,
1, VPX_BITS_8)));
INSTANTIATE_TEST_SUITE_P(
C, InvTrans32x32Test,
::testing::Values(
(make_tuple(&vpx_idct32x32_1024_add_c, &vpx_idct32x32_1024_add_c, 0,
VPX_BITS_8, 32, 6225)),
make_tuple(&vpx_idct32x32_135_add_c, &vpx_idct32x32_135_add_c, 0,
VPX_BITS_8, 16, 6255)));
#endif // CONFIG_VP9_HIGHBITDEPTH
#if HAVE_NEON && !CONFIG_EMULATE_HARDWARE
@@ -352,6 +532,14 @@ INSTANTIATE_TEST_SUITE_P(
&vpx_idct32x32_1024_add_sse2, 0, VPX_BITS_8),
make_tuple(&vpx_fdct32x32_rd_sse2,
&vpx_idct32x32_1024_add_sse2, 1, VPX_BITS_8)));
INSTANTIATE_TEST_SUITE_P(
SSE2, InvTrans32x32Test,
::testing::Values(
(make_tuple(&vpx_idct32x32_1024_add_c, &vpx_idct32x32_1024_add_sse2, 0,
VPX_BITS_8, 32, 6225)),
make_tuple(&vpx_idct32x32_135_add_c, &vpx_idct32x32_135_add_sse2, 0,
VPX_BITS_8, 16, 6225)));
#endif // HAVE_SSE2 && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#if HAVE_SSE2 && CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
@@ -377,6 +565,14 @@ INSTANTIATE_TEST_SUITE_P(
&vpx_idct32x32_1024_add_sse2, 0, VPX_BITS_8),
make_tuple(&vpx_fdct32x32_rd_avx2,
&vpx_idct32x32_1024_add_sse2, 1, VPX_BITS_8)));
INSTANTIATE_TEST_SUITE_P(
AVX2, InvTrans32x32Test,
::testing::Values(
(make_tuple(&vpx_idct32x32_1024_add_c, &vpx_idct32x32_1024_add_avx2, 0,
VPX_BITS_8, 32, 6225)),
make_tuple(&vpx_idct32x32_135_add_c, &vpx_idct32x32_135_add_avx2, 0,
VPX_BITS_8, 16, 6225)));
#endif // HAVE_AVX2 && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
#if HAVE_MSA && !CONFIG_VP9_HIGHBITDEPTH && !CONFIG_EMULATE_HARDWARE
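The speed tests added above time a reference function and an optimized one over the same iteration count, then print the ratio as "Gain". A minimal sketch of that pattern follows, assuming `std::chrono` in place of `vpx_usec_timer`; the helpers `TimeLoopUsec` and `Gain` are hypothetical names for illustration, not libvpx functions:

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: runs fn `iterations` times and returns the elapsed
// time in microseconds. Uses std::chrono in place of vpx_usec_timer.
template <typename Fn>
int64_t TimeLoopUsec(Fn fn, int iterations) {
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i) fn();
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::microseconds>(end - start)
      .count();
}

// Ratio of reference time to optimized time; values above 1.0 mean the
// optimized version was faster. Returns 0.0 when the optimized time is not
// positive, a case the printf in the tests above does not guard against.
double Gain(int64_t ref_usec, int64_t opt_usec) {
  if (opt_usec <= 0) return 0.0;
  return static_cast<double>(ref_usec) / static_cast<double>(opt_usec);
}
```

Accumulating into `int64_t`, as the inverse-transform speed tests do, avoids the narrowing `static_cast<int>` used in the forward 16x16 speed test.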
+3 -2
@@ -14,7 +14,7 @@
#include <limits>
#include <tuple>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vpx_dsp_rtcd.h"
#include "test/acm_random.h"
@@ -22,6 +22,7 @@
#include "test/clear_system_state.h"
#include "test/register_state_check.h"
#include "test/util.h"
#include "vpx_config.h"
#include "vpx/vpx_codec.h"
#include "vpx/vpx_integer.h"
#include "vpx_dsp/vpx_dsp_common.h"
@@ -67,7 +68,7 @@ class PartialFdctTest : public ::testing::TestWithParam<PartialFdctParam> {
bit_depth_ = GET_PARAM(2);
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
void RunTest() {
+8 -11
@@ -13,7 +13,7 @@
#include <string.h>
#include <tuple>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vp9_rtcd.h"
#include "./vpx_dsp_rtcd.h"
@@ -23,6 +23,7 @@
#include "test/register_state_check.h"
#include "test/util.h"
#include "vp9/common/vp9_entropy.h"
#include "vpx_config.h"
#include "vpx/vpx_codec.h"
#include "vpx/vpx_integer.h"
#include "vpx_ports/mem.h"
@@ -134,7 +135,7 @@ void fwht_ref(const Buffer<int16_t> &in, Buffer<tran_low_t> *out, int size,
class TransTestBase : public ::testing::TestWithParam<DctParam> {
public:
virtual void SetUp() {
void SetUp() override {
rnd_.Reset(ACMRandom::DeterministicSeed());
const int idx = GET_PARAM(0);
const FuncInfo *func_info = &(GET_PARAM(1)[idx]);
@@ -166,7 +167,7 @@ class TransTestBase : public ::testing::TestWithParam<DctParam> {
ASSERT_NE(dst_, nullptr);
}
virtual void TearDown() {
void TearDown() override {
vpx_free(src_);
src_ = nullptr;
vpx_free(dst_);
@@ -358,14 +359,6 @@ class TransTestBase : public ::testing::TestWithParam<DctParam> {
ASSERT_TRUE(in.Init());
Buffer<tran_low_t> coeff = Buffer<tran_low_t>(size_, size_, 0, 16);
ASSERT_TRUE(coeff.Init());
Buffer<uint8_t> dst = Buffer<uint8_t>(size_, size_, 0, 16);
ASSERT_TRUE(dst.Init());
Buffer<uint8_t> src = Buffer<uint8_t>(size_, size_, 0);
ASSERT_TRUE(src.Init());
Buffer<uint16_t> dst16 = Buffer<uint16_t>(size_, size_, 0, 16);
ASSERT_TRUE(dst16.Init());
Buffer<uint16_t> src16 = Buffer<uint16_t>(size_, size_, 0);
ASSERT_TRUE(src16.Init());
for (int i = 0; i < count_test_block; ++i) {
InitMem();
@@ -671,8 +664,12 @@ static const FuncInfo ht_neon_func_info[] = {
4, 2 },
{ &vp9_highbd_fht8x8_c, &highbd_iht_wrapper<vp9_highbd_iht8x8_64_add_neon>, 8,
2 },
{ &vp9_highbd_fht8x8_neon, &highbd_iht_wrapper<vp9_highbd_iht8x8_64_add_neon>,
8, 2 },
{ &vp9_highbd_fht16x16_c,
&highbd_iht_wrapper<vp9_highbd_iht16x16_256_add_neon>, 16, 2 },
{ &vp9_highbd_fht16x16_neon,
&highbd_iht_wrapper<vp9_highbd_iht16x16_256_add_neon>, 16, 2 },
#endif
{ &vp9_fht4x4_c, &iht_wrapper<vp9_iht4x4_16_add_neon>, 4, 1 },
{ &vp9_fht4x4_neon, &iht_wrapper<vp9_iht4x4_16_add_neon>, 4, 1 },
+4 -4
@@ -8,7 +8,7 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vpx_config.h"
#include "test/ivf_video_source.h"
@@ -20,7 +20,7 @@ namespace {
#define NELEMENTS(x) static_cast<int>(sizeof(x) / sizeof(x[0]))
TEST(DecodeAPI, InvalidParams) {
static const vpx_codec_iface_t *kCodecs[] = {
static vpx_codec_iface_t *kCodecs[] = {
#if CONFIG_VP8_DECODER
&vpx_codec_vp8_dx_algo,
#endif
@@ -120,7 +120,7 @@ void TestVp9Controls(vpx_codec_ctx_t *dec) {
}
TEST(DecodeAPI, Vp9InvalidDecode) {
const vpx_codec_iface_t *const codec = &vpx_codec_vp9_dx_algo;
vpx_codec_iface_t *const codec = &vpx_codec_vp9_dx_algo;
const char filename[] =
"invalid-vp90-2-00-quantizer-00.webm.ivf.s5861_r01-05_b6-.v2.ivf";
libvpx_test::IVFVideoSource video(filename);
@@ -147,7 +147,7 @@ TEST(DecodeAPI, Vp9InvalidDecode) {
void TestPeekInfo(const uint8_t *const data, uint32_t data_sz,
uint32_t peek_size) {
const vpx_codec_iface_t *const codec = &vpx_codec_vp9_dx_algo;
vpx_codec_iface_t *const codec = &vpx_codec_vp9_dx_algo;
// Verify behavior of vpx_codec_decode. vpx_codec_decode doesn't even get
// to decoder_peek_si_internal on frames of size < 8.
if (data_sz >= 8) {
+13 -12
@@ -10,12 +10,13 @@
#include <tuple>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/util.h"
#include "test/i420_video_source.h"
#include "vpx_config.h"
#include "vpx_mem/vpx_mem.h"
namespace {
@@ -28,9 +29,9 @@ class DecodeCorruptedFrameTest
DecodeCorruptedFrameTest() : EncoderTest(GET_PARAM(0)) {}
protected:
virtual ~DecodeCorruptedFrameTest() {}
~DecodeCorruptedFrameTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(::libvpx_test::kRealTime);
cfg_.g_lag_in_frames = 0;
@@ -44,16 +45,16 @@ class DecodeCorruptedFrameTest
dec_cfg_.threads = 1;
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) encoder->Control(VP8E_SET_CPUUSED, 7);
}
virtual void MismatchHook(const vpx_image_t * /*img1*/,
const vpx_image_t * /*img2*/) {}
void MismatchHook(const vpx_image_t * /*img1*/,
const vpx_image_t * /*img2*/) override {}
virtual const vpx_codec_cx_pkt_t *MutateEncoderOutputHook(
const vpx_codec_cx_pkt_t *pkt) {
const vpx_codec_cx_pkt_t *MutateEncoderOutputHook(
const vpx_codec_cx_pkt_t *pkt) override {
// Don't edit frame packet on key frame.
if (pkt->data.frame.flags & VPX_FRAME_IS_KEY) return pkt;
if (pkt->kind != VPX_CODEC_CX_FRAME_PKT) return pkt;
@@ -66,9 +67,9 @@ class DecodeCorruptedFrameTest
return &modified_pkt_;
}
virtual bool HandleDecodeResult(const vpx_codec_err_t res_dec,
const libvpx_test::VideoSource & /*video*/,
libvpx_test::Decoder *decoder) {
bool HandleDecodeResult(const vpx_codec_err_t res_dec,
const libvpx_test::VideoSource & /*video*/,
libvpx_test::Decoder *decoder) override {
EXPECT_NE(res_dec, VPX_CODEC_MEM_ERROR) << decoder->DecodeError();
return VPX_CODEC_MEM_ERROR != res_dec;
}
+12 -12
@@ -19,9 +19,9 @@
#include "test/md5_helper.h"
#include "test/util.h"
#include "test/webm_video_source.h"
#include "vpx/vpx_codec.h"
#include "vpx_ports/vpx_timer.h"
#include "./ivfenc.h"
#include "./vpx_version.h"
using std::make_tuple;
@@ -98,7 +98,7 @@ TEST_P(DecodePerfTest, PerfTest) {
printf("{\n");
printf("\t\"type\" : \"decode_perf_test\",\n");
printf("\t\"version\" : \"%s\",\n", VERSION_STRING_NOSP);
printf("\t\"version\" : \"%s\",\n", vpx_codec_version_str());
printf("\t\"videoName\" : \"%s\",\n", video_name);
printf("\t\"threadCount\" : %u,\n", threads);
printf("\t\"decodeTimeSecs\" : %f,\n", elapsed_secs);
@@ -116,11 +116,11 @@ class VP9NewEncodeDecodePerfTest
protected:
VP9NewEncodeDecodePerfTest()
: EncoderTest(GET_PARAM(0)), encoding_mode_(GET_PARAM(1)), speed_(0),
outfile_(0), out_frames_(0) {}
outfile_(nullptr), out_frames_(0) {}
virtual ~VP9NewEncodeDecodePerfTest() {}
~VP9NewEncodeDecodePerfTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(encoding_mode_);
@@ -137,8 +137,8 @@ class VP9NewEncodeDecodePerfTest
cfg_.rc_end_usage = VPX_VBR;
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
encoder->Control(VP8E_SET_CPUUSED, speed_);
encoder->Control(VP9E_SET_FRAME_PARALLEL_DECODING, 1);
@@ -146,14 +146,14 @@ class VP9NewEncodeDecodePerfTest
}
}
virtual void BeginPassHook(unsigned int /*pass*/) {
void BeginPassHook(unsigned int /*pass*/) override {
const std::string data_path = getenv("LIBVPX_TEST_DATA_PATH");
const std::string path_to_source = data_path + "/" + kNewEncodeOutputFile;
outfile_ = fopen(path_to_source.c_str(), "wb");
ASSERT_NE(outfile_, nullptr);
}
virtual void EndPassHook() {
void EndPassHook() override {
if (outfile_ != nullptr) {
if (!fseek(outfile_, 0, SEEK_SET)) {
ivf_write_file_header(outfile_, &cfg_, VP9_FOURCC, out_frames_);
@@ -163,7 +163,7 @@ class VP9NewEncodeDecodePerfTest
}
}
virtual void FramePktHook(const vpx_codec_cx_pkt_t *pkt) {
void FramePktHook(const vpx_codec_cx_pkt_t *pkt) override {
++out_frames_;
// Write initial file header if first frame.
@@ -177,7 +177,7 @@ class VP9NewEncodeDecodePerfTest
pkt->data.frame.sz);
}
virtual bool DoDecode() const { return false; }
bool DoDecode() const override { return false; }
void set_speed(unsigned int speed) { speed_ = speed; }
@@ -249,7 +249,7 @@ TEST_P(VP9NewEncodeDecodePerfTest, PerfTest) {
printf("{\n");
printf("\t\"type\" : \"decode_perf_test\",\n");
printf("\t\"version\" : \"%s\",\n", VERSION_STRING_NOSP);
printf("\t\"version\" : \"%s\",\n", vpx_codec_version_str());
printf("\t\"videoName\" : \"%s\",\n", kNewEncodeOutputFile);
printf("\t\"threadCount\" : %u,\n", threads);
printf("\t\"decodeTimeSecs\" : %f,\n", elapsed_secs);
+5 -6
@@ -25,17 +25,16 @@ class DecodeSvcTest : public ::libvpx_test::DecoderTest,
public ::libvpx_test::CodecTestWithParam<const char *> {
protected:
DecodeSvcTest() : DecoderTest(GET_PARAM(::libvpx_test::kCodecFactoryParam)) {}
virtual ~DecodeSvcTest() {}
~DecodeSvcTest() override = default;
virtual void PreDecodeFrameHook(
const libvpx_test::CompressedVideoSource &video,
libvpx_test::Decoder *decoder) {
void PreDecodeFrameHook(const libvpx_test::CompressedVideoSource &video,
libvpx_test::Decoder *decoder) override {
if (video.frame_number() == 0)
decoder->Control(VP9_DECODE_SVC_SPATIAL_LAYER, spatial_layer_);
}
virtual void DecompressedFrameHook(const vpx_image_t &img,
const unsigned int frame_number) {
void DecompressedFrameHook(const vpx_image_t &img,
const unsigned int frame_number) override {
ASSERT_EQ(img.d_w, width_);
ASSERT_EQ(img.d_h, height_);
total_frames_ = frame_number;
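Most hunks in these test files replace `virtual` on overriding member functions with `override`. A minimal standalone sketch of why that matters, assuming hypothetical `HookBase` and `SkipDecode` types that are not part of libvpx:

```cpp
// Hypothetical base class standing in for a test driver with a virtual hook.
struct HookBase {
  virtual ~HookBase() = default;
  // Default behaviour: decode every frame.
  virtual bool DoDecode() const { return true; }
};

// Hypothetical derived test that turns decoding off.
struct SkipDecode : HookBase {
  ~SkipDecode() override = default;
  // `override` makes the compiler check that this matches a base-class
  // virtual function exactly, including the trailing `const`.
  bool DoDecode() const override { return false; }
};

// Calls the hook through a base reference, as a test driver would.
bool CallThroughBase(const HookBase &hook) { return hook.DoDecode(); }
```

If `const` were dropped from `SkipDecode::DoDecode`, the `override` specifier would turn the mismatch into a compile error. With plain `virtual` the code would still compile, and calls through the base reference would keep using the base version.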
+1 -1
@@ -8,7 +8,7 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/decode_test_driver.h"
+1 -1
@@ -11,7 +11,7 @@
#ifndef VPX_TEST_DECODE_TEST_DRIVER_H_
#define VPX_TEST_DECODE_TEST_DRIVER_H_
#include <cstring>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vpx_config.h"
#include "vpx/vpx_decoder.h"
+1597 -33
File diff suppressed because it is too large
+11 -11
@@ -7,15 +7,15 @@
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <cstdio>
#include <string>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "./vpx_config.h"
#include "./vpx_version.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/i420_video_source.h"
#include "test/util.h"
#include "test/y4m_video_source.h"
#include "vpx/vpx_codec.h"
#include "vpx_ports/vpx_timer.h"
namespace {
@@ -61,9 +61,9 @@ class VP9EncodePerfTest
: EncoderTest(GET_PARAM(0)), min_psnr_(kMaxPsnr), nframes_(0),
encoding_mode_(GET_PARAM(1)), speed_(0), threads_(1) {}
virtual ~VP9EncodePerfTest() {}
~VP9EncodePerfTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(encoding_mode_);
@@ -82,8 +82,8 @@ class VP9EncodePerfTest
cfg_.g_threads = threads_;
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
const int log2_tile_columns = 3;
encoder->Control(VP8E_SET_CPUUSED, speed_);
@@ -93,19 +93,19 @@ class VP9EncodePerfTest
}
}
virtual void BeginPassHook(unsigned int /*pass*/) {
void BeginPassHook(unsigned int /*pass*/) override {
min_psnr_ = kMaxPsnr;
nframes_ = 0;
}
virtual void PSNRPktHook(const vpx_codec_cx_pkt_t *pkt) {
void PSNRPktHook(const vpx_codec_cx_pkt_t *pkt) override {
if (pkt->data.psnr.psnr[0] < min_psnr_) {
min_psnr_ = pkt->data.psnr.psnr[0];
}
}
// for performance reasons don't decode
virtual bool DoDecode() const { return false; }
bool DoDecode() const override { return false; }
double min_psnr() const { return min_psnr_; }
@@ -169,7 +169,7 @@ TEST_P(VP9EncodePerfTest, PerfTest) {
printf("{\n");
printf("\t\"type\" : \"encode_perf_test\",\n");
printf("\t\"version\" : \"%s\",\n", VERSION_STRING_NOSP);
printf("\t\"version\" : \"%s\",\n", vpx_codec_version_str());
printf("\t\"videoName\" : \"%s\",\n", display_name.c_str());
printf("\t\"encodeTimeSecs\" : %f,\n", elapsed_secs);
printf("\t\"totalFrames\" : %u,\n", frames);
+1 -1
@@ -11,7 +11,7 @@
#include <memory>
#include <string>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vpx_config.h"
#include "test/codec_factory.h"
+22 -8
@@ -13,13 +13,13 @@
#include <string>
#include <vector>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vpx_config.h"
#if CONFIG_VP8_ENCODER || CONFIG_VP9_ENCODER
#include "vpx/vp8cx.h"
#endif
#include "vpx/vpx_encoder.h"
#include "vpx/vpx_tpl.h"
namespace libvpx_test {
@@ -33,15 +33,24 @@ enum TestMode {
kTwoPassGood,
kTwoPassBest
};
#if CONFIG_REALTIME_ONLY
#define ALL_TEST_MODES ::testing::Values(::libvpx_test::kRealTime)
#define ONE_PASS_TEST_MODES ::testing::Values(::libvpx_test::kRealTime)
#define ONE_OR_TWO_PASS_TEST_MODES ::testing::Values(::libvpx_test::kRealTime)
#else
#define ALL_TEST_MODES \
::testing::Values(::libvpx_test::kRealTime, ::libvpx_test::kOnePassGood, \
::libvpx_test::kOnePassBest, ::libvpx_test::kTwoPassGood, \
::libvpx_test::kTwoPassBest)
#define ONE_PASS_TEST_MODES \
::testing::Values(::libvpx_test::kRealTime, ::libvpx_test::kOnePassGood, \
::libvpx_test::kOnePassBest)
#define ONE_OR_TWO_PASS_TEST_MODES \
::testing::Values(::libvpx_test::kOnePassGood, ::libvpx_test::kTwoPassGood)
#endif
#define TWO_PASS_TEST_MODES \
::testing::Values(::libvpx_test::kTwoPassGood, ::libvpx_test::kTwoPassBest)
@@ -86,7 +95,7 @@ class TwopassStatsStore {
// level of abstraction will be fleshed out as more tests are written.
class Encoder {
public:
Encoder(vpx_codec_enc_cfg_t cfg, unsigned long deadline,
Encoder(vpx_codec_enc_cfg_t cfg, vpx_enc_deadline_t deadline,
const unsigned long init_flags, TwopassStatsStore *stats)
: cfg_(cfg), deadline_(deadline), init_flags_(init_flags), stats_(stats) {
memset(&encoder_, 0, sizeof(encoder_));
@@ -153,6 +162,11 @@ class Encoder {
const vpx_codec_err_t res = vpx_codec_control_(&encoder_, ctrl_id, arg);
ASSERT_EQ(VPX_CODEC_OK, res) << EncoderError();
}
void Control(int ctrl_id, VpxTplGopStats *arg) {
const vpx_codec_err_t res = vpx_codec_control_(&encoder_, ctrl_id, arg);
ASSERT_EQ(VPX_CODEC_OK, res) << EncoderError();
}
#endif // CONFIG_VP9_ENCODER
#if CONFIG_VP8_ENCODER || CONFIG_VP9_ENCODER
@@ -172,7 +186,7 @@ class Encoder {
cfg_ = *cfg;
}
void set_deadline(unsigned long deadline) { deadline_ = deadline; }
void set_deadline(vpx_enc_deadline_t deadline) { deadline_ = deadline; }
protected:
virtual vpx_codec_iface_t *CodecInterface() const = 0;
@@ -191,7 +205,7 @@ class Encoder {
vpx_codec_ctx_t encoder_;
vpx_codec_enc_cfg_t cfg_;
unsigned long deadline_;
vpx_enc_deadline_t deadline_;
unsigned long init_flags_;
TwopassStatsStore *stats_;
};
@@ -259,7 +273,7 @@ class EncoderTest {
const CodecFactory *codec_;
// Hook to determine whether to decode frame after encoding
virtual bool DoDecode() const { return 1; }
virtual bool DoDecode() const { return true; }
// Hook to handle encode/decode mismatch
virtual void MismatchHook(const vpx_image_t *img1, const vpx_image_t *img2);
@@ -286,7 +300,7 @@ class EncoderTest {
vpx_codec_enc_cfg_t cfg_;
vpx_codec_dec_cfg_t dec_cfg_;
unsigned int passes_;
unsigned long deadline_;
vpx_enc_deadline_t deadline_;
TwopassStatsStore stats_;
unsigned long init_flags_;
vpx_enc_frame_flags_t frame_flags_;
+22 -16
@@ -8,11 +8,12 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/i420_video_source.h"
#include "test/util.h"
#include "vpx_config.h"
namespace {
@@ -30,7 +31,7 @@ class ErrorResilienceTestLarge
Reset();
}
virtual ~ErrorResilienceTestLarge() {}
~ErrorResilienceTestLarge() override = default;
void Reset() {
error_nframes_ = 0;
@@ -38,19 +39,19 @@ class ErrorResilienceTestLarge
pattern_switch_ = 0;
}
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(encoding_mode_);
}
virtual void BeginPassHook(unsigned int /*pass*/) {
void BeginPassHook(unsigned int /*pass*/) override {
psnr_ = 0.0;
nframes_ = 0;
mismatch_psnr_ = 0.0;
mismatch_nframes_ = 0;
}
virtual void PSNRPktHook(const vpx_codec_cx_pkt_t *pkt) {
void PSNRPktHook(const vpx_codec_cx_pkt_t *pkt) override {
psnr_ += pkt->data.psnr.psnr[0];
nframes_++;
}
@@ -90,7 +91,7 @@ class ErrorResilienceTestLarge
return frame_flags;
}
virtual void PreEncodeFrameHook(libvpx_test::VideoSource *video) {
void PreEncodeFrameHook(libvpx_test::VideoSource *video) override {
frame_flags_ &=
~(VP8_EFLAG_NO_UPD_LAST | VP8_EFLAG_NO_UPD_GF | VP8_EFLAG_NO_UPD_ARF);
// For temporal layer case.
@@ -129,21 +130,21 @@ class ErrorResilienceTestLarge
return 0.0;
}
virtual bool DoDecode() const {
bool DoDecode() const override {
if (error_nframes_ > 0 &&
(cfg_.g_pass == VPX_RC_LAST_PASS || cfg_.g_pass == VPX_RC_ONE_PASS)) {
for (unsigned int i = 0; i < error_nframes_; ++i) {
if (error_frames_[i] == nframes_ - 1) {
std::cout << " Skipping decoding frame: "
<< error_frames_[i] << "\n";
return 0;
return false;
}
}
}
return 1;
return true;
}
virtual void MismatchHook(const vpx_image_t *img1, const vpx_image_t *img2) {
void MismatchHook(const vpx_image_t *img1, const vpx_image_t *img2) override {
double mismatch_psnr = compute_psnr(img1, img2);
mismatch_psnr_ += mismatch_psnr;
++mismatch_nframes_;
@@ -194,6 +195,10 @@ class ErrorResilienceTestLarge
};
TEST_P(ErrorResilienceTestLarge, OnVersusOff) {
#if CONFIG_REALTIME_ONLY
GTEST_SKIP()
<< "Non-zero g_lag_in_frames is unsupported with CONFIG_REALTIME_ONLY";
#else
const vpx_rational timebase = { 33333333, 1000000000 };
cfg_.g_timebase = timebase;
cfg_.rc_target_bitrate = 2000;
@@ -222,6 +227,7 @@ TEST_P(ErrorResilienceTestLarge, OnVersusOff) {
EXPECT_GE(psnr_ratio, 0.9);
EXPECT_LE(psnr_ratio, 1.1);
}
#endif // CONFIG_REALTIME_ONLY
}
// Check for successful decoding and no encoder/decoder mismatch
@@ -381,7 +387,7 @@ class ErrorResilienceTestLargeCodecControls
Reset();
}
virtual ~ErrorResilienceTestLargeCodecControls() {}
~ErrorResilienceTestLargeCodecControls() override = default;
void Reset() {
last_pts_ = 0;
@@ -393,7 +399,7 @@ class ErrorResilienceTestLargeCodecControls
duration_ = 0.0;
}
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(encoding_mode_);
}
@@ -460,8 +466,8 @@ class ErrorResilienceTestLargeCodecControls
return layer_id;
}
virtual void PreEncodeFrameHook(libvpx_test::VideoSource *video,
libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(libvpx_test::VideoSource *video,
libvpx_test::Encoder *encoder) override {
if (cfg_.ts_number_layers > 1) {
int layer_id = SetLayerId(video->frame(), cfg_.ts_number_layers);
int frame_flags = SetFrameFlags(video->frame(), cfg_.ts_number_layers);
@@ -476,7 +482,7 @@ class ErrorResilienceTestLargeCodecControls
}
}
virtual void FramePktHook(const vpx_codec_cx_pkt_t *pkt) {
void FramePktHook(const vpx_codec_cx_pkt_t *pkt) override {
// Time since last timestamp = duration.
vpx_codec_pts_t duration = pkt->data.frame.pts - last_pts_;
if (duration > 1) {
@@ -496,7 +502,7 @@ class ErrorResilienceTestLargeCodecControls
++tot_frame_number_;
}
virtual void EndPassHook() {
void EndPassHook() override {
duration_ = (last_pts_ + 1) * timebase_;
if (cfg_.ts_number_layers > 1) {
for (int layer = 0; layer < static_cast<int>(cfg_.ts_number_layers);
+8 -9
@@ -210,13 +210,12 @@ class ExternalFrameBufferMD5Test
: DecoderTest(GET_PARAM(::libvpx_test::kCodecFactoryParam)),
md5_file_(nullptr), num_buffers_(0) {}
virtual ~ExternalFrameBufferMD5Test() {
~ExternalFrameBufferMD5Test() override {
if (md5_file_ != nullptr) fclose(md5_file_);
}
virtual void PreDecodeFrameHook(
const libvpx_test::CompressedVideoSource &video,
libvpx_test::Decoder *decoder) {
void PreDecodeFrameHook(const libvpx_test::CompressedVideoSource &video,
libvpx_test::Decoder *decoder) override {
if (num_buffers_ > 0 && video.frame_number() == 0) {
// Have libvpx use frame buffers we create.
ASSERT_TRUE(fb_list_.CreateBufferList(num_buffers_));
@@ -232,8 +231,8 @@ class ExternalFrameBufferMD5Test
<< "Md5 file open failed. Filename: " << md5_file_name_;
}
virtual void DecompressedFrameHook(const vpx_image_t &img,
const unsigned int frame_number) {
void DecompressedFrameHook(const vpx_image_t &img,
const unsigned int frame_number) override {
ASSERT_NE(md5_file_, nullptr);
char expected_md5[33];
char junk[128];
@@ -289,7 +288,7 @@ class ExternalFrameBufferTest : public ::testing::Test {
ExternalFrameBufferTest()
: video_(nullptr), decoder_(nullptr), num_buffers_(0) {}
virtual void SetUp() {
void SetUp() override {
video_ = new libvpx_test::WebMVideoSource(kVP9TestFile);
ASSERT_NE(video_, nullptr);
video_->Init();
@@ -300,7 +299,7 @@ class ExternalFrameBufferTest : public ::testing::Test {
ASSERT_NE(decoder_, nullptr);
}
virtual void TearDown() {
void TearDown() override {
delete decoder_;
decoder_ = nullptr;
delete video_;
@@ -355,7 +354,7 @@ class ExternalFrameBufferTest : public ::testing::Test {
class ExternalFrameBufferNonRefTest : public ExternalFrameBufferTest {
protected:
virtual void SetUp() {
void SetUp() override {
video_ = new libvpx_test::WebMVideoSource(kVP9NonRefTestFile);
ASSERT_NE(video_, nullptr);
video_->Init();
+43 -29
@@ -13,7 +13,7 @@
#include <string.h>
#include <tuple>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vp9_rtcd.h"
#include "./vpx_dsp_rtcd.h"
@@ -23,6 +23,7 @@
#include "test/util.h"
#include "vp9/common/vp9_entropy.h"
#include "vp9/common/vp9_scan.h"
#include "vpx_config.h"
#include "vpx/vpx_codec.h"
#include "vpx/vpx_integer.h"
#include "vpx_ports/mem.h"
@@ -132,9 +133,18 @@ void idct8x8_64_add_12_sse2(const tran_low_t *in, uint8_t *out, int stride) {
#endif // HAVE_SSE2
#endif // CONFIG_VP9_HIGHBITDEPTH
// Visual Studio 2022 (cl.exe) targeting AArch64 with optimizations enabled
// produces invalid code in RunExtremalCheck() and RunInvAccuracyCheck().
// See:
// https://developercommunity.visualstudio.com/t/1770-preview-1:-Misoptimization-for-AR/10369786
// TODO(jzern): check the compiler version after a fix for the issue is
// released.
#if defined(_MSC_VER) && defined(_M_ARM64) && !defined(__clang__)
#pragma optimize("", off)
#endif
class FwdTrans8x8TestBase {
public:
virtual ~FwdTrans8x8TestBase() {}
virtual ~FwdTrans8x8TestBase() = default;
protected:
virtual void RunFwdTxfm(int16_t *in, tran_low_t *out, int stride) = 0;
@@ -170,7 +180,7 @@ class FwdTrans8x8TestBase {
for (int j = 0; j < 64; ++j) {
const int diff = abs(count_sign_block[j][0] - count_sign_block[j][1]);
const int max_diff = kSignBiasMaxDiff255;
EXPECT_LT(diff, max_diff << (bit_depth_ - 8))
ASSERT_LT(diff, max_diff << (bit_depth_ - 8))
<< "Error: 8x8 FDCT/FHT has a sign bias > "
<< 1. * max_diff / count_test_block * 100 << "%"
<< " for input range [-255, 255] at index " << j
@@ -201,7 +211,7 @@ class FwdTrans8x8TestBase {
for (int j = 0; j < 64; ++j) {
const int diff = abs(count_sign_block[j][0] - count_sign_block[j][1]);
const int max_diff = kSignBiasMaxDiff15;
EXPECT_LT(diff, max_diff << (bit_depth_ - 8))
ASSERT_LT(diff, max_diff << (bit_depth_ - 8))
<< "Error: 8x8 FDCT/FHT has a sign bias > "
<< 1. * max_diff / count_test_block * 100 << "%"
<< " for input range [-15, 15] at index " << j
@@ -275,11 +285,11 @@ class FwdTrans8x8TestBase {
}
}
EXPECT_GE(1 << 2 * (bit_depth_ - 8), max_error)
ASSERT_GE(1 << 2 * (bit_depth_ - 8), max_error)
<< "Error: 8x8 FDCT/IDCT or FHT/IHT has an individual"
<< " roundtrip error > 1";
EXPECT_GE((count_test_block << 2 * (bit_depth_ - 8)) / 5, total_error)
ASSERT_GE((count_test_block << 2 * (bit_depth_ - 8)) / 5, total_error)
<< "Error: 8x8 FDCT/IDCT or FHT/IHT has average roundtrip "
<< "error > 1/5 per block";
}
@@ -360,17 +370,17 @@ class FwdTrans8x8TestBase {
total_coeff_error += abs(coeff_diff);
}
EXPECT_GE(1 << 2 * (bit_depth_ - 8), max_error)
ASSERT_GE(1 << 2 * (bit_depth_ - 8), max_error)
<< "Error: Extremal 8x8 FDCT/IDCT or FHT/IHT has"
<< "an individual roundtrip error > 1";
<< " an individual roundtrip error > 1";
EXPECT_GE((count_test_block << 2 * (bit_depth_ - 8)) / 5, total_error)
ASSERT_GE((count_test_block << 2 * (bit_depth_ - 8)) / 5, total_error)
<< "Error: Extremal 8x8 FDCT/IDCT or FHT/IHT has average"
<< " roundtrip error > 1/5 per block";
EXPECT_EQ(0, total_coeff_error)
ASSERT_EQ(0, total_coeff_error)
<< "Error: Extremal 8x8 FDCT/FHT has"
<< "overflow issues in the intermediate steps > 1";
<< " overflow issues in the intermediate steps > 1";
}
}
@@ -426,7 +436,7 @@ class FwdTrans8x8TestBase {
const int diff = dst[j] - src[j];
#endif
const uint32_t error = diff * diff;
EXPECT_GE(1u << 2 * (bit_depth_ - 8), error)
ASSERT_GE(1u << 2 * (bit_depth_ - 8), error)
<< "Error: 8x8 IDCT has error " << error << " at index " << j;
}
}
@@ -456,7 +466,7 @@ class FwdTrans8x8TestBase {
for (int j = 0; j < kNumCoeffs; ++j) {
const int32_t diff = coeff[j] - coeff_r[j];
const uint32_t error = diff * diff;
EXPECT_GE(9u << 2 * (bit_depth_ - 8), error)
ASSERT_GE(9u << 2 * (bit_depth_ - 8), error)
<< "Error: 8x8 DCT has error " << error << " at index " << j;
}
}
@@ -512,7 +522,7 @@ class FwdTrans8x8TestBase {
const int diff = dst[j] - ref[j];
#endif
const uint32_t error = diff * diff;
EXPECT_EQ(0u, error)
ASSERT_EQ(0u, error)
<< "Error: 8x8 IDCT has error " << error << " at index " << j;
}
}
@@ -523,13 +533,16 @@ class FwdTrans8x8TestBase {
vpx_bit_depth_t bit_depth_;
int mask_;
};
#if defined(_MSC_VER) && defined(_M_ARM64) && !defined(__clang__)
#pragma optimize("", on)
#endif
class FwdTrans8x8DCT : public FwdTrans8x8TestBase,
public ::testing::TestWithParam<Dct8x8Param> {
public:
virtual ~FwdTrans8x8DCT() {}
~FwdTrans8x8DCT() override = default;
virtual void SetUp() {
void SetUp() override {
fwd_txfm_ = GET_PARAM(0);
inv_txfm_ = GET_PARAM(1);
tx_type_ = GET_PARAM(2);
@@ -539,13 +552,13 @@ class FwdTrans8x8DCT : public FwdTrans8x8TestBase,
mask_ = (1 << bit_depth_) - 1;
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
void RunFwdTxfm(int16_t *in, tran_low_t *out, int stride) {
void RunFwdTxfm(int16_t *in, tran_low_t *out, int stride) override {
fwd_txfm_(in, out, stride);
}
void RunInvTxfm(tran_low_t *out, uint8_t *dst, int stride) {
void RunInvTxfm(tran_low_t *out, uint8_t *dst, int stride) override {
inv_txfm_(out, dst, stride);
}
@@ -566,9 +579,9 @@ TEST_P(FwdTrans8x8DCT, InvAccuracyCheck) { RunInvAccuracyCheck(); }
class FwdTrans8x8HT : public FwdTrans8x8TestBase,
public ::testing::TestWithParam<Ht8x8Param> {
public:
virtual ~FwdTrans8x8HT() {}
~FwdTrans8x8HT() override = default;
virtual void SetUp() {
void SetUp() override {
fwd_txfm_ = GET_PARAM(0);
inv_txfm_ = GET_PARAM(1);
tx_type_ = GET_PARAM(2);
@@ -578,13 +591,13 @@ class FwdTrans8x8HT : public FwdTrans8x8TestBase,
mask_ = (1 << bit_depth_) - 1;
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
void RunFwdTxfm(int16_t *in, tran_low_t *out, int stride) {
void RunFwdTxfm(int16_t *in, tran_low_t *out, int stride) override {
fwd_txfm_(in, out, stride, tx_type_);
}
void RunInvTxfm(tran_low_t *out, uint8_t *dst, int stride) {
void RunInvTxfm(tran_low_t *out, uint8_t *dst, int stride) override {
inv_txfm_(out, dst, stride, tx_type_);
}
@@ -602,9 +615,9 @@ TEST_P(FwdTrans8x8HT, ExtremalCheck) { RunExtremalCheck(); }
class InvTrans8x8DCT : public FwdTrans8x8TestBase,
public ::testing::TestWithParam<Idct8x8Param> {
public:
virtual ~InvTrans8x8DCT() {}
~InvTrans8x8DCT() override = default;
virtual void SetUp() {
void SetUp() override {
ref_txfm_ = GET_PARAM(0);
inv_txfm_ = GET_PARAM(1);
thresh_ = GET_PARAM(2);
@@ -613,13 +626,14 @@ class InvTrans8x8DCT : public FwdTrans8x8TestBase,
mask_ = (1 << bit_depth_) - 1;
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
void RunInvTxfm(tran_low_t *out, uint8_t *dst, int stride) {
void RunInvTxfm(tran_low_t *out, uint8_t *dst, int stride) override {
inv_txfm_(out, dst, stride);
}
void RunFwdTxfm(int16_t * /*out*/, tran_low_t * /*dst*/, int /*stride*/) {}
void RunFwdTxfm(int16_t * /*out*/, tran_low_t * /*dst*/,
int /*stride*/) override {}
IdctFunc ref_txfm_;
IdctFunc inv_txfm_;
+24 -12
@@ -9,17 +9,17 @@
*/
#include <memory>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/register_state_check.h"
#include "test/video_source.h"
#include "vpx_config.h"
namespace {
class EncoderWithExpectedError : public ::libvpx_test::Encoder {
public:
EncoderWithExpectedError(vpx_codec_enc_cfg_t cfg,
unsigned long deadline, // NOLINT
EncoderWithExpectedError(vpx_codec_enc_cfg_t cfg, vpx_enc_deadline_t deadline,
const unsigned long init_flags, // NOLINT
::libvpx_test::TwopassStatsStore *stats)
: ::libvpx_test::Encoder(cfg, deadline, init_flags, stats) {}
@@ -65,7 +65,7 @@ class EncoderWithExpectedError : public ::libvpx_test::Encoder {
ASSERT_EQ(expected_err, res) << EncoderError();
}
virtual vpx_codec_iface_t *CodecInterface() const {
vpx_codec_iface_t *CodecInterface() const override {
#if CONFIG_VP9_ENCODER
return &vpx_codec_vp9_cx_algo;
#else
@@ -79,22 +79,22 @@ class VP9FrameSizeTestsLarge : public ::libvpx_test::EncoderTest,
protected:
VP9FrameSizeTestsLarge()
: EncoderTest(&::libvpx_test::kVP9), expected_res_(VPX_CODEC_OK) {}
virtual ~VP9FrameSizeTestsLarge() {}
~VP9FrameSizeTestsLarge() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(::libvpx_test::kRealTime);
}
virtual bool HandleDecodeResult(const vpx_codec_err_t res_dec,
const libvpx_test::VideoSource & /*video*/,
libvpx_test::Decoder *decoder) {
bool HandleDecodeResult(const vpx_codec_err_t res_dec,
const libvpx_test::VideoSource & /*video*/,
libvpx_test::Decoder *decoder) override {
EXPECT_EQ(expected_res_, res_dec) << decoder->DecodeError();
return !::testing::Test::HasFailure();
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
encoder->Control(VP8E_SET_CPUUSED, 7);
encoder->Control(VP8E_SET_ENABLEAUTOALTREF, 1);
@@ -168,6 +168,9 @@ class VP9FrameSizeTestsLarge : public ::libvpx_test::EncoderTest,
};
TEST_F(VP9FrameSizeTestsLarge, TestInvalidSizes) {
#ifdef CHROMIUM
GTEST_SKIP() << "16K framebuffers are not supported by Chromium's allocator.";
#else
::libvpx_test::RandomVideoSource video;
#if CONFIG_SIZE_LIMIT
@@ -176,9 +179,16 @@ TEST_F(VP9FrameSizeTestsLarge, TestInvalidSizes) {
expected_res_ = VPX_CODEC_MEM_ERROR;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video, expected_res_));
#endif
#endif
}
TEST_F(VP9FrameSizeTestsLarge, ValidSizes) {
#ifdef CHROMIUM
GTEST_SKIP()
<< "Under Chromium's configuration the allocator is unable to provide"
" the space required for a single frame at the maximum resolution.";
#else
::libvpx_test::RandomVideoSource video;
#if CONFIG_SIZE_LIMIT
@@ -194,7 +204,7 @@ TEST_F(VP9FrameSizeTestsLarge, ValidSizes) {
// size or almost 1 gig of memory.
// In total the allocations will exceed 2GiB which may cause a failure with
// mingw + wine, use a smaller size in that case.
#if defined(_WIN32) && !defined(_WIN64) || defined(__OS2__)
#if defined(_WIN32) && !defined(_WIN64)
video.SetSize(4096, 3072);
#else
video.SetSize(4096, 4096);
@@ -203,6 +213,8 @@ TEST_F(VP9FrameSizeTestsLarge, ValidSizes) {
expected_res_ = VPX_CODEC_OK;
ASSERT_NO_FATAL_FAILURE(::libvpx_test::EncoderTest::RunLoop(&video));
#endif
#endif // defined(CHROMIUM)
}
TEST_F(VP9FrameSizeTestsLarge, OneByOneVideo) {
+57 -4
@@ -10,13 +10,14 @@
#include <algorithm>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vpx_dsp_rtcd.h"
#include "vpx_ports/vpx_timer.h"
#include "test/acm_random.h"
#include "test/register_state_check.h"
#include "vpx_config.h"
namespace {
@@ -130,13 +131,19 @@ std::ostream &operator<<(std::ostream &os, const HadamardFuncWithSize &hfs) {
class HadamardTestBase : public ::testing::TestWithParam<HadamardFuncWithSize> {
public:
virtual void SetUp() {
void SetUp() override {
h_func_ = GetParam().func;
bwh_ = GetParam().block_size;
block_size_ = bwh_ * bwh_;
rnd_.Reset(ACMRandom::DeterministicSeed());
}
// The Rand() function generates values in the range [-((1 << BitDepth) - 1),
// (1 << BitDepth) - 1]. This is because the input to the Hadamard transform
// is the residual pixel, which is defined as 'source pixel - predicted
// pixel'. Source pixel and predicted pixel take values in the range
// [0, (1 << BitDepth) - 1] and thus the residual pixel ranges from
// -((1 << BitDepth) - 1) to ((1 << BitDepth) - 1).
virtual int16_t Rand() = 0;
void ReferenceHadamard(const int16_t *a, int a_stride, tran_low_t *b,
@@ -170,6 +177,31 @@ class HadamardTestBase : public ::testing::TestWithParam<HadamardFuncWithSize> {
EXPECT_EQ(0, memcmp(b, b_ref, sizeof(b)));
}
void ExtremeValuesTest() {
const int kMaxBlockSize = 32 * 32;
DECLARE_ALIGNED(16, int16_t, input_extreme_block[kMaxBlockSize]);
DECLARE_ALIGNED(16, tran_low_t, b[kMaxBlockSize]);
memset(b, 0, sizeof(b));
tran_low_t b_ref[kMaxBlockSize];
memset(b_ref, 0, sizeof(b_ref));
for (int i = 0; i < 2; ++i) {
// Fill the test block with an extreme value: +255 on the first pass,
// -255 on the second.
const int sign = (i == 0) ? 1 : -1;
for (int j = 0; j < kMaxBlockSize; ++j)
input_extreme_block[j] = sign * 255;
ReferenceHadamard(input_extreme_block, bwh_, b_ref, bwh_);
ASM_REGISTER_STATE_CHECK(h_func_(input_extreme_block, bwh_, b));
// The order of the output is not important. Sort before checking.
std::sort(b, b + block_size_);
std::sort(b_ref, b_ref + block_size_);
EXPECT_EQ(0, memcmp(b, b_ref, sizeof(b)));
}
}
void VaryStride() {
const int kMaxBlockSize = 32 * 32;
DECLARE_ALIGNED(16, int16_t, a[kMaxBlockSize * 8]);
@@ -220,11 +252,18 @@ class HadamardTestBase : public ::testing::TestWithParam<HadamardFuncWithSize> {
class HadamardLowbdTest : public HadamardTestBase {
protected:
virtual int16_t Rand() { return rnd_.Rand9Signed(); }
// Use values between -255 (0xFF01) and 255 (0x00FF)
int16_t Rand() override {
int16_t src = rnd_.Rand8();
int16_t pred = rnd_.Rand8();
return src - pred;
}
};
TEST_P(HadamardLowbdTest, CompareReferenceRandom) { CompareReferenceRandom(); }
TEST_P(HadamardLowbdTest, ExtremeValuesTest) { ExtremeValuesTest(); }
TEST_P(HadamardLowbdTest, VaryStride) { VaryStride(); }
TEST_P(HadamardLowbdTest, DISABLED_Speed) {
@@ -296,7 +335,12 @@ INSTANTIATE_TEST_SUITE_P(
#if CONFIG_VP9_HIGHBITDEPTH
class HadamardHighbdTest : public HadamardTestBase {
protected:
virtual int16_t Rand() { return rnd_.Rand13Signed(); }
// Use values between -4095 (0xF001) and 4095 (0x0FFF)
int16_t Rand() override {
int16_t src = rnd_.Rand12();
int16_t pred = rnd_.Rand12();
return src - pred;
}
};
TEST_P(HadamardHighbdTest, CompareReferenceRandom) { CompareReferenceRandom(); }
@@ -324,5 +368,14 @@ INSTANTIATE_TEST_SUITE_P(
32)));
#endif // HAVE_AVX2
#if HAVE_NEON
INSTANTIATE_TEST_SUITE_P(
NEON, HadamardHighbdTest,
::testing::Values(HadamardFuncWithSize(&vpx_highbd_hadamard_8x8_neon, 8),
HadamardFuncWithSize(&vpx_highbd_hadamard_16x16_neon, 16),
HadamardFuncWithSize(&vpx_highbd_hadamard_32x32_neon,
32)));
#endif
#endif // CONFIG_VP9_HIGHBITDEPTH
} // namespace
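The new `Rand()` overrides in the Hadamard test build each input as a source pixel minus a predicted pixel, so the value stays within the residual range described in the comment. A rough standalone sketch of that idea, assuming `std::mt19937` as a stand-in for `ACMRandom` and the hypothetical helpers `MaxResidual` and `RandomResidual`:

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Largest magnitude a residual (source - prediction) can reach at a bit depth.
int MaxResidual(int bit_depth) {
  assert(bit_depth >= 8 && bit_depth <= 12);
  return (1 << bit_depth) - 1;
}

// Hypothetical counterpart of Rand(): draws two pixels in
// [0, (1 << bit_depth) - 1] and returns their difference.
int16_t RandomResidual(std::mt19937 &rng, int bit_depth) {
  std::uniform_int_distribution<int> pixel(0, MaxResidual(bit_depth));
  const int src = pixel(rng);
  const int pred = pixel(rng);
  // The difference lies in [-MaxResidual, MaxResidual], which fits in int16_t
  // for the bit depths accepted above.
  return static_cast<int16_t>(src - pred);
}
```

For 8-bit content this yields values in [-255, 255], and for 12-bit content values in [-4095, 4095], matching the ranges named in the comments on the two `Rand()` overrides.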
+1 -2
@@ -12,12 +12,11 @@
#include <stdlib.h>
#include <string.h>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vpx_dsp_rtcd.h"
#include "test/acm_random.h"
#include "vpx/vpx_integer.h"
#include "vpx_ports/msvc.h" // for round()
using libvpx_test::ACMRandom;
+3 -3
@@ -11,7 +11,7 @@
#include "./vpx_config.h"
#include "./vp8_rtcd.h"
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/buffer.h"
#include "test/clear_system_state.h"
@@ -27,7 +27,7 @@ using libvpx_test::Buffer;
class IDCTTest : public ::testing::TestWithParam<IdctFunc> {
protected:
virtual void SetUp() {
void SetUp() override {
UUT = GetParam();
input = new Buffer<int16_t>(4, 4, 0);
@@ -41,7 +41,7 @@ class IDCTTest : public ::testing::TestWithParam<IdctFunc> {
ASSERT_TRUE(output->Init());
}
virtual void TearDown() {
void TearDown() override {
delete input;
delete predict;
delete output;
+99
@@ -0,0 +1,99 @@
/*
* Copyright (c) 2023 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "test/init_vpx_test.h"
#include "./vpx_config.h"
#if !CONFIG_SHARED
#include <string>
#include "gtest/gtest.h"
#if VPX_ARCH_ARM
#include "vpx_ports/arm.h"
#endif
#if VPX_ARCH_X86 || VPX_ARCH_X86_64
#include "vpx_ports/x86.h"
#endif
extern "C" {
#if CONFIG_VP8
extern void vp8_rtcd();
#endif // CONFIG_VP8
#if CONFIG_VP9
extern void vp9_rtcd();
#endif // CONFIG_VP9
extern void vpx_dsp_rtcd();
extern void vpx_scale_rtcd();
}
#if VPX_ARCH_ARM || VPX_ARCH_X86 || VPX_ARCH_X86_64
static void append_negative_gtest_filter(const char *str) {
std::string filter = GTEST_FLAG_GET(filter);
// Negative patterns begin with one '-' followed by a ':' separated list.
if (filter.find('-') == std::string::npos) filter += '-';
filter += str;
GTEST_FLAG_SET(filter, filter);
}
#endif // VPX_ARCH_ARM || VPX_ARCH_X86 || VPX_ARCH_X86_64
#endif // !CONFIG_SHARED
namespace libvpx_test {
void init_vpx_test() {
#if !CONFIG_SHARED
#if VPX_ARCH_AARCH64
const int caps = arm_cpu_caps();
if (!(caps & HAS_NEON_DOTPROD)) {
append_negative_gtest_filter(":NEON_DOTPROD.*:NEON_DOTPROD/*");
}
if (!(caps & HAS_NEON_I8MM)) {
append_negative_gtest_filter(":NEON_I8MM.*:NEON_I8MM/*");
}
if (!(caps & HAS_SVE)) {
append_negative_gtest_filter(":SVE.*:SVE/*");
}
if (!(caps & HAS_SVE2)) {
append_negative_gtest_filter(":SVE2.*:SVE2/*");
}
#elif VPX_ARCH_ARM
const int caps = arm_cpu_caps();
if (!(caps & HAS_NEON)) append_negative_gtest_filter(":NEON.*:NEON/*");
#endif // VPX_ARCH_ARM
#if VPX_ARCH_X86 || VPX_ARCH_X86_64
const int simd_caps = x86_simd_caps();
if (!(simd_caps & HAS_MMX)) append_negative_gtest_filter(":MMX.*:MMX/*");
if (!(simd_caps & HAS_SSE)) append_negative_gtest_filter(":SSE.*:SSE/*");
if (!(simd_caps & HAS_SSE2)) append_negative_gtest_filter(":SSE2.*:SSE2/*");
if (!(simd_caps & HAS_SSE3)) append_negative_gtest_filter(":SSE3.*:SSE3/*");
if (!(simd_caps & HAS_SSSE3)) {
append_negative_gtest_filter(":SSSE3.*:SSSE3/*");
}
if (!(simd_caps & HAS_SSE4_1)) {
append_negative_gtest_filter(":SSE4_1.*:SSE4_1/*");
}
if (!(simd_caps & HAS_AVX)) append_negative_gtest_filter(":AVX.*:AVX/*");
if (!(simd_caps & HAS_AVX2)) append_negative_gtest_filter(":AVX2.*:AVX2/*");
if (!(simd_caps & HAS_AVX512)) {
append_negative_gtest_filter(":AVX512.*:AVX512/*");
}
#endif // VPX_ARCH_X86 || VPX_ARCH_X86_64
// Shared library builds don't support whitebox tests that exercise internal
// symbols.
#if CONFIG_VP8
vp8_rtcd();
#endif // CONFIG_VP8
#if CONFIG_VP9
vp9_rtcd();
#endif // CONFIG_VP9
vpx_dsp_rtcd();
vpx_scale_rtcd();
#endif // !CONFIG_SHARED
}
} // namespace libvpx_test
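The string handling in `append_negative_gtest_filter()` above can be tried without GoogleTest. This is only a sketch: `AppendNegativeFilter` is a hypothetical helper that takes and returns a plain `std::string` instead of reading and writing the gtest filter flag.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for append_negative_gtest_filter(): same string logic,
// but operating on a value instead of GoogleTest's filter flag.
std::string AppendNegativeFilter(std::string filter,
                                 const std::string &patterns) {
  // Patterns are expected in the ":Suite.*:Suite/*" form used above.
  assert(!patterns.empty() && patterns[0] == ':');
  // The negative section of a filter starts at its first '-'. Add one only if
  // the filter does not have a negative section yet.
  if (filter.find('-') == std::string::npos) filter += '-';
  filter += patterns;
  return filter;
}
```

Starting from the filter `*`, appending `:AVX2.*:AVX2/*` gives `*-:AVX2.*:AVX2/*`. A second call reuses the existing `-` and only extends the colon-separated list.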
+18
@@ -0,0 +1,18 @@
/*
* Copyright (c) 2023 The WebM project authors. All Rights Reserved.
*
* Use of this source code is governed by a BSD-style license
* that can be found in the LICENSE file in the root of the source
* tree. An additional intellectual property rights grant can be found
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#ifndef TEST_INIT_VPX_TEST_H_
#define TEST_INIT_VPX_TEST_H_
namespace libvpx_test {
void init_vpx_test();
}
#endif // TEST_INIT_VPX_TEST_H_
+8 -9
@@ -13,7 +13,7 @@
#include <memory>
#include <string>
#include <vector>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vpx_config.h"
#include "test/codec_factory.h"
#include "test/decode_test_driver.h"
@@ -40,7 +40,7 @@ class InvalidFileTest : public ::libvpx_test::DecoderTest,
protected:
InvalidFileTest() : DecoderTest(GET_PARAM(0)), res_file_(nullptr) {}
virtual ~InvalidFileTest() {
~InvalidFileTest() override {
if (res_file_ != nullptr) fclose(res_file_);
}
@@ -50,10 +50,9 @@ class InvalidFileTest : public ::libvpx_test::DecoderTest,
<< "Result file open failed. Filename: " << res_file_name_;
}
virtual bool HandleDecodeResult(
const vpx_codec_err_t res_dec,
const libvpx_test::CompressedVideoSource &video,
libvpx_test::Decoder *decoder) {
bool HandleDecodeResult(const vpx_codec_err_t res_dec,
const libvpx_test::CompressedVideoSource &video,
libvpx_test::Decoder *decoder) override {
EXPECT_NE(res_file_, nullptr);
int expected_res_dec;
@@ -172,9 +171,9 @@ VP9_INSTANTIATE_TEST_SUITE(InvalidFileTest,
class InvalidFileInvalidPeekTest : public InvalidFileTest {
protected:
InvalidFileInvalidPeekTest() : InvalidFileTest() {}
virtual void HandlePeekResult(libvpx_test::Decoder *const /*decoder*/,
libvpx_test::CompressedVideoSource * /*video*/,
const vpx_codec_err_t /*res_peek*/) {}
void HandlePeekResult(libvpx_test::Decoder *const /*decoder*/,
libvpx_test::CompressedVideoSource * /*video*/,
const vpx_codec_err_t /*res_peek*/) override {}
};
TEST_P(InvalidFileInvalidPeekTest, ReturnCode) { RunTest(); }
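Most hunks in this change replace `virtual` on derived-class members with `override`. The following is a minimal sketch of what that buys, using invented types (`Source`, `IvfSource`) rather than the libvpx test classes: with `override`, the compiler checks that the member really overrides a base-class virtual, so a signature mismatch becomes a compile error instead of a silently unrelated function.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class standing in for a test fixture or video source.
struct Source {
  virtual ~Source() = default;
  virtual std::string Name() const { return "base"; }
};

struct IvfSource : Source {
  // An empty destructor body can be written as `= default`.
  ~IvfSource() override = default;
  // `override` makes the compiler verify that this matches a base virtual.
  // Dropping `const` here would fail to compile rather than declare a new,
  // unrelated member function.
  std::string Name() const override { return "ivf"; }
};

// Calls through the base interface to show that dynamic dispatch is intact.
std::string NameThroughBase(const Source &s) { return s.Name(); }
```

Behavior at run time is unchanged by the keyword; the benefit is purely in compile-time checking and in readability.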
+7 -7
@@ -33,19 +33,19 @@ class IVFVideoSource : public CompressedVideoSource {
compressed_frame_buf_(nullptr), frame_sz_(0), frame_(0),
end_of_file_(false) {}
virtual ~IVFVideoSource() {
~IVFVideoSource() override {
delete[] compressed_frame_buf_;
if (input_file_) fclose(input_file_);
}
virtual void Init() {
void Init() override {
// Allocate a buffer for read in the compressed video frame.
compressed_frame_buf_ = new uint8_t[libvpx_test::kCodeBufferSize];
ASSERT_NE(compressed_frame_buf_, nullptr) << "Allocate frame buffer failed";
}
virtual void Begin() {
void Begin() override {
input_file_ = OpenTestDataFile(file_name_);
ASSERT_NE(input_file_, nullptr)
<< "Input file open failed. Filename: " << file_name_;
@@ -62,7 +62,7 @@ class IVFVideoSource : public CompressedVideoSource {
FillFrame();
}
virtual void Next() {
void Next() override {
++frame_;
FillFrame();
}
@@ -86,11 +86,11 @@ class IVFVideoSource : public CompressedVideoSource {
}
}
virtual const uint8_t *cxdata() const {
const uint8_t *cxdata() const override {
return end_of_file_ ? nullptr : compressed_frame_buf_;
}
virtual size_t frame_size() const { return frame_sz_; }
virtual unsigned int frame_number() const { return frame_; }
size_t frame_size() const override { return frame_sz_; }
unsigned int frame_number() const override { return frame_; }
protected:
std::string file_name_;
+113 -6
@@ -8,12 +8,18 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#include <climits>
#include <cstring>
#include <vector>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/i420_video_source.h"
#include "test/util.h"
#include "./vpx_config.h"
#include "vpx/vp8cx.h"
#include "vpx/vpx_codec.h"
#include "vpx/vpx_encoder.h"
#include "vpx/vpx_image.h"
namespace {
@@ -22,9 +28,9 @@ class KeyframeTest
public ::libvpx_test::CodecTestWithParam<libvpx_test::TestMode> {
protected:
KeyframeTest() : EncoderTest(GET_PARAM(0)) {}
virtual ~KeyframeTest() {}
~KeyframeTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(GET_PARAM(1));
kf_count_ = 0;
@@ -33,8 +39,8 @@ class KeyframeTest
set_cpu_used_ = 0;
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
if (kf_do_force_kf_) {
frame_flags_ = (video->frame() % 3) ? 0 : VPX_EFLAG_FORCE_KF;
}
@@ -43,7 +49,7 @@ class KeyframeTest
}
}
virtual void FramePktHook(const vpx_codec_cx_pkt_t *pkt) {
void FramePktHook(const vpx_codec_cx_pkt_t *pkt) override {
if (pkt->data.frame.flags & VPX_FRAME_IS_KEY) {
kf_pts_list_.push_back(pkt->data.frame.pts);
kf_count_++;
@@ -146,4 +152,105 @@ TEST_P(KeyframeTest, TestAutoKeyframe) {
}
VP8_INSTANTIATE_TEST_SUITE(KeyframeTest, ALL_TEST_MODES);
bool IsVP9(vpx_codec_iface_t *iface) {
static const char kVP9Name[] = "WebM Project VP9";
return strncmp(kVP9Name, vpx_codec_iface_name(iface), sizeof(kVP9Name) - 1) ==
0;
}
vpx_image_t *CreateGrayImage(vpx_img_fmt_t fmt, unsigned int w,
unsigned int h) {
vpx_image_t *const image = vpx_img_alloc(nullptr, fmt, w, h, 1);
if (!image) return image;
for (unsigned int i = 0; i < image->d_h; ++i) {
memset(image->planes[0] + i * image->stride[0], 128, image->d_w);
}
const unsigned int uv_h = (image->d_h + 1) / 2;
const unsigned int uv_w = (image->d_w + 1) / 2;
for (unsigned int i = 0; i < uv_h; ++i) {
memset(image->planes[1] + i * image->stride[1], 128, uv_w);
memset(image->planes[2] + i * image->stride[2], 128, uv_w);
}
return image;
}
// Tests kf_max_dist in one-pass encoding with zero lag.
void TestKeyframeMaximumInterval(vpx_codec_iface_t *iface,
vpx_enc_deadline_t deadline,
unsigned int kf_max_dist) {
vpx_codec_enc_cfg_t cfg;
ASSERT_EQ(vpx_codec_enc_config_default(iface, &cfg, /*usage=*/0),
VPX_CODEC_OK);
cfg.g_w = 320;
cfg.g_h = 240;
cfg.g_pass = VPX_RC_ONE_PASS;
cfg.g_lag_in_frames = 0;
cfg.kf_mode = VPX_KF_AUTO;
cfg.kf_min_dist = 0;
cfg.kf_max_dist = kf_max_dist;
vpx_codec_ctx_t enc;
ASSERT_EQ(vpx_codec_enc_init(&enc, iface, &cfg, 0), VPX_CODEC_OK);
const int speed = IsVP9(iface) ? 9 : -12;
ASSERT_EQ(vpx_codec_control(&enc, VP8E_SET_CPUUSED, speed), VPX_CODEC_OK);
vpx_image_t *image = CreateGrayImage(VPX_IMG_FMT_I420, cfg.g_w, cfg.g_h);
ASSERT_NE(image, nullptr);
// Encode frames.
const vpx_codec_cx_pkt_t *pkt;
const unsigned int num_frames = kf_max_dist == 0 ? 4 : 3 * kf_max_dist + 1;
for (unsigned int i = 0; i < num_frames; ++i) {
ASSERT_EQ(vpx_codec_encode(&enc, image, i, 1, 0, deadline), VPX_CODEC_OK);
vpx_codec_iter_t iter = nullptr;
while ((pkt = vpx_codec_get_cx_data(&enc, &iter)) != nullptr) {
ASSERT_EQ(pkt->kind, VPX_CODEC_CX_FRAME_PKT);
if (kf_max_dist == 0 || i % kf_max_dist == 0) {
ASSERT_EQ(pkt->data.frame.flags & VPX_FRAME_IS_KEY, VPX_FRAME_IS_KEY);
} else {
ASSERT_EQ(pkt->data.frame.flags & VPX_FRAME_IS_KEY, 0u);
}
}
}
// Flush the encoder.
bool got_data;
do {
ASSERT_EQ(vpx_codec_encode(&enc, nullptr, 0, 1, 0, deadline), VPX_CODEC_OK);
got_data = false;
vpx_codec_iter_t iter = nullptr;
while ((pkt = vpx_codec_get_cx_data(&enc, &iter)) != nullptr) {
ASSERT_EQ(pkt->kind, VPX_CODEC_CX_FRAME_PKT);
got_data = true;
}
} while (got_data);
vpx_img_free(image);
ASSERT_EQ(vpx_codec_destroy(&enc), VPX_CODEC_OK);
}
TEST(KeyframeIntervalTest, KeyframeMaximumInterval) {
std::vector<vpx_codec_iface_t *> ifaces;
#if CONFIG_VP8_ENCODER
ifaces.push_back(vpx_codec_vp8_cx());
#endif
#if CONFIG_VP9_ENCODER
ifaces.push_back(vpx_codec_vp9_cx());
#endif
for (vpx_codec_iface_t *iface : ifaces) {
for (vpx_enc_deadline_t deadline :
{ VPX_DL_REALTIME, VPX_DL_GOOD_QUALITY, VPX_DL_BEST_QUALITY }) {
// Test 0 and 1 (both mean all intra), some powers of 2, some multiples
// of 10, and some prime numbers.
for (unsigned int kf_max_dist :
{ 0, 1, 2, 3, 4, 7, 10, 13, 16, 20, 23, 29, 32 }) {
TestKeyframeMaximumInterval(iface, deadline, kf_max_dist);
}
}
}
}
} // namespace
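The expectation encoded in `TestKeyframeMaximumInterval` can be worked through separately from the encoder. The sketch below reproduces only the test's arithmetic (which frame indices must be keyframes and how many frames are encoded). The helper names are invented for illustration and no libvpx API is called.

```cpp
#include <cassert>
#include <vector>

// Mirrors the test's expectation: with kf_max_dist of 0 or 1 every frame is a
// keyframe; otherwise frames whose index is a multiple of kf_max_dist are.
bool ExpectKeyframe(unsigned int frame_index, unsigned int kf_max_dist) {
  return kf_max_dist == 0 || frame_index % kf_max_dist == 0;
}

// Number of frames the test encodes for a given interval.
unsigned int NumFramesToEncode(unsigned int kf_max_dist) {
  return kf_max_dist == 0 ? 4 : 3 * kf_max_dist + 1;
}

// Lists the frame indices at which the test asserts VPX_FRAME_IS_KEY.
std::vector<unsigned int> ExpectedKeyframeIndices(unsigned int kf_max_dist) {
  std::vector<unsigned int> keys;
  const unsigned int num_frames = NumFramesToEncode(kf_max_dist);
  for (unsigned int i = 0; i < num_frames; ++i) {
    if (ExpectKeyframe(i, kf_max_dist)) keys.push_back(i);
  }
  return keys;
}
```

For `kf_max_dist = 4` the test encodes 13 frames and expects keyframes at indices 0, 4, 8, and 12, so three full intervals plus the closing keyframe are covered.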
+20 -9
@@ -7,11 +7,12 @@
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/i420_video_source.h"
#include "test/util.h"
#include "vpx_config.h"
namespace {
class LevelTest
@@ -22,9 +23,9 @@ class LevelTest
: EncoderTest(GET_PARAM(0)), encoding_mode_(GET_PARAM(1)),
cpu_used_(GET_PARAM(2)), min_gf_internal_(24), target_level_(0),
level_(0) {}
virtual ~LevelTest() {}
~LevelTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(encoding_mode_);
if (encoding_mode_ != ::libvpx_test::kRealTime) {
@@ -41,8 +42,8 @@ class LevelTest
cfg_.rc_min_quantizer = 0;
}
virtual void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(::libvpx_test::VideoSource *video,
::libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
encoder->Control(VP8E_SET_CPUUSED, cpu_used_);
encoder->Control(VP9E_SET_TARGET_LEVEL, target_level_);
@@ -67,6 +68,9 @@ class LevelTest
};
TEST_P(LevelTest, TestTargetLevel11Large) {
#if CONFIG_REALTIME_ONLY
GTEST_SKIP();
#else
ASSERT_NE(encoding_mode_, ::libvpx_test::kRealTime);
::libvpx_test::I420VideoSource video("hantro_odd.yuv", 208, 144, 30, 1, 0,
60);
@@ -74,9 +78,13 @@ TEST_P(LevelTest, TestTargetLevel11Large) {
cfg_.rc_target_bitrate = 150;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(target_level_, level_);
#endif
}
TEST_P(LevelTest, TestTargetLevel20Large) {
#if CONFIG_REALTIME_ONLY
GTEST_SKIP();
#else
ASSERT_NE(encoding_mode_, ::libvpx_test::kRealTime);
::libvpx_test::I420VideoSource video("hantro_collage_w352h288.yuv", 352, 288,
30, 1, 0, 60);
@@ -84,9 +92,13 @@ TEST_P(LevelTest, TestTargetLevel20Large) {
cfg_.rc_target_bitrate = 1200;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(target_level_, level_);
#endif
}
TEST_P(LevelTest, TestTargetLevel31Large) {
#if CONFIG_REALTIME_ONLY
GTEST_SKIP();
#else
ASSERT_NE(encoding_mode_, ::libvpx_test::kRealTime);
::libvpx_test::I420VideoSource video("niklas_1280_720_30.y4m", 1280, 720, 30,
1, 0, 60);
@@ -94,6 +106,7 @@ TEST_P(LevelTest, TestTargetLevel31Large) {
cfg_.rc_target_bitrate = 8000;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
ASSERT_GE(target_level_, level_);
#endif
}
// Test for keeping level stats only
@@ -120,7 +133,7 @@ TEST_P(LevelTest, TestTargetLevel255) {
TEST_P(LevelTest, TestTargetLevelApi) {
::libvpx_test::I420VideoSource video("hantro_odd.yuv", 208, 144, 30, 1, 0, 1);
static const vpx_codec_iface_t *codec = &vpx_codec_vp9_cx_algo;
static vpx_codec_iface_t *codec = &vpx_codec_vp9_cx_algo;
vpx_codec_ctx_t enc;
vpx_codec_enc_cfg_t cfg;
EXPECT_EQ(VPX_CODEC_OK, vpx_codec_enc_config_default(codec, &cfg, 0));
@@ -140,8 +153,6 @@ TEST_P(LevelTest, TestTargetLevelApi) {
EXPECT_EQ(VPX_CODEC_OK, vpx_codec_destroy(&enc));
}
VP9_INSTANTIATE_TEST_SUITE(LevelTest,
::testing::Values(::libvpx_test::kTwoPassGood,
::libvpx_test::kOnePassGood),
VP9_INSTANTIATE_TEST_SUITE(LevelTest, ONE_OR_TWO_PASS_TEST_MODES,
::testing::Range(0, 9));
} // namespace
+7 -7
@@ -13,7 +13,7 @@
#include <string>
#include <tuple>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vpx_config.h"
#include "./vpx_dsp_rtcd.h"
@@ -129,15 +129,15 @@ uint8_t GetHevThresh(ACMRandom *rnd) {
class Loop8Test6Param : public ::testing::TestWithParam<loop8_param_t> {
public:
virtual ~Loop8Test6Param() {}
virtual void SetUp() {
~Loop8Test6Param() override = default;
void SetUp() override {
loopfilter_op_ = GET_PARAM(0);
ref_loopfilter_op_ = GET_PARAM(1);
bit_depth_ = GET_PARAM(2);
mask_ = (1 << bit_depth_) - 1;
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
int bit_depth_;
@@ -151,15 +151,15 @@ GTEST_ALLOW_UNINSTANTIATED_PARAMETERIZED_TEST(Loop8Test6Param);
(HAVE_DSPR2 || HAVE_MSA && !CONFIG_VP9_HIGHBITDEPTH)
class Loop8Test9Param : public ::testing::TestWithParam<dualloop8_param_t> {
public:
virtual ~Loop8Test9Param() {}
virtual void SetUp() {
~Loop8Test9Param() override = default;
void SetUp() override {
loopfilter_op_ = GET_PARAM(0);
ref_loopfilter_op_ = GET_PARAM(1);
bit_depth_ = GET_PARAM(2);
mask_ = (1 << bit_depth_) - 1;
}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
int bit_depth_;
+116 -2
@@ -11,10 +11,12 @@
#include <stdlib.h>
#include <string.h>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "vpx_config.h"
#include "./vpx_dsp_rtcd.h"
#include "vpx/vpx_integer.h"
#include "vpx_mem/vpx_mem.h"
#include "test/acm_random.h"
#include "test/register_state_check.h"
@@ -28,7 +30,7 @@ typedef void (*MinMaxFunc)(const uint8_t *a, int a_stride, const uint8_t *b,
class MinMaxTest : public ::testing::TestWithParam<MinMaxFunc> {
public:
virtual void SetUp() {
void SetUp() override {
mm_func_ = GetParam();
rnd_.Reset(ACMRandom::DeterministicSeed());
}
@@ -115,7 +117,115 @@ TEST_P(MinMaxTest, CompareReferenceAndVaryStride) {
}
}
#if CONFIG_VP9_HIGHBITDEPTH
using HBDMinMaxTest = MinMaxTest;
void highbd_reference_minmax(const uint8_t *a, int a_stride, const uint8_t *b,
int b_stride, int *min_ret, int *max_ret) {
int min = 65535;
int max = 0;
const uint16_t *a_ptr = CONVERT_TO_SHORTPTR(a);
const uint16_t *b_ptr = CONVERT_TO_SHORTPTR(b);
for (int i = 0; i < 8; i++) {
for (int j = 0; j < 8; j++) {
const int diff = abs(a_ptr[i * a_stride + j] - b_ptr[i * b_stride + j]);
if (min > diff) min = diff;
if (max < diff) max = diff;
}
}
*min_ret = min;
*max_ret = max;
}
TEST_P(HBDMinMaxTest, MinValue) {
uint8_t *a = CONVERT_TO_BYTEPTR(
reinterpret_cast<uint16_t *>(vpx_malloc(64 * sizeof(uint16_t))));
uint8_t *b = CONVERT_TO_BYTEPTR(
reinterpret_cast<uint16_t *>(vpx_malloc(64 * sizeof(uint16_t))));
for (int i = 0; i < 64; i++) {
vpx_memset16(CONVERT_TO_SHORTPTR(a), 0, 64);
vpx_memset16(CONVERT_TO_SHORTPTR(b), 65535, 64);
CONVERT_TO_SHORTPTR(b)[i] = i; // Set a minimum difference of i.
int min, max;
ASM_REGISTER_STATE_CHECK(mm_func_(a, 8, b, 8, &min, &max));
EXPECT_EQ(65535, max);
EXPECT_EQ(i, min);
}
vpx_free(CONVERT_TO_SHORTPTR(a));
vpx_free(CONVERT_TO_SHORTPTR(b));
}
TEST_P(HBDMinMaxTest, MaxValue) {
uint8_t *a = CONVERT_TO_BYTEPTR(
reinterpret_cast<uint16_t *>(vpx_malloc(64 * sizeof(uint16_t))));
uint8_t *b = CONVERT_TO_BYTEPTR(
reinterpret_cast<uint16_t *>(vpx_malloc(64 * sizeof(uint16_t))));
for (int i = 0; i < 64; i++) {
vpx_memset16(CONVERT_TO_SHORTPTR(a), 0, 64);
vpx_memset16(CONVERT_TO_SHORTPTR(b), 0, 64);
CONVERT_TO_SHORTPTR(b)[i] = i; // Set a maximum difference of i.
int min, max;
ASM_REGISTER_STATE_CHECK(mm_func_(a, 8, b, 8, &min, &max));
EXPECT_EQ(i, max);
EXPECT_EQ(0, min);
}
vpx_free(CONVERT_TO_SHORTPTR(a));
vpx_free(CONVERT_TO_SHORTPTR(b));
}
TEST_P(HBDMinMaxTest, CompareReference) {
uint8_t *a = CONVERT_TO_BYTEPTR(
reinterpret_cast<uint16_t *>(vpx_malloc(64 * sizeof(uint16_t))));
uint8_t *b = CONVERT_TO_BYTEPTR(
reinterpret_cast<uint16_t *>(vpx_malloc(64 * sizeof(uint16_t))));
for (int j = 0; j < 64; j++) {
CONVERT_TO_SHORTPTR(a)[j] = rnd_.Rand16();
CONVERT_TO_SHORTPTR(b)[j] = rnd_.Rand16();
}
int min_ref, max_ref, min, max;
highbd_reference_minmax(a, 8, b, 8, &min_ref, &max_ref);
ASM_REGISTER_STATE_CHECK(mm_func_(a, 8, b, 8, &min, &max));
vpx_free(CONVERT_TO_SHORTPTR(a));
vpx_free(CONVERT_TO_SHORTPTR(b));
EXPECT_EQ(max_ref, max);
EXPECT_EQ(min_ref, min);
}
TEST_P(HBDMinMaxTest, CompareReferenceAndVaryStride) {
uint8_t *a = CONVERT_TO_BYTEPTR(
reinterpret_cast<uint16_t *>(vpx_malloc((8 * 64) * sizeof(uint16_t))));
uint8_t *b = CONVERT_TO_BYTEPTR(
reinterpret_cast<uint16_t *>(vpx_malloc((8 * 64) * sizeof(uint16_t))));
for (int i = 0; i < 8 * 64; i++) {
CONVERT_TO_SHORTPTR(a)[i] = rnd_.Rand16();
CONVERT_TO_SHORTPTR(b)[i] = rnd_.Rand16();
}
for (int a_stride = 8; a_stride <= 64; a_stride += 8) {
for (int b_stride = 8; b_stride <= 64; b_stride += 8) {
int min_ref, max_ref, min, max;
highbd_reference_minmax(a, a_stride, b, b_stride, &min_ref, &max_ref);
ASM_REGISTER_STATE_CHECK(mm_func_(a, a_stride, b, b_stride, &min, &max));
EXPECT_EQ(max_ref, max)
<< "when a_stride = " << a_stride << " and b_stride = " << b_stride;
EXPECT_EQ(min_ref, min)
<< "when a_stride = " << a_stride << " and b_stride = " << b_stride;
}
}
vpx_free(CONVERT_TO_SHORTPTR(a));
vpx_free(CONVERT_TO_SHORTPTR(b));
}
#endif
INSTANTIATE_TEST_SUITE_P(C, MinMaxTest, ::testing::Values(&vpx_minmax_8x8_c));
#if CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_SUITE_P(C, HBDMinMaxTest,
::testing::Values(&vpx_highbd_minmax_8x8_c));
#endif
#if HAVE_SSE2
INSTANTIATE_TEST_SUITE_P(SSE2, MinMaxTest,
@@ -125,6 +235,10 @@ INSTANTIATE_TEST_SUITE_P(SSE2, MinMaxTest,
#if HAVE_NEON
INSTANTIATE_TEST_SUITE_P(NEON, MinMaxTest,
::testing::Values(&vpx_minmax_8x8_neon));
#if CONFIG_VP9_HIGHBITDEPTH
INSTANTIATE_TEST_SUITE_P(NEON, HBDMinMaxTest,
::testing::Values(&vpx_highbd_minmax_8x8_neon));
#endif
#endif
#if HAVE_MSA
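The high-bitdepth reference used by `HBDMinMaxTest` computes the smallest and largest absolute difference over an 8x8 block of 16-bit samples. A standalone sketch of the same computation follows. It takes `uint16_t` pointers directly instead of going through `CONVERT_TO_SHORTPTR`, and the name `MinMax8x8U16` is an assumption made for this example, not a libvpx function.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Computes the minimum and maximum absolute difference between two 8x8
// blocks of 16-bit samples. Strides are in samples, not bytes.
void MinMax8x8U16(const uint16_t *a, int a_stride, const uint16_t *b,
                  int b_stride, int *min_ret, int *max_ret) {
  int min = 65535;  // Largest possible difference between 16-bit samples.
  int max = 0;
  for (int i = 0; i < 8; ++i) {
    for (int j = 0; j < 8; ++j) {
      const int diff = std::abs(static_cast<int>(a[i * a_stride + j]) -
                                static_cast<int>(b[i * b_stride + j]));
      if (diff < min) min = diff;
      if (diff > max) max = diff;
    }
  }
  *min_ret = min;
  *max_ret = max;
}
```

Filling `a` with 0 and `b` with 65535, then setting a single `b` sample to a small value, reproduces the `MinValue` case: the maximum stays at 65535 and the minimum equals the value that was written.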
+1 -1
@@ -9,7 +9,7 @@
*/
#include <math.h>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "vp9/encoder/vp9_non_greedy_mv.h"
#include "./vpx_dsp_rtcd.h"
+6 -5
@@ -14,7 +14,7 @@
#include <limits>
#include <tuple>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vp9_rtcd.h"
#include "./vpx_dsp_rtcd.h"
@@ -25,6 +25,7 @@
#include "vp9/common/vp9_blockd.h"
#include "vp9/common/vp9_scan.h"
#include "vpx/vpx_integer.h"
#include "vpx_config.h"
#include "vpx_ports/vpx_timer.h"
using libvpx_test::ACMRandom;
@@ -59,8 +60,8 @@ const int kCountTestBlock = 1000;
class PartialIDctTest : public ::testing::TestWithParam<PartialInvTxfmParam> {
public:
virtual ~PartialIDctTest() {}
virtual void SetUp() {
~PartialIDctTest() override = default;
void SetUp() override {
rnd_.Reset(ACMRandom::DeterministicSeed());
fwd_txfm_ = GET_PARAM(0);
full_inv_txfm_ = GET_PARAM(1);
@@ -76,7 +77,7 @@ class PartialIDctTest : public ::testing::TestWithParam<PartialInvTxfmParam> {
case TX_8X8: size_ = 8; break;
case TX_16X16: size_ = 16; break;
case TX_32X32: size_ = 32; break;
default: FAIL() << "Wrong Size!"; break;
default: FAIL() << "Wrong Size!";
}
// Randomize stride_ to a value less than or equal to 1024
@@ -100,7 +101,7 @@ class PartialIDctTest : public ::testing::TestWithParam<PartialInvTxfmParam> {
vpx_memalign(16, pixel_size_ * output_block_size_));
}
virtual void TearDown() {
void TearDown() override {
vpx_free(input_block_);
input_block_ = nullptr;
vpx_free(output_block_);
+7 -7
@@ -14,12 +14,12 @@
#include "./vpx_config.h"
#include "./vpx_dsp_rtcd.h"
#include "gtest/gtest.h"
#include "test/acm_random.h"
#include "test/bench.h"
#include "test/buffer.h"
#include "test/clear_system_state.h"
#include "test/register_state_check.h"
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "vpx/vpx_integer.h"
#include "vpx_mem/vpx_mem.h"
@@ -51,10 +51,10 @@ class VpxPostProcDownAndAcrossMbRowTest
public:
VpxPostProcDownAndAcrossMbRowTest()
: mb_post_proc_down_and_across_(GetParam()) {}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
virtual void Run();
void Run() override;
const VpxPostProcDownAndAcrossMbRowFunc mb_post_proc_down_and_across_;
// Size of the underlying data block that will be filtered.
@@ -227,10 +227,10 @@ class VpxMbPostProcAcrossIpTest
VpxMbPostProcAcrossIpTest()
: rows_(16), cols_(16), mb_post_proc_across_ip_(GetParam()),
src_(Buffer<uint8_t>(rows_, cols_, 8, 8, 17, 8)) {}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
virtual void Run();
void Run() override;
void SetCols(unsigned char *s, int rows, int cols, int src_width) {
for (int r = 0; r < rows; r++) {
@@ -356,10 +356,10 @@ class VpxMbPostProcDownTest
: rows_(16), cols_(16), mb_post_proc_down_(GetParam()),
src_c_(Buffer<uint8_t>(rows_, cols_, 8, 8, 8, 17)) {}
virtual void TearDown() { libvpx_test::ClearSystemState(); }
void TearDown() override { libvpx_test::ClearSystemState(); }
protected:
virtual void Run();
void Run() override;
void SetRows(unsigned char *src_c, int rows, int cols, int src_width) {
for (int r = 0; r < rows; r++) {
+13 -5
@@ -8,11 +8,12 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <tuple>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vp8_rtcd.h"
#include "./vpx_config.h"
@@ -23,7 +24,6 @@
#include "test/util.h"
#include "vpx/vpx_integer.h"
#include "vpx_mem/vpx_mem.h"
#include "vpx_ports/msvc.h"
namespace {
@@ -43,7 +43,7 @@ class PredictTestBase : public AbstractBench,
: width_(GET_PARAM(0)), height_(GET_PARAM(1)), predict_(GET_PARAM(2)),
src_(nullptr), padded_dst_(nullptr), dst_(nullptr), dst_c_(nullptr) {}
virtual void SetUp() {
void SetUp() override {
src_ = new uint8_t[kSrcSize];
ASSERT_NE(src_, nullptr);
@@ -64,7 +64,7 @@ class PredictTestBase : public AbstractBench,
memset(dst_c_, 0, 16 * 16);
}
virtual void TearDown() {
void TearDown() override {
delete[] src_;
src_ = nullptr;
vpx_free(padded_dst_);
@@ -209,7 +209,7 @@ class PredictTestBase : public AbstractBench,
}
}
void Run() {
void Run() override {
for (int xoffset = 0; xoffset < 8; ++xoffset) {
for (int yoffset = 0; yoffset < 8; ++yoffset) {
if (xoffset == 0 && yoffset == 0) {
@@ -350,6 +350,14 @@ INSTANTIATE_TEST_SUITE_P(
make_tuple(4, 4, &vp8_sixtap_predict4x4_mmi)));
#endif
#if HAVE_LSX
INSTANTIATE_TEST_SUITE_P(
LSX, SixtapPredictTest,
::testing::Values(make_tuple(16, 16, &vp8_sixtap_predict16x16_lsx),
make_tuple(8, 8, &vp8_sixtap_predict8x8_lsx),
make_tuple(4, 4, &vp8_sixtap_predict4x4_lsx)));
#endif
class BilinearPredictTest : public PredictTestBase {};
TEST_P(BilinearPredictTest, TestWithRandomData) {
+3 -3
@@ -11,7 +11,7 @@
#include <string.h>
#include <tuple>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vp8_rtcd.h"
#include "./vpx_config.h"
@@ -121,13 +121,13 @@ class QuantizeTest : public QuantizeTestBase,
public ::testing::TestWithParam<VP8QuantizeParam>,
public AbstractBench {
protected:
virtual void SetUp() {
void SetUp() override {
SetupCompressor();
asm_quant_ = GET_PARAM(0);
c_quant_ = GET_PARAM(1);
}
virtual void Run() {
void Run() override {
asm_quant_(&vp8_comp_->mb.block[0], &macroblockd_dst_->block[0]);
}
+8 -3
@@ -9,11 +9,12 @@
*/
#include <limits.h>
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/util.h"
#include "test/video_source.h"
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "vpx_config.h"
namespace {
@@ -26,7 +27,7 @@ class RealtimeTest
public ::libvpx_test::CodecTestWithParam<libvpx_test::TestMode> {
protected:
RealtimeTest() : EncoderTest(GET_PARAM(0)), frame_packets_(0) {}
~RealtimeTest() override {}
~RealtimeTest() override = default;
void SetUp() override {
InitializeConfig();
@@ -94,8 +95,11 @@ TEST_P(RealtimeTest, RealtimeDefaultCpuUsed) {
TEST_P(RealtimeTest, IntegerOverflow) { TestIntegerOverflow(2048, 2048); }
TEST_P(RealtimeTest, IntegerOverflowLarge) {
#ifdef CHROMIUM
GTEST_SKIP() << "16K framebuffers are not supported by Chromium's allocator.";
#else
if (IsVP9()) {
#if VPX_ARCH_X86_64
#if VPX_ARCH_AARCH64 || VPX_ARCH_X86_64
TestIntegerOverflow(16384, 16384);
#else
TestIntegerOverflow(4096, 4096);
@@ -107,6 +111,7 @@ TEST_P(RealtimeTest, IntegerOverflowLarge) {
"warnings are fixed.";
// TestIntegerOverflow(16383, 16383);
}
#endif // defined(CHROMIUM)
}
VP8_INSTANTIATE_TEST_SUITE(RealtimeTest,
+8 -8
@@ -11,7 +11,7 @@
#ifndef VPX_TEST_REGISTER_STATE_CHECK_H_
#define VPX_TEST_REGISTER_STATE_CHECK_H_
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vpx_config.h"
#include "vpx/vpx_integer.h"
@@ -184,13 +184,13 @@ class RegisterStateCheckMMX {
uint16_t pre_fpu_env_[14];
};
#define API_REGISTER_STATE_CHECK(statement) \
do { \
{ \
libvpx_test::RegisterStateCheckMMX reg_check; \
ASM_REGISTER_STATE_CHECK(statement); \
} \
__asm__ volatile("" ::: "memory"); \
#define API_REGISTER_STATE_CHECK(statement) \
do { \
{ \
libvpx_test::RegisterStateCheckMMX reg_check_mmx; \
ASM_REGISTER_STATE_CHECK(statement); \
} \
__asm__ volatile("" ::: "memory"); \
} while (false)
} // namespace libvpx_test
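The hunk above renames the local in `API_REGISTER_STATE_CHECK` from `reg_check` to `reg_check_mmx`. The diff does not state the reason. One plausible motive, assumed here, is to avoid shadowing a variable of the same name declared by the nested `ASM_REGISTER_STATE_CHECK` macro. The sketch below illustrates that pattern with invented names (`DepthGuard`, `INNER_CHECK`, `OUTER_CHECK`): giving the outer guard a distinct name keeps the nested expansion free of shadowing warnings.

```cpp
#include <cassert>

// Hypothetical RAII guard that tracks how many guards are currently alive.
class DepthGuard {
 public:
  explicit DepthGuard(int *depth) : depth_(depth) { ++*depth_; }
  ~DepthGuard() { --*depth_; }

 private:
  int *depth_;
};

// Inner macro: declares a local named `guard` around the statement.
#define INNER_CHECK(depth, statement) \
  do {                                \
    DepthGuard guard(depth);          \
    statement;                        \
  } while (false)

// Outer macro: its local uses a different name, so the `guard` declared by
// INNER_CHECK does not shadow it.
#define OUTER_CHECK(depth, statement) \
  do {                                \
    {                                 \
      DepthGuard guard_outer(depth);  \
      INNER_CHECK(depth, statement);  \
    }                                 \
  } while (false)

// Returns the number of live guards observed while the statement runs.
int DepthSeenInsideChecks() {
  int depth = 0;
  int seen = 0;
  OUTER_CHECK(&depth, seen = depth);
  return seen;
}
```

Both guards are alive while the statement executes, and both are destroyed before the macro returns control to the caller.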
+42 -46
@@ -7,16 +7,15 @@
* in the file PATENTS. All contributing project authors may
* be found in the AUTHORS file in the root of the source tree.
*/
#include <stdio.h>
#include <climits>
#include <vector>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/codec_factory.h"
#include "test/encode_test_driver.h"
#include "test/i420_video_source.h"
#include "test/video_source.h"
#include "test/util.h"
#include "test/video_source.h"
#include "vpx_config.h"
// Enable(1) or Disable(0) writing of the compressed bitstream.
#define WRITE_COMPRESSED_STREAM 0
@@ -102,11 +101,8 @@ void ScaleForFrameNumber(unsigned int frame, unsigned int initial_w,
if (frame < 30) {
return;
}
if (frame < 100) {
*w = initial_w * 7 / 10;
*h = initial_h * 16 / 10;
return;
}
*w = initial_w * 7 / 10;
*h = initial_h * 16 / 10;
return;
}
if (frame < 10) {
@@ -247,10 +243,10 @@ class ResizingVideoSource : public ::libvpx_test::DummyVideoSource {
}
bool flag_codec_;
bool smaller_width_larger_size_;
virtual ~ResizingVideoSource() {}
~ResizingVideoSource() override = default;
protected:
virtual void Next() {
void Next() override {
++frame_;
unsigned int width = 0;
unsigned int height = 0;
@@ -267,14 +263,14 @@ class ResizeTest
protected:
ResizeTest() : EncoderTest(GET_PARAM(0)) {}
virtual ~ResizeTest() {}
~ResizeTest() override = default;
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(GET_PARAM(1));
}
virtual void FramePktHook(const vpx_codec_cx_pkt_t *pkt) {
void FramePktHook(const vpx_codec_cx_pkt_t *pkt) override {
ASSERT_NE(static_cast<int>(pkt->data.frame.width[0]), 0);
ASSERT_NE(static_cast<int>(pkt->data.frame.height[0]), 0);
encode_frame_width_.push_back(pkt->data.frame.width[0]);
@@ -289,8 +285,8 @@ class ResizeTest
return encode_frame_height_[idx];
}
virtual void DecompressedFrameHook(const vpx_image_t &img,
vpx_codec_pts_t pts) {
void DecompressedFrameHook(const vpx_image_t &img,
vpx_codec_pts_t pts) override {
frame_info_list_.push_back(FrameInfo(pts, img.d_w, img.d_h));
}
@@ -336,15 +332,15 @@ class ResizeInternalTest : public ResizeTest {
ResizeInternalTest() : ResizeTest(), frame0_psnr_(0.0) {}
#endif
virtual ~ResizeInternalTest() {}
~ResizeInternalTest() override = default;
virtual void BeginPassHook(unsigned int /*pass*/) {
void BeginPassHook(unsigned int /*pass*/) override {
#if WRITE_COMPRESSED_STREAM
outfile_ = fopen("vp90-2-05-resize.ivf", "wb");
#endif
}
virtual void EndPassHook() {
void EndPassHook() override {
#if WRITE_COMPRESSED_STREAM
if (outfile_) {
if (!fseek(outfile_, 0, SEEK_SET))
@@ -355,8 +351,8 @@ class ResizeInternalTest : public ResizeTest {
#endif
}
virtual void PreEncodeFrameHook(libvpx_test::VideoSource *video,
libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(libvpx_test::VideoSource *video,
libvpx_test::Encoder *encoder) override {
if (change_config_) {
int new_q = 60;
if (video->frame() == 0) {
@@ -381,13 +377,13 @@ class ResizeInternalTest : public ResizeTest {
}
}
virtual void PSNRPktHook(const vpx_codec_cx_pkt_t *pkt) {
void PSNRPktHook(const vpx_codec_cx_pkt_t *pkt) override {
if (frame0_psnr_ == 0.) frame0_psnr_ = pkt->data.psnr.psnr[0];
EXPECT_NEAR(pkt->data.psnr.psnr[0], frame0_psnr_, 2.0);
}
#if WRITE_COMPRESSED_STREAM
virtual void FramePktHook(const vpx_codec_cx_pkt_t *pkt) {
void FramePktHook(const vpx_codec_cx_pkt_t *pkt) override {
++out_frames_;
// Write initial file header if first frame.
@@ -450,10 +446,10 @@ class ResizeRealtimeTest
public ::libvpx_test::CodecTestWith2Params<libvpx_test::TestMode, int> {
protected:
ResizeRealtimeTest() : EncoderTest(GET_PARAM(0)) {}
virtual ~ResizeRealtimeTest() {}
~ResizeRealtimeTest() override = default;
virtual void PreEncodeFrameHook(libvpx_test::VideoSource *video,
libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(libvpx_test::VideoSource *video,
libvpx_test::Encoder *encoder) override {
if (video->frame() == 0) {
encoder->Control(VP9E_SET_AQ_MODE, 3);
encoder->Control(VP8E_SET_CPUUSED, set_cpu_used_);
@@ -466,24 +462,24 @@ class ResizeRealtimeTest
}
}
virtual void SetUp() {
void SetUp() override {
InitializeConfig();
SetMode(GET_PARAM(1));
set_cpu_used_ = GET_PARAM(2);
}
virtual void DecompressedFrameHook(const vpx_image_t &img,
vpx_codec_pts_t pts) {
void DecompressedFrameHook(const vpx_image_t &img,
vpx_codec_pts_t pts) override {
frame_info_list_.push_back(FrameInfo(pts, img.d_w, img.d_h));
}
virtual void MismatchHook(const vpx_image_t *img1, const vpx_image_t *img2) {
void MismatchHook(const vpx_image_t *img1, const vpx_image_t *img2) override {
double mismatch_psnr = compute_psnr(img1, img2);
mismatch_psnr_ += mismatch_psnr;
++mismatch_nframes_;
}
virtual void FramePktHook(const vpx_codec_cx_pkt_t *pkt) {
void FramePktHook(const vpx_codec_cx_pkt_t *pkt) override {
ASSERT_NE(static_cast<int>(pkt->data.frame.width[0]), 0);
ASSERT_NE(static_cast<int>(pkt->data.frame.height[0]), 0);
encode_frame_width_.push_back(pkt->data.frame.width[0]);
@@ -559,9 +555,7 @@ TEST_P(ResizeRealtimeTest, TestExternalResizeWorks) {
}
}
// TODO(https://crbug.com/webm/1642): This causes a segfault in
// init_encode_frame_mb_context().
TEST_P(ResizeRealtimeTest, DISABLED_TestExternalResizeSmallerWidthBiggerSize) {
TEST_P(ResizeRealtimeTest, TestExternalResizeSmallerWidthBiggerSize) {
ResizingVideoSource video;
video.flag_codec_ = true;
video.smaller_width_larger_size_ = true;
@@ -603,6 +597,7 @@ TEST_P(ResizeRealtimeTest, TestInternalResizeDown) {
mismatch_nframes_ = 0;
ASSERT_NO_FATAL_FAILURE(RunLoop(&video));
#if CONFIG_VP9_DECODER
unsigned int last_w = cfg_.g_w;
unsigned int last_h = cfg_.g_h;
int resize_count = 0;
@@ -618,12 +613,12 @@ TEST_P(ResizeRealtimeTest, TestInternalResizeDown) {
}
}
#if CONFIG_VP9_DECODER
// Verify that we get 1 resize down event in this test.
ASSERT_EQ(1, resize_count) << "Resizing should occur.";
EXPECT_EQ(static_cast<unsigned int>(0), GetMismatchFrames());
#else
printf("Warning: VP9 decoder unavailable, unable to check resize count!\n");
GTEST_SKIP()
<< "Warning: VP9 decoder unavailable, unable to check resize count!\n";
#endif
}
@@ -674,7 +669,8 @@ TEST_P(ResizeRealtimeTest, TestInternalResizeDownUpChangeBitRate) {
ASSERT_EQ(resize_count, 4) << "Resizing should occur twice.";
EXPECT_EQ(static_cast<unsigned int>(0), GetMismatchFrames());
#else
printf("Warning: VP9 decoder unavailable, unable to check resize count!\n");
GTEST_SKIP()
<< "Warning: VP9 decoder unavailable, unable to check resize count!\n";
#endif
}
@@ -693,15 +689,15 @@ class ResizeCspTest : public ResizeTest {
ResizeCspTest() : ResizeTest(), frame0_psnr_(0.0) {}
#endif
virtual ~ResizeCspTest() {}
~ResizeCspTest() override = default;
virtual void BeginPassHook(unsigned int /*pass*/) {
void BeginPassHook(unsigned int /*pass*/) override {
#if WRITE_COMPRESSED_STREAM
outfile_ = fopen("vp91-2-05-cspchape.ivf", "wb");
#endif
}
virtual void EndPassHook() {
void EndPassHook() override {
#if WRITE_COMPRESSED_STREAM
if (outfile_) {
if (!fseek(outfile_, 0, SEEK_SET))
@@ -712,8 +708,8 @@ class ResizeCspTest : public ResizeTest {
#endif
}
virtual void PreEncodeFrameHook(libvpx_test::VideoSource *video,
libvpx_test::Encoder *encoder) {
void PreEncodeFrameHook(libvpx_test::VideoSource *video,
libvpx_test::Encoder *encoder) override {
if (CspForFrameNumber(video->frame()) != VPX_IMG_FMT_I420 &&
cfg_.g_profile != 1) {
cfg_.g_profile = 1;
@@ -726,13 +722,13 @@ class ResizeCspTest : public ResizeTest {
}
}
virtual void PSNRPktHook(const vpx_codec_cx_pkt_t *pkt) {
void PSNRPktHook(const vpx_codec_cx_pkt_t *pkt) override {
if (frame0_psnr_ == 0.) frame0_psnr_ = pkt->data.psnr.psnr[0];
EXPECT_NEAR(pkt->data.psnr.psnr[0], frame0_psnr_, 2.0);
}
#if WRITE_COMPRESSED_STREAM
virtual void FramePktHook(const vpx_codec_cx_pkt_t *pkt) {
void FramePktHook(const vpx_codec_cx_pkt_t *pkt) override {
++out_frames_;
// Write initial file header if first frame.
@@ -758,10 +754,10 @@ class ResizingCspVideoSource : public ::libvpx_test::DummyVideoSource {
limit_ = 30;
}
virtual ~ResizingCspVideoSource() {}
~ResizingCspVideoSource() override = default;
protected:
virtual void Next() {
void Next() override {
++frame_;
SetImageFormat(CspForFrameNumber(frame_));
FillFrame();
-69
@@ -1,69 +0,0 @@
#!/bin/sh
##
## Copyright (c) 2014 The WebM project authors. All Rights Reserved.
##
## Use of this source code is governed by a BSD-style license
## that can be found in the LICENSE file in the root of the source
## tree. An additional intellectual property rights grant can be found
## in the file PATENTS. All contributing project authors may
## be found in the AUTHORS file in the root of the source tree.
##
## This file tests the libvpx resize_util example code. To add new tests to
## this file, do the following:
## 1. Write a shell function (this is your test).
## 2. Add the function to resize_util_tests (on a new line).
##
. $(dirname $0)/tools_common.sh
# Environment check: $YUV_RAW_INPUT is required.
resize_util_verify_environment() {
if [ ! -e "${YUV_RAW_INPUT}" ]; then
echo "Libvpx test data must exist in LIBVPX_TEST_DATA_PATH."
return 1
fi
}
# Resizes $YUV_RAW_INPUT using the resize_util example. $1 is the output
# dimensions that will be passed to resize_util.
resize_util() {
local resizer="${LIBVPX_BIN_PATH}/resize_util${VPX_TEST_EXE_SUFFIX}"
local output_file="${VPX_TEST_OUTPUT_DIR}/resize_util.raw"
local frames_to_resize="10"
local target_dimensions="$1"
# resize_util is available only when CONFIG_SHARED is disabled.
if [ -z "$(vpx_config_option_enabled CONFIG_SHARED)" ]; then
if [ ! -x "${resizer}" ]; then
elog "${resizer} does not exist or is not executable."
return 1
fi
eval "${VPX_TEST_PREFIX}" "${resizer}" "${YUV_RAW_INPUT}" \
"${YUV_RAW_INPUT_WIDTH}x${YUV_RAW_INPUT_HEIGHT}" \
"${target_dimensions}" "${output_file}" ${frames_to_resize} \
${devnull} || return 1
[ -e "${output_file}" ] || return 1
fi
}
# Halves each dimension of $YUV_RAW_INPUT using resize_util().
resize_down() {
local target_width=$((${YUV_RAW_INPUT_WIDTH} / 2))
local target_height=$((${YUV_RAW_INPUT_HEIGHT} / 2))
resize_util "${target_width}x${target_height}"
}
# Doubles each dimension of $YUV_RAW_INPUT using resize_util().
resize_up() {
local target_width=$((${YUV_RAW_INPUT_WIDTH} * 2))
local target_height=$((${YUV_RAW_INPUT_HEIGHT} * 2))
resize_util "${target_width}x${target_height}"
}
resize_util_tests="resize_down
resize_up"
run_tests resize_util_verify_environment "${resize_util_tests}"
+709 -4
@@ -8,10 +8,11 @@
* be found in the AUTHORS file in the root of the source tree.
*/
#include <stdio.h>
#include <string.h>
#include <limits.h>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "./vpx_config.h"
#include "./vpx_dsp_rtcd.h"
@@ -23,7 +24,6 @@
#include "vpx/vpx_codec.h"
#include "vpx_mem/vpx_mem.h"
#include "vpx_ports/mem.h"
#include "vpx_ports/msvc.h"
#include "vpx_ports/vpx_timer.h"
// const[expr] should be sufficient for DECLARE_ALIGNED but early
@@ -42,6 +42,10 @@ typedef unsigned int (*SadMxNFunc)(const uint8_t *src_ptr, int src_stride,
const uint8_t *ref_ptr, int ref_stride);
typedef TestParams<SadMxNFunc> SadMxNParam;
typedef unsigned int (*SadSkipMxNFunc)(const uint8_t *src_ptr, int src_stride,
const uint8_t *ref_ptr, int ref_stride);
typedef TestParams<SadSkipMxNFunc> SadSkipMxNParam;
typedef unsigned int (*SadMxNAvgFunc)(const uint8_t *src_ptr, int src_stride,
const uint8_t *ref_ptr, int ref_stride,
const uint8_t *second_pred);
@@ -52,6 +56,11 @@ typedef void (*SadMxNx4Func)(const uint8_t *src_ptr, int src_stride,
unsigned int *sad_array);
typedef TestParams<SadMxNx4Func> SadMxNx4Param;
typedef void (*SadSkipMxNx4Func)(const uint8_t *src_ptr, int src_stride,
const uint8_t *const ref_ptr[], int ref_stride,
unsigned int *sad_array);
typedef TestParams<SadSkipMxNx4Func> SadSkipMxNx4Param;
typedef void (*SadMxNx8Func)(const uint8_t *src_ptr, int src_stride,
const uint8_t *ref_ptr, int ref_stride,
unsigned int *sad_array);
@@ -64,7 +73,7 @@ class SADTestBase : public ::testing::TestWithParam<ParamType> {
public:
explicit SADTestBase(const ParamType &params) : params_(params) {}
virtual void SetUp() {
void SetUp() override {
source_data8_ = reinterpret_cast<uint8_t *>(
vpx_memalign(kDataAlignment, kDataBlockSize));
reference_data8_ = reinterpret_cast<uint8_t *>(
@@ -99,7 +108,7 @@ class SADTestBase : public ::testing::TestWithParam<ParamType> {
rnd_.Reset(ACMRandom::DeterministicSeed());
}
virtual void TearDown() {
void TearDown() override {
vpx_free(source_data8_);
source_data8_ = nullptr;
vpx_free(reference_data8_);
@@ -170,6 +179,34 @@ class SADTestBase : public ::testing::TestWithParam<ParamType> {
return sad;
}
// Sum of Absolute Differences Skip rows. Given two blocks, calculate the
// absolute difference between two pixels in the same relative location every
// other row; accumulate and double the result at the end.
uint32_t ReferenceSADSkip(int ref_offset) const {
uint32_t sad = 0;
const uint8_t *const reference8 = GetReferenceFromOffset(ref_offset);
const uint8_t *const source8 = source_data_;
#if CONFIG_VP9_HIGHBITDEPTH
const uint16_t *const reference16 =
CONVERT_TO_SHORTPTR(GetReferenceFromOffset(ref_offset));
const uint16_t *const source16 = CONVERT_TO_SHORTPTR(source_data_);
#endif // CONFIG_VP9_HIGHBITDEPTH
for (int h = 0; h < params_.height; h += 2) {
for (int w = 0; w < params_.width; ++w) {
if (!use_high_bit_depth_) {
sad += abs(source8[h * source_stride_ + w] -
reference8[h * reference_stride_ + w]);
#if CONFIG_VP9_HIGHBITDEPTH
} else {
sad += abs(source16[h * source_stride_ + w] -
reference16[h * reference_stride_ + w]);
#endif // CONFIG_VP9_HIGHBITDEPTH
}
}
}
return sad * 2;
}
// Sum of Absolute Differences Average. Given two blocks, and a prediction
// calculate the absolute difference between one pixel and average of the
// corresponding and predicted pixels; accumulate.
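The skip-row reference added in this hunk (ReferenceSADSkip) can be sketched outside the test fixture as follows. This is a minimal standalone illustration for 8-bit data only; the function name SadSkipRows and its explicit width/height parameters are hypothetical and are not part of libvpx, which takes the block size from the test parameters instead.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical helper (not a libvpx function): sums absolute differences
// over every other row (0, 2, 4, ...) and doubles the total, so the result
// approximates the SAD of the full block at roughly half the work.
static uint32_t SadSkipRows(const uint8_t *src, int src_stride,
                            const uint8_t *ref, int ref_stride, int width,
                            int height) {
  uint32_t sad = 0;
  for (int h = 0; h < height; h += 2) {  // visit even rows only
    for (int w = 0; w < width; ++w) {
      // uint8_t operands promote to int, so the subtraction cannot wrap.
      sad += static_cast<uint32_t>(
          std::abs(src[h * src_stride + w] - ref[h * ref_stride + w]));
    }
  }
  return sad * 2;  // scale back up to a full-block estimate
}
```

For a block whose per-pixel difference is uniform, the doubled half-sum equals the ordinary SAD; for general content it is only an estimate, which is why the tests compare the optimized skip functions against this skip-row reference rather than against the full SAD.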
@@ -290,6 +327,32 @@ class SADx4Test : public SADTestBase<SadMxNx4Param> {
}
};
class SADSkipx4Test : public SADTestBase<SadMxNx4Param> {
public:
SADSkipx4Test() : SADTestBase(GetParam()) {}
protected:
void SADs(unsigned int *results) const {
const uint8_t *references[] = { GetReference(0), GetReference(1),
GetReference(2), GetReference(3) };
ASM_REGISTER_STATE_CHECK(params_.func(
source_data_, source_stride_, references, reference_stride_, results));
}
void CheckSADs() const {
uint32_t reference_sad;
DECLARE_ALIGNED(kDataAlignment, uint32_t, exp_sad[4]);
SADs(exp_sad);
for (int block = 0; block < 4; ++block) {
reference_sad = ReferenceSADSkip(GetBlockRefOffset(block));
EXPECT_EQ(reference_sad, exp_sad[block]) << "block " << block;
}
}
};
class SADTest : public AbstractBench, public SADTestBase<SadMxNParam> {
public:
SADTest() : SADTestBase(GetParam()) {}
@@ -317,6 +380,33 @@ class SADTest : public AbstractBench, public SADTestBase<SadMxNParam> {
}
};
class SADSkipTest : public AbstractBench, public SADTestBase<SadMxNParam> {
public:
SADSkipTest() : SADTestBase(GetParam()) {}
protected:
unsigned int SAD(int block_idx) const {
unsigned int ret;
const uint8_t *const reference = GetReference(block_idx);
ASM_REGISTER_STATE_CHECK(ret = params_.func(source_data_, source_stride_,
reference, reference_stride_));
return ret;
}
void CheckSAD() const {
const unsigned int reference_sad = ReferenceSADSkip(GetBlockRefOffset(0));
const unsigned int exp_sad = SAD(0);
ASSERT_EQ(reference_sad, exp_sad);
}
void Run() override {
params_.func(source_data_, source_stride_, reference_data_,
reference_stride_);
}
};
class SADavgTest : public AbstractBench, public SADTestBase<SadMxNAvgParam> {
public:
SADavgTest() : SADTestBase(GetParam()) {}
@@ -397,6 +487,58 @@ TEST_P(SADTest, DISABLED_Speed) {
PrintMedian(title);
}
TEST_P(SADSkipTest, MaxRef) {
FillConstant(source_data_, source_stride_, 0);
FillConstant(reference_data_, reference_stride_, mask_);
CheckSAD();
}
TEST_P(SADSkipTest, MaxSrc) {
FillConstant(source_data_, source_stride_, mask_);
FillConstant(reference_data_, reference_stride_, 0);
CheckSAD();
}
TEST_P(SADSkipTest, ShortRef) {
const int tmp_stride = reference_stride_;
reference_stride_ >>= 1;
FillRandom(source_data_, source_stride_);
FillRandom(reference_data_, reference_stride_);
CheckSAD();
reference_stride_ = tmp_stride;
}
TEST_P(SADSkipTest, UnalignedRef) {
// The reference frame, but not the source frame, may be unaligned for
// certain types of searches.
const int tmp_stride = reference_stride_;
reference_stride_ -= 1;
FillRandom(source_data_, source_stride_);
FillRandom(reference_data_, reference_stride_);
CheckSAD();
reference_stride_ = tmp_stride;
}
TEST_P(SADSkipTest, ShortSrc) {
const int tmp_stride = source_stride_;
source_stride_ >>= 1;
FillRandom(source_data_, source_stride_);
FillRandom(reference_data_, reference_stride_);
CheckSAD();
source_stride_ = tmp_stride;
}
TEST_P(SADSkipTest, DISABLED_Speed) {
const int kCountSpeedTestBlock = 50000000 / (params_.width * params_.height);
FillRandom(source_data_, source_stride_);
RunNTimes(kCountSpeedTestBlock);
char title[16];
snprintf(title, sizeof(title), "%dx%d", params_.width, params_.height);
PrintMedian(title);
}
TEST_P(SADavgTest, MaxRef) {
FillConstant(source_data_, source_stride_, 0);
FillConstant(reference_data_, reference_stride_, mask_);
@@ -554,6 +696,105 @@ TEST_P(SADx4Test, DISABLED_Speed) {
reference_stride_ = tmp_stride;
}
TEST_P(SADSkipx4Test, MaxRef) {
FillConstant(source_data_, source_stride_, 0);
FillConstant(GetReference(0), reference_stride_, mask_);
FillConstant(GetReference(1), reference_stride_, mask_);
FillConstant(GetReference(2), reference_stride_, mask_);
FillConstant(GetReference(3), reference_stride_, mask_);
CheckSADs();
}
TEST_P(SADSkipx4Test, MaxSrc) {
FillConstant(source_data_, source_stride_, mask_);
FillConstant(GetReference(0), reference_stride_, 0);
FillConstant(GetReference(1), reference_stride_, 0);
FillConstant(GetReference(2), reference_stride_, 0);
FillConstant(GetReference(3), reference_stride_, 0);
CheckSADs();
}
TEST_P(SADSkipx4Test, ShortRef) {
int tmp_stride = reference_stride_;
reference_stride_ >>= 1;
FillRandom(source_data_, source_stride_);
FillRandom(GetReference(0), reference_stride_);
FillRandom(GetReference(1), reference_stride_);
FillRandom(GetReference(2), reference_stride_);
FillRandom(GetReference(3), reference_stride_);
CheckSADs();
reference_stride_ = tmp_stride;
}
TEST_P(SADSkipx4Test, UnalignedRef) {
// The reference frame, but not the source frame, may be unaligned for
// certain types of searches.
int tmp_stride = reference_stride_;
reference_stride_ -= 1;
FillRandom(source_data_, source_stride_);
FillRandom(GetReference(0), reference_stride_);
FillRandom(GetReference(1), reference_stride_);
FillRandom(GetReference(2), reference_stride_);
FillRandom(GetReference(3), reference_stride_);
CheckSADs();
reference_stride_ = tmp_stride;
}
TEST_P(SADSkipx4Test, ShortSrc) {
int tmp_stride = source_stride_;
source_stride_ >>= 1;
FillRandom(source_data_, source_stride_);
FillRandom(GetReference(0), reference_stride_);
FillRandom(GetReference(1), reference_stride_);
FillRandom(GetReference(2), reference_stride_);
FillRandom(GetReference(3), reference_stride_);
CheckSADs();
source_stride_ = tmp_stride;
}
TEST_P(SADSkipx4Test, SrcAlignedByWidth) {
uint8_t *tmp_source_data = source_data_;
source_data_ += params_.width;
FillRandom(source_data_, source_stride_);
FillRandom(GetReference(0), reference_stride_);
FillRandom(GetReference(1), reference_stride_);
FillRandom(GetReference(2), reference_stride_);
FillRandom(GetReference(3), reference_stride_);
CheckSADs();
source_data_ = tmp_source_data;
}
TEST_P(SADSkipx4Test, DISABLED_Speed) {
int tmp_stride = reference_stride_;
reference_stride_ -= 1;
FillRandom(source_data_, source_stride_);
FillRandom(GetReference(0), reference_stride_);
FillRandom(GetReference(1), reference_stride_);
FillRandom(GetReference(2), reference_stride_);
FillRandom(GetReference(3), reference_stride_);
const int kCountSpeedTestBlock = 500000000 / (params_.width * params_.height);
uint32_t reference_sad[4];
DECLARE_ALIGNED(kDataAlignment, uint32_t, exp_sad[4]);
vpx_usec_timer timer;
for (int block = 0; block < 4; ++block) {
reference_sad[block] = ReferenceSADSkip(GetBlockRefOffset(block));
}
vpx_usec_timer_start(&timer);
for (int i = 0; i < kCountSpeedTestBlock; ++i) {
SADs(exp_sad);
}
vpx_usec_timer_mark(&timer);
for (int block = 0; block < 4; ++block) {
EXPECT_EQ(reference_sad[block], exp_sad[block]) << "block " << block;
}
const int elapsed_time =
static_cast<int>(vpx_usec_timer_elapsed(&timer) / 1000);
printf("sad%dx%dx4 (%2dbit) time: %5d ms\n", params_.width, params_.height,
bit_depth_, elapsed_time);
reference_stride_ = tmp_stride;
}
//------------------------------------------------------------------------------
// C functions
const SadMxNParam c_tests[] = {
@@ -614,6 +855,56 @@ const SadMxNParam c_tests[] = {
};
INSTANTIATE_TEST_SUITE_P(C, SADTest, ::testing::ValuesIn(c_tests));
const SadSkipMxNParam skip_c_tests[] = {
SadSkipMxNParam(64, 64, &vpx_sad_skip_64x64_c),
SadSkipMxNParam(64, 32, &vpx_sad_skip_64x32_c),
SadSkipMxNParam(32, 64, &vpx_sad_skip_32x64_c),
SadSkipMxNParam(32, 32, &vpx_sad_skip_32x32_c),
SadSkipMxNParam(32, 16, &vpx_sad_skip_32x16_c),
SadSkipMxNParam(16, 32, &vpx_sad_skip_16x32_c),
SadSkipMxNParam(16, 16, &vpx_sad_skip_16x16_c),
SadSkipMxNParam(16, 8, &vpx_sad_skip_16x8_c),
SadSkipMxNParam(8, 16, &vpx_sad_skip_8x16_c),
SadSkipMxNParam(8, 8, &vpx_sad_skip_8x8_c),
SadSkipMxNParam(4, 8, &vpx_sad_skip_4x8_c),
#if CONFIG_VP9_HIGHBITDEPTH
SadSkipMxNParam(64, 64, &vpx_highbd_sad_skip_64x64_c, 8),
SadSkipMxNParam(64, 32, &vpx_highbd_sad_skip_64x32_c, 8),
SadSkipMxNParam(32, 64, &vpx_highbd_sad_skip_32x64_c, 8),
SadSkipMxNParam(32, 32, &vpx_highbd_sad_skip_32x32_c, 8),
SadSkipMxNParam(32, 16, &vpx_highbd_sad_skip_32x16_c, 8),
SadSkipMxNParam(16, 32, &vpx_highbd_sad_skip_16x32_c, 8),
SadSkipMxNParam(16, 16, &vpx_highbd_sad_skip_16x16_c, 8),
SadSkipMxNParam(16, 8, &vpx_highbd_sad_skip_16x8_c, 8),
SadSkipMxNParam(8, 16, &vpx_highbd_sad_skip_8x16_c, 8),
SadSkipMxNParam(8, 8, &vpx_highbd_sad_skip_8x8_c, 8),
SadSkipMxNParam(4, 8, &vpx_highbd_sad_skip_4x8_c, 8),
SadSkipMxNParam(64, 64, &vpx_highbd_sad_skip_64x64_c, 10),
SadSkipMxNParam(64, 32, &vpx_highbd_sad_skip_64x32_c, 10),
SadSkipMxNParam(32, 64, &vpx_highbd_sad_skip_32x64_c, 10),
SadSkipMxNParam(32, 32, &vpx_highbd_sad_skip_32x32_c, 10),
SadSkipMxNParam(32, 16, &vpx_highbd_sad_skip_32x16_c, 10),
SadSkipMxNParam(16, 32, &vpx_highbd_sad_skip_16x32_c, 10),
SadSkipMxNParam(16, 16, &vpx_highbd_sad_skip_16x16_c, 10),
SadSkipMxNParam(16, 8, &vpx_highbd_sad_skip_16x8_c, 10),
SadSkipMxNParam(8, 16, &vpx_highbd_sad_skip_8x16_c, 10),
SadSkipMxNParam(8, 8, &vpx_highbd_sad_skip_8x8_c, 10),
SadSkipMxNParam(4, 8, &vpx_highbd_sad_skip_4x8_c, 10),
SadSkipMxNParam(64, 64, &vpx_highbd_sad_skip_64x64_c, 12),
SadSkipMxNParam(64, 32, &vpx_highbd_sad_skip_64x32_c, 12),
SadSkipMxNParam(32, 64, &vpx_highbd_sad_skip_32x64_c, 12),
SadSkipMxNParam(32, 32, &vpx_highbd_sad_skip_32x32_c, 12),
SadSkipMxNParam(32, 16, &vpx_highbd_sad_skip_32x16_c, 12),
SadSkipMxNParam(16, 32, &vpx_highbd_sad_skip_16x32_c, 12),
SadSkipMxNParam(16, 16, &vpx_highbd_sad_skip_16x16_c, 12),
SadSkipMxNParam(16, 8, &vpx_highbd_sad_skip_16x8_c, 12),
SadSkipMxNParam(8, 16, &vpx_highbd_sad_skip_8x16_c, 12),
SadSkipMxNParam(8, 8, &vpx_highbd_sad_skip_8x8_c, 12),
SadSkipMxNParam(4, 8, &vpx_highbd_sad_skip_4x8_c, 12),
#endif // CONFIG_VP9_HIGHBITDEPTH
};
INSTANTIATE_TEST_SUITE_P(C, SADSkipTest, ::testing::ValuesIn(skip_c_tests));
const SadMxNAvgParam avg_c_tests[] = {
SadMxNAvgParam(64, 64, &vpx_sad64x64_avg_c),
SadMxNAvgParam(64, 32, &vpx_sad64x32_avg_c),
@@ -730,6 +1021,57 @@ const SadMxNx4Param x4d_c_tests[] = {
};
INSTANTIATE_TEST_SUITE_P(C, SADx4Test, ::testing::ValuesIn(x4d_c_tests));
const SadSkipMxNx4Param skip_x4d_c_tests[] = {
SadSkipMxNx4Param(64, 64, &vpx_sad_skip_64x64x4d_c),
SadSkipMxNx4Param(64, 32, &vpx_sad_skip_64x32x4d_c),
SadSkipMxNx4Param(32, 64, &vpx_sad_skip_32x64x4d_c),
SadSkipMxNx4Param(32, 32, &vpx_sad_skip_32x32x4d_c),
SadSkipMxNx4Param(32, 16, &vpx_sad_skip_32x16x4d_c),
SadSkipMxNx4Param(16, 32, &vpx_sad_skip_16x32x4d_c),
SadSkipMxNx4Param(16, 16, &vpx_sad_skip_16x16x4d_c),
SadSkipMxNx4Param(16, 8, &vpx_sad_skip_16x8x4d_c),
SadSkipMxNx4Param(8, 16, &vpx_sad_skip_8x16x4d_c),
SadSkipMxNx4Param(8, 8, &vpx_sad_skip_8x8x4d_c),
SadSkipMxNx4Param(4, 8, &vpx_sad_skip_4x8x4d_c),
#if CONFIG_VP9_HIGHBITDEPTH
SadSkipMxNx4Param(64, 64, &vpx_highbd_sad_skip_64x64x4d_c, 8),
SadSkipMxNx4Param(64, 32, &vpx_highbd_sad_skip_64x32x4d_c, 8),
SadSkipMxNx4Param(32, 64, &vpx_highbd_sad_skip_32x64x4d_c, 8),
SadSkipMxNx4Param(32, 32, &vpx_highbd_sad_skip_32x32x4d_c, 8),
SadSkipMxNx4Param(32, 16, &vpx_highbd_sad_skip_32x16x4d_c, 8),
SadSkipMxNx4Param(16, 32, &vpx_highbd_sad_skip_16x32x4d_c, 8),
SadSkipMxNx4Param(16, 16, &vpx_highbd_sad_skip_16x16x4d_c, 8),
SadSkipMxNx4Param(16, 8, &vpx_highbd_sad_skip_16x8x4d_c, 8),
SadSkipMxNx4Param(8, 16, &vpx_highbd_sad_skip_8x16x4d_c, 8),
SadSkipMxNx4Param(8, 8, &vpx_highbd_sad_skip_8x8x4d_c, 8),
SadSkipMxNx4Param(4, 8, &vpx_highbd_sad_skip_4x8x4d_c, 8),
SadSkipMxNx4Param(64, 64, &vpx_highbd_sad_skip_64x64x4d_c, 10),
SadSkipMxNx4Param(64, 32, &vpx_highbd_sad_skip_64x32x4d_c, 10),
SadSkipMxNx4Param(32, 64, &vpx_highbd_sad_skip_32x64x4d_c, 10),
SadSkipMxNx4Param(32, 32, &vpx_highbd_sad_skip_32x32x4d_c, 10),
SadSkipMxNx4Param(32, 16, &vpx_highbd_sad_skip_32x16x4d_c, 10),
SadSkipMxNx4Param(16, 32, &vpx_highbd_sad_skip_16x32x4d_c, 10),
SadSkipMxNx4Param(16, 16, &vpx_highbd_sad_skip_16x16x4d_c, 10),
SadSkipMxNx4Param(16, 8, &vpx_highbd_sad_skip_16x8x4d_c, 10),
SadSkipMxNx4Param(8, 16, &vpx_highbd_sad_skip_8x16x4d_c, 10),
SadSkipMxNx4Param(8, 8, &vpx_highbd_sad_skip_8x8x4d_c, 10),
SadSkipMxNx4Param(4, 8, &vpx_highbd_sad_skip_4x8x4d_c, 10),
SadSkipMxNx4Param(64, 64, &vpx_highbd_sad_skip_64x64x4d_c, 12),
SadSkipMxNx4Param(64, 32, &vpx_highbd_sad_skip_64x32x4d_c, 12),
SadSkipMxNx4Param(32, 64, &vpx_highbd_sad_skip_32x64x4d_c, 12),
SadSkipMxNx4Param(32, 32, &vpx_highbd_sad_skip_32x32x4d_c, 12),
SadSkipMxNx4Param(32, 16, &vpx_highbd_sad_skip_32x16x4d_c, 12),
SadSkipMxNx4Param(16, 32, &vpx_highbd_sad_skip_16x32x4d_c, 12),
SadSkipMxNx4Param(16, 16, &vpx_highbd_sad_skip_16x16x4d_c, 12),
SadSkipMxNx4Param(16, 8, &vpx_highbd_sad_skip_16x8x4d_c, 12),
SadSkipMxNx4Param(8, 16, &vpx_highbd_sad_skip_8x16x4d_c, 12),
SadSkipMxNx4Param(8, 8, &vpx_highbd_sad_skip_8x8x4d_c, 12),
SadSkipMxNx4Param(4, 8, &vpx_highbd_sad_skip_4x8x4d_c, 12),
#endif // CONFIG_VP9_HIGHBITDEPTH
};
INSTANTIATE_TEST_SUITE_P(C, SADSkipx4Test,
::testing::ValuesIn(skip_x4d_c_tests));
//------------------------------------------------------------------------------
// ARM functions
#if HAVE_NEON
@@ -787,6 +1129,95 @@ const SadMxNParam neon_tests[] = {
};
INSTANTIATE_TEST_SUITE_P(NEON, SADTest, ::testing::ValuesIn(neon_tests));
#if HAVE_NEON_DOTPROD
const SadMxNParam neon_dotprod_tests[] = {
SadMxNParam(64, 64, &vpx_sad64x64_neon_dotprod),
SadMxNParam(64, 32, &vpx_sad64x32_neon_dotprod),
SadMxNParam(32, 64, &vpx_sad32x64_neon_dotprod),
SadMxNParam(32, 32, &vpx_sad32x32_neon_dotprod),
SadMxNParam(32, 16, &vpx_sad32x16_neon_dotprod),
SadMxNParam(16, 32, &vpx_sad16x32_neon_dotprod),
SadMxNParam(16, 16, &vpx_sad16x16_neon_dotprod),
SadMxNParam(16, 8, &vpx_sad16x8_neon_dotprod),
};
INSTANTIATE_TEST_SUITE_P(NEON_DOTPROD, SADTest,
::testing::ValuesIn(neon_dotprod_tests));
#endif // HAVE_NEON_DOTPROD
const SadSkipMxNParam skip_neon_tests[] = {
SadSkipMxNParam(64, 64, &vpx_sad_skip_64x64_neon),
SadSkipMxNParam(64, 32, &vpx_sad_skip_64x32_neon),
SadSkipMxNParam(32, 64, &vpx_sad_skip_32x64_neon),
SadSkipMxNParam(32, 32, &vpx_sad_skip_32x32_neon),
SadSkipMxNParam(32, 16, &vpx_sad_skip_32x16_neon),
SadSkipMxNParam(16, 32, &vpx_sad_skip_16x32_neon),
SadSkipMxNParam(16, 16, &vpx_sad_skip_16x16_neon),
SadSkipMxNParam(16, 8, &vpx_sad_skip_16x8_neon),
SadSkipMxNParam(8, 16, &vpx_sad_skip_8x16_neon),
SadSkipMxNParam(8, 8, &vpx_sad_skip_8x8_neon),
SadSkipMxNParam(8, 4, &vpx_sad_skip_8x4_neon),
SadSkipMxNParam(4, 8, &vpx_sad_skip_4x8_neon),
SadSkipMxNParam(4, 4, &vpx_sad_skip_4x4_neon),
#if CONFIG_VP9_HIGHBITDEPTH
SadSkipMxNParam(4, 4, &vpx_highbd_sad_skip_4x4_neon, 8),
SadSkipMxNParam(4, 8, &vpx_highbd_sad_skip_4x8_neon, 8),
SadSkipMxNParam(8, 4, &vpx_highbd_sad_skip_8x4_neon, 8),
SadSkipMxNParam(8, 8, &vpx_highbd_sad_skip_8x8_neon, 8),
SadSkipMxNParam(8, 16, &vpx_highbd_sad_skip_8x16_neon, 8),
SadSkipMxNParam(16, 8, &vpx_highbd_sad_skip_16x8_neon, 8),
SadSkipMxNParam(16, 16, &vpx_highbd_sad_skip_16x16_neon, 8),
SadSkipMxNParam(16, 32, &vpx_highbd_sad_skip_16x32_neon, 8),
SadSkipMxNParam(32, 16, &vpx_highbd_sad_skip_32x16_neon, 8),
SadSkipMxNParam(32, 32, &vpx_highbd_sad_skip_32x32_neon, 8),
SadSkipMxNParam(32, 64, &vpx_highbd_sad_skip_32x64_neon, 8),
SadSkipMxNParam(64, 32, &vpx_highbd_sad_skip_64x32_neon, 8),
SadSkipMxNParam(64, 64, &vpx_highbd_sad_skip_64x64_neon, 8),
SadSkipMxNParam(4, 4, &vpx_highbd_sad_skip_4x4_neon, 10),
SadSkipMxNParam(4, 8, &vpx_highbd_sad_skip_4x8_neon, 10),
SadSkipMxNParam(8, 4, &vpx_highbd_sad_skip_8x4_neon, 10),
SadSkipMxNParam(8, 8, &vpx_highbd_sad_skip_8x8_neon, 10),
SadSkipMxNParam(8, 16, &vpx_highbd_sad_skip_8x16_neon, 10),
SadSkipMxNParam(16, 8, &vpx_highbd_sad_skip_16x8_neon, 10),
SadSkipMxNParam(16, 16, &vpx_highbd_sad_skip_16x16_neon, 10),
SadSkipMxNParam(16, 32, &vpx_highbd_sad_skip_16x32_neon, 10),
SadSkipMxNParam(32, 16, &vpx_highbd_sad_skip_32x16_neon, 10),
SadSkipMxNParam(32, 32, &vpx_highbd_sad_skip_32x32_neon, 10),
SadSkipMxNParam(32, 64, &vpx_highbd_sad_skip_32x64_neon, 10),
SadSkipMxNParam(64, 32, &vpx_highbd_sad_skip_64x32_neon, 10),
SadSkipMxNParam(64, 64, &vpx_highbd_sad_skip_64x64_neon, 10),
SadSkipMxNParam(4, 4, &vpx_highbd_sad_skip_4x4_neon, 12),
SadSkipMxNParam(4, 8, &vpx_highbd_sad_skip_4x8_neon, 12),
SadSkipMxNParam(8, 4, &vpx_highbd_sad_skip_8x4_neon, 12),
SadSkipMxNParam(8, 8, &vpx_highbd_sad_skip_8x8_neon, 12),
SadSkipMxNParam(8, 16, &vpx_highbd_sad_skip_8x16_neon, 12),
SadSkipMxNParam(16, 8, &vpx_highbd_sad_skip_16x8_neon, 12),
SadSkipMxNParam(16, 16, &vpx_highbd_sad_skip_16x16_neon, 12),
SadSkipMxNParam(16, 32, &vpx_highbd_sad_skip_16x32_neon, 12),
SadSkipMxNParam(32, 16, &vpx_highbd_sad_skip_32x16_neon, 12),
SadSkipMxNParam(32, 32, &vpx_highbd_sad_skip_32x32_neon, 12),
SadSkipMxNParam(32, 64, &vpx_highbd_sad_skip_32x64_neon, 12),
SadSkipMxNParam(64, 32, &vpx_highbd_sad_skip_64x32_neon, 12),
SadSkipMxNParam(64, 64, &vpx_highbd_sad_skip_64x64_neon, 12),
#endif // CONFIG_VP9_HIGHBITDEPTH
};
INSTANTIATE_TEST_SUITE_P(NEON, SADSkipTest,
::testing::ValuesIn(skip_neon_tests));
#if HAVE_NEON_DOTPROD
const SadSkipMxNParam skip_neon_dotprod_tests[] = {
SadSkipMxNParam(64, 64, &vpx_sad_skip_64x64_neon_dotprod),
SadSkipMxNParam(64, 32, &vpx_sad_skip_64x32_neon_dotprod),
SadSkipMxNParam(32, 64, &vpx_sad_skip_32x64_neon_dotprod),
SadSkipMxNParam(32, 32, &vpx_sad_skip_32x32_neon_dotprod),
SadSkipMxNParam(32, 16, &vpx_sad_skip_32x16_neon_dotprod),
SadSkipMxNParam(16, 32, &vpx_sad_skip_16x32_neon_dotprod),
SadSkipMxNParam(16, 16, &vpx_sad_skip_16x16_neon_dotprod),
SadSkipMxNParam(16, 8, &vpx_sad_skip_16x8_neon_dotprod),
};
INSTANTIATE_TEST_SUITE_P(NEON_DOTPROD, SADSkipTest,
::testing::ValuesIn(skip_neon_dotprod_tests));
#endif // HAVE_NEON_DOTPROD
const SadMxNAvgParam avg_neon_tests[] = {
SadMxNAvgParam(64, 64, &vpx_sad64x64_avg_neon),
SadMxNAvgParam(64, 32, &vpx_sad64x32_avg_neon),
@@ -845,6 +1276,21 @@ const SadMxNAvgParam avg_neon_tests[] = {
};
INSTANTIATE_TEST_SUITE_P(NEON, SADavgTest, ::testing::ValuesIn(avg_neon_tests));
#if HAVE_NEON_DOTPROD
const SadMxNAvgParam avg_neon_dotprod_tests[] = {
SadMxNAvgParam(64, 64, &vpx_sad64x64_avg_neon_dotprod),
SadMxNAvgParam(64, 32, &vpx_sad64x32_avg_neon_dotprod),
SadMxNAvgParam(32, 64, &vpx_sad32x64_avg_neon_dotprod),
SadMxNAvgParam(32, 32, &vpx_sad32x32_avg_neon_dotprod),
SadMxNAvgParam(32, 16, &vpx_sad32x16_avg_neon_dotprod),
SadMxNAvgParam(16, 32, &vpx_sad16x32_avg_neon_dotprod),
SadMxNAvgParam(16, 16, &vpx_sad16x16_avg_neon_dotprod),
SadMxNAvgParam(16, 8, &vpx_sad16x8_avg_neon_dotprod),
};
INSTANTIATE_TEST_SUITE_P(NEON_DOTPROD, SADavgTest,
::testing::ValuesIn(avg_neon_dotprod_tests));
#endif // HAVE_NEON_DOTPROD
const SadMxNx4Param x4d_neon_tests[] = {
SadMxNx4Param(64, 64, &vpx_sad64x64x4d_neon),
SadMxNx4Param(64, 32, &vpx_sad64x32x4d_neon),
@@ -899,6 +1345,92 @@ const SadMxNx4Param x4d_neon_tests[] = {
#endif // CONFIG_VP9_HIGHBITDEPTH
};
INSTANTIATE_TEST_SUITE_P(NEON, SADx4Test, ::testing::ValuesIn(x4d_neon_tests));
#if HAVE_NEON_DOTPROD
const SadMxNx4Param x4d_neon_dotprod_tests[] = {
SadMxNx4Param(64, 64, &vpx_sad64x64x4d_neon_dotprod),
SadMxNx4Param(64, 32, &vpx_sad64x32x4d_neon_dotprod),
SadMxNx4Param(32, 64, &vpx_sad32x64x4d_neon_dotprod),
SadMxNx4Param(32, 32, &vpx_sad32x32x4d_neon_dotprod),
SadMxNx4Param(32, 16, &vpx_sad32x16x4d_neon_dotprod),
SadMxNx4Param(16, 32, &vpx_sad16x32x4d_neon_dotprod),
SadMxNx4Param(16, 16, &vpx_sad16x16x4d_neon_dotprod),
SadMxNx4Param(16, 8, &vpx_sad16x8x4d_neon_dotprod),
};
INSTANTIATE_TEST_SUITE_P(NEON_DOTPROD, SADx4Test,
::testing::ValuesIn(x4d_neon_dotprod_tests));
#endif // HAVE_NEON_DOTPROD
const SadSkipMxNx4Param skip_x4d_neon_tests[] = {
SadSkipMxNx4Param(64, 64, &vpx_sad_skip_64x64x4d_neon),
SadSkipMxNx4Param(64, 32, &vpx_sad_skip_64x32x4d_neon),
SadSkipMxNx4Param(32, 64, &vpx_sad_skip_32x64x4d_neon),
SadSkipMxNx4Param(32, 32, &vpx_sad_skip_32x32x4d_neon),
SadSkipMxNx4Param(32, 16, &vpx_sad_skip_32x16x4d_neon),
SadSkipMxNx4Param(16, 32, &vpx_sad_skip_16x32x4d_neon),
SadSkipMxNx4Param(16, 16, &vpx_sad_skip_16x16x4d_neon),
SadSkipMxNx4Param(16, 8, &vpx_sad_skip_16x8x4d_neon),
SadSkipMxNx4Param(8, 16, &vpx_sad_skip_8x16x4d_neon),
SadSkipMxNx4Param(8, 8, &vpx_sad_skip_8x8x4d_neon),
SadSkipMxNx4Param(8, 4, &vpx_sad_skip_8x4x4d_neon),
SadSkipMxNx4Param(4, 8, &vpx_sad_skip_4x8x4d_neon),
SadSkipMxNx4Param(4, 4, &vpx_sad_skip_4x4x4d_neon),
#if CONFIG_VP9_HIGHBITDEPTH
SadSkipMxNx4Param(4, 4, &vpx_highbd_sad_skip_4x4x4d_neon, 8),
SadSkipMxNx4Param(4, 8, &vpx_highbd_sad_skip_4x8x4d_neon, 8),
SadSkipMxNx4Param(8, 4, &vpx_highbd_sad_skip_8x4x4d_neon, 8),
SadSkipMxNx4Param(8, 8, &vpx_highbd_sad_skip_8x8x4d_neon, 8),
SadSkipMxNx4Param(8, 16, &vpx_highbd_sad_skip_8x16x4d_neon, 8),
SadSkipMxNx4Param(16, 8, &vpx_highbd_sad_skip_16x8x4d_neon, 8),
SadSkipMxNx4Param(16, 16, &vpx_highbd_sad_skip_16x16x4d_neon, 8),
SadSkipMxNx4Param(16, 32, &vpx_highbd_sad_skip_16x32x4d_neon, 8),
SadSkipMxNx4Param(32, 32, &vpx_highbd_sad_skip_32x32x4d_neon, 8),
SadSkipMxNx4Param(32, 64, &vpx_highbd_sad_skip_32x64x4d_neon, 8),
SadSkipMxNx4Param(64, 32, &vpx_highbd_sad_skip_64x32x4d_neon, 8),
SadSkipMxNx4Param(64, 64, &vpx_highbd_sad_skip_64x64x4d_neon, 8),
SadSkipMxNx4Param(4, 4, &vpx_highbd_sad_skip_4x4x4d_neon, 10),
SadSkipMxNx4Param(4, 8, &vpx_highbd_sad_skip_4x8x4d_neon, 10),
SadSkipMxNx4Param(8, 4, &vpx_highbd_sad_skip_8x4x4d_neon, 10),
SadSkipMxNx4Param(8, 8, &vpx_highbd_sad_skip_8x8x4d_neon, 10),
SadSkipMxNx4Param(8, 16, &vpx_highbd_sad_skip_8x16x4d_neon, 10),
SadSkipMxNx4Param(16, 8, &vpx_highbd_sad_skip_16x8x4d_neon, 10),
SadSkipMxNx4Param(16, 16, &vpx_highbd_sad_skip_16x16x4d_neon, 10),
SadSkipMxNx4Param(16, 32, &vpx_highbd_sad_skip_16x32x4d_neon, 10),
SadSkipMxNx4Param(32, 32, &vpx_highbd_sad_skip_32x32x4d_neon, 10),
SadSkipMxNx4Param(32, 64, &vpx_highbd_sad_skip_32x64x4d_neon, 10),
SadSkipMxNx4Param(64, 32, &vpx_highbd_sad_skip_64x32x4d_neon, 10),
SadSkipMxNx4Param(64, 64, &vpx_highbd_sad_skip_64x64x4d_neon, 10),
SadSkipMxNx4Param(4, 4, &vpx_highbd_sad_skip_4x4x4d_neon, 12),
SadSkipMxNx4Param(4, 8, &vpx_highbd_sad_skip_4x8x4d_neon, 12),
SadSkipMxNx4Param(8, 4, &vpx_highbd_sad_skip_8x4x4d_neon, 12),
SadSkipMxNx4Param(8, 8, &vpx_highbd_sad_skip_8x8x4d_neon, 12),
SadSkipMxNx4Param(8, 16, &vpx_highbd_sad_skip_8x16x4d_neon, 12),
SadSkipMxNx4Param(16, 8, &vpx_highbd_sad_skip_16x8x4d_neon, 12),
SadSkipMxNx4Param(16, 16, &vpx_highbd_sad_skip_16x16x4d_neon, 12),
SadSkipMxNx4Param(16, 32, &vpx_highbd_sad_skip_16x32x4d_neon, 12),
SadSkipMxNx4Param(32, 32, &vpx_highbd_sad_skip_32x32x4d_neon, 12),
SadSkipMxNx4Param(32, 64, &vpx_highbd_sad_skip_32x64x4d_neon, 12),
SadSkipMxNx4Param(64, 32, &vpx_highbd_sad_skip_64x32x4d_neon, 12),
SadSkipMxNx4Param(64, 64, &vpx_highbd_sad_skip_64x64x4d_neon, 12),
#endif // CONFIG_VP9_HIGHBITDEPTH
};
INSTANTIATE_TEST_SUITE_P(NEON, SADSkipx4Test,
::testing::ValuesIn(skip_x4d_neon_tests));
#if HAVE_NEON_DOTPROD
const SadSkipMxNx4Param skip_x4d_neon_dotprod_tests[] = {
SadSkipMxNx4Param(64, 64, &vpx_sad_skip_64x64x4d_neon_dotprod),
SadSkipMxNx4Param(64, 32, &vpx_sad_skip_64x32x4d_neon_dotprod),
SadSkipMxNx4Param(32, 64, &vpx_sad_skip_32x64x4d_neon_dotprod),
SadSkipMxNx4Param(32, 32, &vpx_sad_skip_32x32x4d_neon_dotprod),
SadSkipMxNx4Param(32, 16, &vpx_sad_skip_32x16x4d_neon_dotprod),
SadSkipMxNx4Param(16, 32, &vpx_sad_skip_16x32x4d_neon_dotprod),
SadSkipMxNx4Param(16, 16, &vpx_sad_skip_16x16x4d_neon_dotprod),
SadSkipMxNx4Param(16, 8, &vpx_sad_skip_16x8x4d_neon_dotprod),
};
INSTANTIATE_TEST_SUITE_P(NEON_DOTPROD, SADSkipx4Test,
::testing::ValuesIn(skip_x4d_neon_dotprod_tests));
#endif // HAVE_NEON_DOTPROD
#endif // HAVE_NEON
//------------------------------------------------------------------------------
@@ -956,6 +1488,54 @@ const SadMxNParam sse2_tests[] = {
};
INSTANTIATE_TEST_SUITE_P(SSE2, SADTest, ::testing::ValuesIn(sse2_tests));
const SadSkipMxNParam skip_sse2_tests[] = {
SadSkipMxNParam(64, 64, &vpx_sad_skip_64x64_sse2),
SadSkipMxNParam(64, 32, &vpx_sad_skip_64x32_sse2),
SadSkipMxNParam(32, 64, &vpx_sad_skip_32x64_sse2),
SadSkipMxNParam(32, 32, &vpx_sad_skip_32x32_sse2),
SadSkipMxNParam(32, 16, &vpx_sad_skip_32x16_sse2),
SadSkipMxNParam(16, 32, &vpx_sad_skip_16x32_sse2),
SadSkipMxNParam(16, 16, &vpx_sad_skip_16x16_sse2),
SadSkipMxNParam(16, 8, &vpx_sad_skip_16x8_sse2),
SadSkipMxNParam(8, 16, &vpx_sad_skip_8x16_sse2),
SadSkipMxNParam(8, 8, &vpx_sad_skip_8x8_sse2),
SadSkipMxNParam(4, 8, &vpx_sad_skip_4x8_sse2),
#if CONFIG_VP9_HIGHBITDEPTH
SadSkipMxNParam(64, 64, &vpx_highbd_sad_skip_64x64_sse2, 8),
SadSkipMxNParam(64, 32, &vpx_highbd_sad_skip_64x32_sse2, 8),
SadSkipMxNParam(32, 64, &vpx_highbd_sad_skip_32x64_sse2, 8),
SadSkipMxNParam(32, 32, &vpx_highbd_sad_skip_32x32_sse2, 8),
SadSkipMxNParam(32, 16, &vpx_highbd_sad_skip_32x16_sse2, 8),
SadSkipMxNParam(16, 32, &vpx_highbd_sad_skip_16x32_sse2, 8),
SadSkipMxNParam(16, 16, &vpx_highbd_sad_skip_16x16_sse2, 8),
SadSkipMxNParam(16, 8, &vpx_highbd_sad_skip_16x8_sse2, 8),
SadSkipMxNParam(8, 16, &vpx_highbd_sad_skip_8x16_sse2, 8),
SadSkipMxNParam(8, 8, &vpx_highbd_sad_skip_8x8_sse2, 8),
SadSkipMxNParam(64, 64, &vpx_highbd_sad_skip_64x64_sse2, 10),
SadSkipMxNParam(64, 32, &vpx_highbd_sad_skip_64x32_sse2, 10),
SadSkipMxNParam(32, 64, &vpx_highbd_sad_skip_32x64_sse2, 10),
SadSkipMxNParam(32, 32, &vpx_highbd_sad_skip_32x32_sse2, 10),
SadSkipMxNParam(32, 16, &vpx_highbd_sad_skip_32x16_sse2, 10),
SadSkipMxNParam(16, 32, &vpx_highbd_sad_skip_16x32_sse2, 10),
SadSkipMxNParam(16, 16, &vpx_highbd_sad_skip_16x16_sse2, 10),
SadSkipMxNParam(16, 8, &vpx_highbd_sad_skip_16x8_sse2, 10),
SadSkipMxNParam(8, 16, &vpx_highbd_sad_skip_8x16_sse2, 10),
SadSkipMxNParam(8, 8, &vpx_highbd_sad_skip_8x8_sse2, 10),
SadSkipMxNParam(64, 64, &vpx_highbd_sad_skip_64x64_sse2, 12),
SadSkipMxNParam(64, 32, &vpx_highbd_sad_skip_64x32_sse2, 12),
SadSkipMxNParam(32, 64, &vpx_highbd_sad_skip_32x64_sse2, 12),
SadSkipMxNParam(32, 32, &vpx_highbd_sad_skip_32x32_sse2, 12),
SadSkipMxNParam(32, 16, &vpx_highbd_sad_skip_32x16_sse2, 12),
SadSkipMxNParam(16, 32, &vpx_highbd_sad_skip_16x32_sse2, 12),
SadSkipMxNParam(16, 16, &vpx_highbd_sad_skip_16x16_sse2, 12),
SadSkipMxNParam(16, 8, &vpx_highbd_sad_skip_16x8_sse2, 12),
SadSkipMxNParam(8, 16, &vpx_highbd_sad_skip_8x16_sse2, 12),
SadSkipMxNParam(8, 8, &vpx_highbd_sad_skip_8x8_sse2, 12),
#endif // CONFIG_VP9_HIGHBITDEPTH
};
INSTANTIATE_TEST_SUITE_P(SSE2, SADSkipTest,
::testing::ValuesIn(skip_sse2_tests));
const SadMxNAvgParam avg_sse2_tests[] = {
SadMxNAvgParam(64, 64, &vpx_sad64x64_avg_sse2),
SadMxNAvgParam(64, 32, &vpx_sad64x32_avg_sse2),
@@ -1065,6 +1645,57 @@ const SadMxNx4Param x4d_sse2_tests[] = {
#endif // CONFIG_VP9_HIGHBITDEPTH
};
INSTANTIATE_TEST_SUITE_P(SSE2, SADx4Test, ::testing::ValuesIn(x4d_sse2_tests));
const SadSkipMxNx4Param skip_x4d_sse2_tests[] = {
SadSkipMxNx4Param(64, 64, &vpx_sad_skip_64x64x4d_sse2),
SadSkipMxNx4Param(64, 32, &vpx_sad_skip_64x32x4d_sse2),
SadSkipMxNx4Param(32, 64, &vpx_sad_skip_32x64x4d_sse2),
SadSkipMxNx4Param(32, 32, &vpx_sad_skip_32x32x4d_sse2),
SadSkipMxNx4Param(32, 16, &vpx_sad_skip_32x16x4d_sse2),
SadSkipMxNx4Param(16, 32, &vpx_sad_skip_16x32x4d_sse2),
SadSkipMxNx4Param(16, 16, &vpx_sad_skip_16x16x4d_sse2),
SadSkipMxNx4Param(16, 8, &vpx_sad_skip_16x8x4d_sse2),
SadSkipMxNx4Param(8, 16, &vpx_sad_skip_8x16x4d_sse2),
SadSkipMxNx4Param(8, 8, &vpx_sad_skip_8x8x4d_sse2),
SadSkipMxNx4Param(4, 8, &vpx_sad_skip_4x8x4d_sse2),
#if CONFIG_VP9_HIGHBITDEPTH
SadSkipMxNx4Param(64, 64, &vpx_highbd_sad_skip_64x64x4d_sse2, 8),
SadSkipMxNx4Param(64, 32, &vpx_highbd_sad_skip_64x32x4d_sse2, 8),
SadSkipMxNx4Param(32, 64, &vpx_highbd_sad_skip_32x64x4d_sse2, 8),
SadSkipMxNx4Param(32, 32, &vpx_highbd_sad_skip_32x32x4d_sse2, 8),
SadSkipMxNx4Param(32, 16, &vpx_highbd_sad_skip_32x16x4d_sse2, 8),
SadSkipMxNx4Param(16, 32, &vpx_highbd_sad_skip_16x32x4d_sse2, 8),
SadSkipMxNx4Param(16, 16, &vpx_highbd_sad_skip_16x16x4d_sse2, 8),
SadSkipMxNx4Param(16, 8, &vpx_highbd_sad_skip_16x8x4d_sse2, 8),
SadSkipMxNx4Param(8, 16, &vpx_highbd_sad_skip_8x16x4d_sse2, 8),
SadSkipMxNx4Param(8, 8, &vpx_highbd_sad_skip_8x8x4d_sse2, 8),
SadSkipMxNx4Param(4, 8, &vpx_highbd_sad_skip_4x8x4d_sse2, 8),
SadSkipMxNx4Param(64, 64, &vpx_highbd_sad_skip_64x64x4d_sse2, 10),
SadSkipMxNx4Param(64, 32, &vpx_highbd_sad_skip_64x32x4d_sse2, 10),
SadSkipMxNx4Param(32, 64, &vpx_highbd_sad_skip_32x64x4d_sse2, 10),
SadSkipMxNx4Param(32, 32, &vpx_highbd_sad_skip_32x32x4d_sse2, 10),
SadSkipMxNx4Param(32, 16, &vpx_highbd_sad_skip_32x16x4d_sse2, 10),
SadSkipMxNx4Param(16, 32, &vpx_highbd_sad_skip_16x32x4d_sse2, 10),
SadSkipMxNx4Param(16, 16, &vpx_highbd_sad_skip_16x16x4d_sse2, 10),
SadSkipMxNx4Param(16, 8, &vpx_highbd_sad_skip_16x8x4d_sse2, 10),
SadSkipMxNx4Param(8, 16, &vpx_highbd_sad_skip_8x16x4d_sse2, 10),
SadSkipMxNx4Param(8, 8, &vpx_highbd_sad_skip_8x8x4d_sse2, 10),
SadSkipMxNx4Param(4, 8, &vpx_highbd_sad_skip_4x8x4d_sse2, 10),
SadSkipMxNx4Param(64, 64, &vpx_highbd_sad_skip_64x64x4d_sse2, 12),
SadSkipMxNx4Param(64, 32, &vpx_highbd_sad_skip_64x32x4d_sse2, 12),
SadSkipMxNx4Param(32, 64, &vpx_highbd_sad_skip_32x64x4d_sse2, 12),
SadSkipMxNx4Param(32, 32, &vpx_highbd_sad_skip_32x32x4d_sse2, 12),
SadSkipMxNx4Param(32, 16, &vpx_highbd_sad_skip_32x16x4d_sse2, 12),
SadSkipMxNx4Param(16, 32, &vpx_highbd_sad_skip_16x32x4d_sse2, 12),
SadSkipMxNx4Param(16, 16, &vpx_highbd_sad_skip_16x16x4d_sse2, 12),
SadSkipMxNx4Param(16, 8, &vpx_highbd_sad_skip_16x8x4d_sse2, 12),
SadSkipMxNx4Param(8, 16, &vpx_highbd_sad_skip_8x16x4d_sse2, 12),
SadSkipMxNx4Param(8, 8, &vpx_highbd_sad_skip_8x8x4d_sse2, 12),
SadSkipMxNx4Param(4, 8, &vpx_highbd_sad_skip_4x8x4d_sse2, 12),
#endif // CONFIG_VP9_HIGHBITDEPTH
};
INSTANTIATE_TEST_SUITE_P(SSE2, SADSkipx4Test,
::testing::ValuesIn(skip_x4d_sse2_tests));
#endif // HAVE_SSE2
#if HAVE_SSE3
@@ -1113,6 +1744,44 @@ const SadMxNParam avx2_tests[] = {
};
INSTANTIATE_TEST_SUITE_P(AVX2, SADTest, ::testing::ValuesIn(avx2_tests));
const SadSkipMxNParam skip_avx2_tests[] = {
SadSkipMxNParam(64, 64, &vpx_sad_skip_64x64_avx2),
SadSkipMxNParam(64, 32, &vpx_sad_skip_64x32_avx2),
SadSkipMxNParam(32, 64, &vpx_sad_skip_32x64_avx2),
SadSkipMxNParam(32, 32, &vpx_sad_skip_32x32_avx2),
SadSkipMxNParam(32, 16, &vpx_sad_skip_32x16_avx2),
#if CONFIG_VP9_HIGHBITDEPTH
SadSkipMxNParam(64, 64, &vpx_highbd_sad_skip_64x64_avx2, 8),
SadSkipMxNParam(64, 32, &vpx_highbd_sad_skip_64x32_avx2, 8),
SadSkipMxNParam(32, 64, &vpx_highbd_sad_skip_32x64_avx2, 8),
SadSkipMxNParam(32, 32, &vpx_highbd_sad_skip_32x32_avx2, 8),
SadSkipMxNParam(32, 16, &vpx_highbd_sad_skip_32x16_avx2, 8),
SadSkipMxNParam(16, 32, &vpx_highbd_sad_skip_16x32_avx2, 8),
SadSkipMxNParam(16, 16, &vpx_highbd_sad_skip_16x16_avx2, 8),
SadSkipMxNParam(16, 8, &vpx_highbd_sad_skip_16x8_avx2, 8),
SadSkipMxNParam(64, 64, &vpx_highbd_sad_skip_64x64_avx2, 10),
SadSkipMxNParam(64, 32, &vpx_highbd_sad_skip_64x32_avx2, 10),
SadSkipMxNParam(32, 64, &vpx_highbd_sad_skip_32x64_avx2, 10),
SadSkipMxNParam(32, 32, &vpx_highbd_sad_skip_32x32_avx2, 10),
SadSkipMxNParam(32, 16, &vpx_highbd_sad_skip_32x16_avx2, 10),
SadSkipMxNParam(16, 32, &vpx_highbd_sad_skip_16x32_avx2, 10),
SadSkipMxNParam(16, 16, &vpx_highbd_sad_skip_16x16_avx2, 10),
SadSkipMxNParam(16, 8, &vpx_highbd_sad_skip_16x8_avx2, 10),
SadSkipMxNParam(64, 64, &vpx_highbd_sad_skip_64x64_avx2, 12),
SadSkipMxNParam(64, 32, &vpx_highbd_sad_skip_64x32_avx2, 12),
SadSkipMxNParam(32, 64, &vpx_highbd_sad_skip_32x64_avx2, 12),
SadSkipMxNParam(32, 32, &vpx_highbd_sad_skip_32x32_avx2, 12),
SadSkipMxNParam(32, 16, &vpx_highbd_sad_skip_32x16_avx2, 12),
SadSkipMxNParam(16, 32, &vpx_highbd_sad_skip_16x32_avx2, 12),
SadSkipMxNParam(16, 16, &vpx_highbd_sad_skip_16x16_avx2, 12),
SadSkipMxNParam(16, 8, &vpx_highbd_sad_skip_16x8_avx2, 12),
#endif // CONFIG_VP9_HIGHBITDEPTH
};
INSTANTIATE_TEST_SUITE_P(AVX2, SADSkipTest,
::testing::ValuesIn(skip_avx2_tests));
const SadMxNAvgParam avg_avx2_tests[] = {
SadMxNAvgParam(64, 64, &vpx_sad64x64_avg_avx2),
SadMxNAvgParam(64, 32, &vpx_sad64x32_avg_avx2),
@@ -1180,6 +1849,42 @@ const SadMxNx4Param x4d_avx2_tests[] = {
};
INSTANTIATE_TEST_SUITE_P(AVX2, SADx4Test, ::testing::ValuesIn(x4d_avx2_tests));
const SadSkipMxNx4Param skip_x4d_avx2_tests[] = {
SadSkipMxNx4Param(64, 64, &vpx_sad_skip_64x64x4d_avx2),
SadSkipMxNx4Param(64, 32, &vpx_sad_skip_64x32x4d_avx2),
SadSkipMxNx4Param(32, 64, &vpx_sad_skip_32x64x4d_avx2),
SadSkipMxNx4Param(32, 32, &vpx_sad_skip_32x32x4d_avx2),
SadSkipMxNx4Param(32, 16, &vpx_sad_skip_32x16x4d_avx2),
#if CONFIG_VP9_HIGHBITDEPTH
SadSkipMxNx4Param(64, 64, &vpx_highbd_sad_skip_64x64x4d_avx2, 8),
SadSkipMxNx4Param(64, 32, &vpx_highbd_sad_skip_64x32x4d_avx2, 8),
SadSkipMxNx4Param(32, 64, &vpx_highbd_sad_skip_32x64x4d_avx2, 8),
SadSkipMxNx4Param(32, 32, &vpx_highbd_sad_skip_32x32x4d_avx2, 8),
SadSkipMxNx4Param(32, 16, &vpx_highbd_sad_skip_32x16x4d_avx2, 8),
SadSkipMxNx4Param(16, 32, &vpx_highbd_sad_skip_16x32x4d_avx2, 8),
SadSkipMxNx4Param(16, 16, &vpx_highbd_sad_skip_16x16x4d_avx2, 8),
SadSkipMxNx4Param(16, 8, &vpx_highbd_sad_skip_16x8x4d_avx2, 8),
SadSkipMxNx4Param(64, 64, &vpx_highbd_sad_skip_64x64x4d_avx2, 10),
SadSkipMxNx4Param(64, 32, &vpx_highbd_sad_skip_64x32x4d_avx2, 10),
SadSkipMxNx4Param(32, 64, &vpx_highbd_sad_skip_32x64x4d_avx2, 10),
SadSkipMxNx4Param(32, 32, &vpx_highbd_sad_skip_32x32x4d_avx2, 10),
SadSkipMxNx4Param(32, 16, &vpx_highbd_sad_skip_32x16x4d_avx2, 10),
SadSkipMxNx4Param(16, 32, &vpx_highbd_sad_skip_16x32x4d_avx2, 10),
SadSkipMxNx4Param(16, 16, &vpx_highbd_sad_skip_16x16x4d_avx2, 10),
SadSkipMxNx4Param(16, 8, &vpx_highbd_sad_skip_16x8x4d_avx2, 10),
SadSkipMxNx4Param(64, 64, &vpx_highbd_sad_skip_64x64x4d_avx2, 12),
SadSkipMxNx4Param(64, 32, &vpx_highbd_sad_skip_64x32x4d_avx2, 12),
SadSkipMxNx4Param(32, 64, &vpx_highbd_sad_skip_32x64x4d_avx2, 12),
SadSkipMxNx4Param(32, 32, &vpx_highbd_sad_skip_32x32x4d_avx2, 12),
SadSkipMxNx4Param(32, 16, &vpx_highbd_sad_skip_32x16x4d_avx2, 12),
SadSkipMxNx4Param(16, 32, &vpx_highbd_sad_skip_16x32x4d_avx2, 12),
SadSkipMxNx4Param(16, 16, &vpx_highbd_sad_skip_16x16x4d_avx2, 12),
SadSkipMxNx4Param(16, 8, &vpx_highbd_sad_skip_16x8x4d_avx2, 12),
#endif // CONFIG_VP9_HIGHBITDEPTH
};
INSTANTIATE_TEST_SUITE_P(AVX2, SADSkipx4Test,
::testing::ValuesIn(skip_x4d_avx2_tests));
#endif // HAVE_AVX2
#if HAVE_AVX512
+2 -2
@@ -15,7 +15,7 @@
#include <string.h>
#include <sys/types.h>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/acm_random.h"
#include "vp8/encoder/onyx_int.h"
#include "vpx/vpx_integer.h"
@@ -40,7 +40,7 @@ TEST(VP8RoiMapTest, ParameterCheck) {
// Initialize elements of cpi with valid defaults.
VP8_COMP cpi;
cpi.mb.e_mbd.mb_segement_abs_delta = SEGMENT_DELTADATA;
cpi.mb.e_mbd.mb_segment_abs_delta = SEGMENT_DELTADATA;
cpi.cyclic_refresh_mode_enabled = 0;
cpi.mb.e_mbd.segmentation_enabled = 0;
cpi.mb.e_mbd.update_mb_segmentation_map = 0;
+1 -1
@@ -12,7 +12,7 @@
#include <memory>
#include <string>
#include <vector>
#include "third_party/googletest/src/include/gtest/gtest.h"
#include "gtest/gtest.h"
#include "test/video_source.h"
#include "vp9/simple_encode.h"

Some files were not shown because too many files have changed in this diff