dolphin

Author	SHA1	Message	Date
Pierre Bourdon	0ff1481494	Optimize PPC CR emulation by using magic 64 bit values PowerPC has a 32 bit CR register, which is used to store flags for results of computations. Most instructions have an optional bit that tells the CPU whether the flags should be updated. This 32 bit register actually contains 8 sets of 4 flags: Summary Overflow (SO), Equals (EQ), Greater Than (GT), Less Than (LT). These 8 sets are usually called CR0-CR7 and accessed independently. In the most common operations, the flags are computed from the result of the operation in the following fashion: * EQ is set iff result == 0 * LT is set iff result < 0 * GT is set iff result > 0 * (Dolphin does not emulate SO) While X86 architectures have a similar concept of flags, it is very difficult to access the FLAGS register directly to translate its value to an equivalent PowerPC value. With the current Dolphin implementation, updating a PPC CR register requires CPU branching, which has a few performance issues: it uses space in the BTB, and in the worst case (!GT, !LT, EQ) requires 2 branches not taken. After some brainstorming on IRC about how this could be improved, calc84maniac figured out a neat trick that makes common CR operations way more efficient to JIT on 64 bit X86 architectures. It relies on emulating each CRn bitfield with a 64 bit register internally, whose value is the result of the operation from which flags are updated, sign extended to 64 bits. 
Then, checking if a CR bit is set can be done in the following way: * EQ is set iff LOWER_32_BITS(cr_64b_val) == 0 * GT is set iff (s64)cr_64b_val > 0 * LT is set iff bit 62 of cr_64b_val is set. To take a few examples, if the result of an operation is: * -1 (0xFFFFFFFFFFFFFFFF) -> lower 32 bits not 0 => !EQ -> (s64)val (-1) is not > 0 => !GT -> bit 62 is set => LT. Result: !EQ, !GT, LT * 0 (0x0000000000000000) -> lower 32 bits are 0 => EQ -> (s64)val (0) is not > 0 => !GT -> bit 62 is not set => !LT. Result: EQ, !GT, !LT * 1 (0x0000000000000001) -> lower 32 bits not 0 => !EQ -> (s64)val (1) is > 0 => GT -> bit 62 is not set => !LT. Result: !EQ, GT, !LT. Sometimes we need to convert PPC CR values to these 64 bit values. The following convention is used in this case: * Bit 0 (LSB) is set iff !EQ * Bit 62 is set iff LT * Bit 63 is set iff !GT * Bit 32 is always set to disambiguate between EQ and GT. Some more examples: * !EQ, GT, LT -> 0x4000000100000001 (!B63, B62, B32, B0) -> lower 32 bits not 0 => !EQ -> (s64)val is > 0 => GT -> bit 62 is set => LT * EQ, GT, !LT -> 0x0000000100000000 -> lower 32 bits are 0 => EQ -> (s64)val is > 0 (note: B32) => GT -> bit 62 is not set => !LT	2014-07-30 21:41:17 -07:00
Lioncash	b03c12764d	Really get rid of the MSVC 2005 workaround completely	2014-07-29 21:20:43 -04:00
Lioncash	412196a055	Core: Remove defines used to work around an MSVC 2005 bug	2014-07-29 19:33:08 -04:00
degasus	6d3f249dcc	mark all local variables as static	2014-07-11 16:10:20 +02:00
degasus	22e1aa5bb4	mark all local functions as static	2014-07-11 16:07:23 +02:00
Ryan Houdek	a40ae6883a	Move CoreTiming::downcount to PowerPC::ppcState. This isn't technically the correct place to have the downcount variable, but it is similar to what PPSSPP does to gain a bit of extra speed on ARM. We access this variable quite a bit: it is decremented at each exit in a block. On ARM this required four instructions to load and store the value, while now it only requires two. This gives an average gain of 1FPS in most games. Examples: Crazy Taxi: 54FPS -> 55FPS; Luigi's Mansion: 20FPS -> 21FPS; Wind Waker (Save Screen): 27FPS -> 28FPS. This seems to average a 6 MHz to 16 MHz CPU core emulation improvement in the few games I've tested.	2014-06-26 01:48:00 +00:00
Tony Wasserka	9f22b2378d	Merge pull request #485 from magumagu/packed-fp-reciprocal Interpreter: return single-precision results for ps_rsqrte.	2014-06-19 16:51:33 +02:00
Lioncash	ce54c1e571	Kill off replaceable usages of s[n]printf.	2014-06-18 19:53:38 -04:00
magumagu	3da52018dc	Interpreter: return single-precision results for ps_rsqrte.	2014-06-11 19:50:33 -07:00
Paul Olszewski	5d793881b0	Fix the capitalization of "GameCube" throughout the project.	2014-06-08 11:24:49 +09:00
magumagu	98dd99a696	Interpreter: correctly support HLE functions. m_EndBlock is always false at the beginning of SingleStepInner in the normal interpreter loop.	2014-05-25 15:39:46 -07:00
magumagu	440246a190	Interpreter: use numeric_limits instead of FLT_MAX. Minor cleanup, and fixes compilation on some systems.	2014-05-24 10:58:15 +02:00
magumagu	6955fef161	Interpreter: Code style fixes.	2014-05-23 15:06:09 -07:00
magumagu	d0ed3b8192	Jit: Use infinity and NaN from numeric_limits. MSVC's implementation of INFINITY is unusable.	2014-05-23 14:59:03 -07:00
magumagu	a9a2d3d98d	New frsqrte implementation; verified accurate. This is similar to the old implementation, but it uses smaller tables, and handles more edge cases correctly. (hwtest coming soon.)	2014-05-23 14:59:02 -07:00
magumagu	129e76e60d	Interpreter: refactor the rsqrte code, and use it for ps_rsqrte.	2014-05-23 14:59:00 -07:00
magumagu	2f8a147eda	Interpreter: make fres match hardware. New table-based implementation written based on actual hardware behavior. (hwtest coming soon).	2014-05-22 19:48:48 -07:00
magumagu	ad4ad7c1ed	Use accurate frsqrte in Interpreter. The implementation of frsqrte exposed by this change isn't completely correct; that will be fixed in a later commit.	2014-05-22 19:46:27 -07:00
booto	9892c8ea54	numCyclesMinusOne to numCycles in GekkoOPInfo	2014-04-30 19:04:02 +08:00
Ryan Houdek	94497961ac	Removes unused argument in Helper_UpdateCR1. Interpreter::Helper_UpdateCR1 doesn't use the argument passed to UpdateCR1. It pulls its value from the FPSCR register. Also, there was a declaration of Interpreter::Helper_UpdateCR1(float), in addition to Helper_UpdateCR1(double), that never had a definition. Remove the function declaration.	2014-04-24 22:00:58 -05:00
magumagu	002fb0b563	Interpreter: don't PanicAlert on write to SPR_HID2. The alert apparently triggers on Midway Arcade Treasures 2; given that the game otherwise works fine, it's not a high priority to accurately emulate the bit in question. Fixes issue 7197.	2014-04-18 20:20:42 -07:00
Tillmann Karras	2fcaca0603	More range-based loops and overrides	2014-03-17 02:55:55 +01:00
Tillmann Karras	3c46c0ede9	Interpreter: make some class members private	2014-03-17 02:55:54 +01:00
Matthew Parlane	31cfc73a09	Fixes spacing for "for", "while", "switch" and "if". Also moved && and || to ends of lines instead of start. Fixed misc vertical alignments and some { needed newlining.	2014-03-11 00:35:07 +13:00
Tillmann Karras	d802d39281	clang-modernize -use-nullptr and s/\bNULL\b/nullptr/g for *.cpp/h/mm files not compiled on my machine	2014-03-09 21:14:26 +01:00
Tillmann Karras	16885d0f74	Interpreter: less duplicate code in float compares	2014-03-09 19:35:13 +01:00
Tillmann Karras	9ef64245fa	MathUtil: fix IsQNAN() The constants were one nibble too short and the lower 51 bits don't actually have to be zero.	2014-03-09 19:34:58 +01:00
Tillmann Karras	d05e205a24	FPURoundMode: revert use of enums in bit-fields The workaround of using fixed underlying types produces lots of warnings in GCC because now the bit-fields are too small for the value range used for conversion semantics.	2014-03-09 15:24:35 +01:00
Ryan Houdek	4f02132f93	Make our architecture defines less stupid. Our defines were never clear about whether they meant 64bit or x86_64. This makes a clear cut between bitness and architecture. This commit also has the side effect of bringing up aarch64 compiling support.	2014-03-04 09:36:59 -06:00
Lioncash	13a007abed	Remove another clamp function laying in the codebase and replace it with the one in MathUtil.h.	2014-03-02 13:57:27 -05:00
Pierre Bourdon	311caef094	Merge pull request #25 from Tilka/ppc_fp Fix non-IEEE mode	2014-02-23 04:15:37 +01:00
Tillmann Karras	ee21cbe2d1	Add phire's more accurate DoubleToSingle version This method doesn't involve messing around with the quirks of the x87 FPU and should be reasonably fast. As a bonus, it does the correct thing for out-of-range doubles. However, it is also a little slower and only benefits programs that rely on undefined behavior so it is disabled for now.	2014-02-23 04:13:47 +01:00
Lioncash	146b301a91	Fix more header sorting issues in Core/ (now check-includes clean).	2014-02-20 01:01:11 +01:00
Lioncash	2afe215271	Convert all includes to relative paths.	2014-02-18 02:19:10 -05:00
Lioncash	3fd87a7636	Second and final pass of clearing out tabs.	2014-02-17 02:19:41 -05:00
Tillmann Karras	404624bf0b	Turn loops into range-based form and some things suggested by cppcheck and compiler warnings.	2014-02-13 09:05:50 +01:00
Scott Mansell	7062cf8657	Interpreter: Fixed ConvertToDouble to match the manual. Also added some documentation comments.	2014-02-12 23:12:17 +01:00
Tillmann Karras	f6897039c7	Interpreter: fix float conversions Can't use simple casting, otherwise we get the same problems as in Jit64.	2014-02-12 23:12:15 +01:00
lioncash	d2038049f5	Replace all include guard ifdefs with "#pragma once"	2014-02-10 18:07:16 -05:00
Lioncash	ebb48d019e	Clean up some struct indentations Also cleaned up the indentations of some variable declarations.	2014-02-09 19:40:11 -05:00
Jasper St. Pierre	34692ab826	Remove unnecessary Src/ folders	2013-12-31 14:03:19 -05:00

41 Commits
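
The 64 bit CR representation described in commit 0ff1481494 can be sketched in a few lines of C++. This is only an illustration of the convention stated in that commit message, not Dolphin's actual code: the function names (CrFromResult, CrEQ, CrGT, CrLT, CrFromFlags) are hypothetical, and the sketch assumes a two's complement target, as on the 64 bit x86 hosts the commit is aimed at.

```cpp
#include <cstdint>

// Hypothetical helper names, chosen for this sketch only.

// Store the 32 bit result of an operation, sign extended to 64 bits.
constexpr uint64_t CrFromResult(int32_t result) {
    return static_cast<uint64_t>(static_cast<int64_t>(result));
}

// EQ is set iff the lower 32 bits are zero.
constexpr bool CrEQ(uint64_t v) { return static_cast<uint32_t>(v) == 0; }

// GT is set iff the value, read as a signed 64 bit integer, is > 0.
// (Assumes two's complement conversion from unsigned to signed.)
constexpr bool CrGT(uint64_t v) { return static_cast<int64_t>(v) > 0; }

// LT is set iff bit 62 is set.
constexpr bool CrLT(uint64_t v) { return ((v >> 62) & 1) != 0; }

// Convert explicit PPC flags to the 64 bit form:
// bit 0 = !EQ, bit 62 = LT, bit 63 = !GT, bit 32 always set.
constexpr uint64_t CrFromFlags(bool eq, bool gt, bool lt) {
    return (uint64_t{1} << 32)
         | (eq ? uint64_t{0} : uint64_t{1})
         | (lt ? (uint64_t{1} << 62) : uint64_t{0})
         | (gt ? uint64_t{0} : (uint64_t{1} << 63));
}

// The two conversion examples from the commit message.
static_assert(CrFromFlags(false, true, true) == 0x4000000100000001ULL,
              "!EQ, GT, LT");
static_assert(CrFromFlags(true, true, false) == 0x0000000100000000ULL,
              "EQ, GT, !LT");
```

Because bit 32 is always set by the flag conversion, the value is never zero, so a signed comparison against zero still yields GT even when the lower 32 bits are zero (EQ). All three tests are plain comparisons or bit tests on one register, which is consistent with the commit's goal of updating CR without branching.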