wolfwings: (Default)
[personal profile] wolfwings

I've been working on building an SSH client, focussing on only supporting SSH 2.0, and ignoring all unneeded features and backwards-compatability options so this will fundamentally be a Java ME MIDP 2.0/CIDP 1.1 SSH 2.0-only client; no port forwarding, no X11 forwarding, no strange keyboard or terminal mappings just multiple shells on Linux servers running recent distro's.

Anyways, I've gotten to the Diffie-Hellman Key Exchange part, and I hunted for hours to find a good implementation of the math that didn't gobble up scads of memory and run way too slow for use in Java on my BlackBerry. It was a deep-recursive function, that buried itself as many layers deep as the numbers it was working with were wide.

But... there was a saving grace here, I unwound the function-calls, and verified that it was actually a linear test. It just had to start at the MSB of one operand, and run down the line to the LSB performing a core loop with one optional branch. Huzzah, bit-wise operands can convert this to a loop!

Well... I built and tested the recursive function against this new version, woo, it worked. Then I realized this may be a bit of code that others would find useful. So... posting it here for now in case anyone else needs/wants it. It's fully C99-compliant, and I'll update this post when I make a Java version, since the built-in function for this isn't available in Java ME as far as I can find.

/* Non-recursively calculates X to the power of Y, modulus N. */
unsigned long long int XpowYmodN(
                unsigned long long int x,
                unsigned long long int y,
                unsigned long long int n) {
        unsigned long long int accum, slice;
        int i;

        /* Find the single highest bit set of Y.
         *
         *      This is done with by bitshifting and boolean-ORing
         * Y against itself to 'bit fill' all the way to the LSB.
         *
         *      We then increment the accumulator to convert to a
         * single-bit value we use as a bitfield mask from then on.
         *
         *      Note that we initially bit-shift Y one bit to the
         * right to avoid an overflow condition, as we'd have to
         * bit-shift the result one to the right otherwise.
         */
        slice = y >> 1;
        i = 1;
        for (;;) {
                /* Stop when I is large enough to zero out
                 * the number with a bitwise right-shift. */
                accum = (slice >> i);
                if (accum == 0) {
                        break;
                }
                slice |= accum;
                i += i;
        }
        slice++;

        /* This is based on power chaining over modulus N.
         * Normally this is a recursive computation, this
         * construction reduces it to a tight inner loop
         * that most compilers can keep in-register. */
        accum = x % n; /* The 'Y == 1' deepest starting point. */
        for (;;) {
                slice >>= 1;
                if (slice == 0) {
                        break; /* Out of bits to process. */
                }
                accum = ((accum * accum) % n);  /* Always a power-2 cycle. */
                if (y & slice) {                /* The optional *X cycle. */
                        accum = ((accum * x) % n);
                }
        }

        return accum;
}

(no subject)

Date: 2009-08-26 05:20 pm (UTC)
From: [identity profile] aerowolf.livejournal.com
I wonder if any of the crypto libraries would find this useful -- BouncyCastle, OpenSSL, Crypto++?

(It does help me. Thanks. :))

(no subject)

Date: 2009-08-26 05:33 pm (UTC)
From: [identity profile] wolfwings.livejournal.com
All of them will have a function like this, it's simply required for RSA, Diffie-Hellman, etc.

The problem is that I don't want/need a heavyweight 'do everything for me' library. As this is a Java applet for use on a cellphone, I need a stripped-down 'only what is absolutely required to work' set of functions to build things from.

And trying to crowbar out only the fragments from the heavyweights like BouncyCastle and/or OpenSSL that I need is harder than writing my own versions from scratch, and usually what I'd end up with anyways.

BC, for instance, appears (from what I've seen) to rely on the JCE extensions, which are a huge framework designed to handle PKCS and everything else. All I need is Diffie-Hellman, SHA1, and AES-128-CBC to get SSH-2.0 running. For separating out the layers of the SSH protocol, Java's nearly ideal, but I simply can't afford the load of going through ten layers to get SSH running, the BlackBerry doesn't have the grunt.

And I'll be honest, I'm not targetted corporate/certified-level stuff. If someone cracks/hacks my BlackBerry, I'm toast. Timing attacks and all that, I'm sure do and will exist in my code. But it'll let my securely go over my cell-phone link to get to a remote Linux shell, and work as good or better than if I was logged in locally to the server.

Hell, I'm doing one thing I'm unaware of any other GUI-ish SSH client doing before now too: Specifically building around the idea of multiple seperate shell's (connections) running over the single SSH transport, so it'll have multiple terminal's all cached remotely so there's no latency in flipping between them like using a server-side solution such as Screen.

(no subject)

Date: 2009-08-28 11:50 pm (UTC)
From: [identity profile] uplinktruck.livejournal.com
Light years over my head, but I can still hand code an 8085 or Z80 processor.

Style Credit