Data Mine THIS!

Started by Random Probability, July 15, 2013, 11:44:27 PM

Random Probability

Hello again.  Been a while.  How are you all doing?  ECH, gratz on the new rectum.  Weaponized.  Very impressive.

Once, long ago, I started looking for either a Tar Baby or a Briar Patch (preferably both). Cain might remember.  Something...  I wasn't looking for something to smash the system so much as gum up the works.

About that time I stopped coming by and posting because of reasons.  Now that those reasons are well known and widespread I think it's time to discuss what to do about them.

One of the biggest problems facing intelligence analysts is information overload.  In the broader sense, this is a vexing problem: separating the needles from the haystack.  A world where governments collect every bit of internet interaction they can find on everyone, everywhere, makes for a very big haystack indeed.  With enough data they can know what a specific individual might do before that person ever knows they want to do it.  But that condition only obtains when all of the data is true and relevant.  With an infinitely powerful magnet it is simple to pull those needles out of that haystack.

But what if it wasn't a haystack, but a pile of needles?

I've been poking around looking for a plugin or program that can do this, but I haven't found it yet.  What I'm thinking of is something that would work a bit like stumble.  You set a profile of "interests" and it goes off and visits a bunch of sites for you.  It crawls all over the web, pretending to be you looking at... whatever.  A well crafted program could do something clever, such as look at what you're really looking at and then scurry off and hit opposing points of interest.  The end result is that the folks gathering data, if they ever have cause to look at YOU directly, will find that their amazing database of magical everything is really just a pile of steaming fecal matter.  Moreover, if enough people use it, it will impact the statistical and traffic-analysis models in interesting ways.
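
A rough sketch of the "opposing points of interest" idea, in Python. Everything named here is made up for illustration: the `OPPOSITES` table, the `DECOY_SITES` placeholder URLs, and the `plan_decoys` function. It only builds a list of pages to visit and does not fetch anything.

```python
import random

# Hypothetical map from a topic you really read about to topics that point elsewhere.
OPPOSITES = {
    "gardening": ["motorsport", "cryptocurrency"],
    "radio": ["knitting", "opera"],
}

# Hypothetical decoy pages per topic (placeholder URLs, not real sites).
DECOY_SITES = {
    "motorsport": ["https://example.com/motorsport/1", "https://example.com/motorsport/2"],
    "cryptocurrency": ["https://example.com/crypto/1"],
    "knitting": ["https://example.com/knitting/1", "https://example.com/knitting/2"],
    "opera": ["https://example.com/opera/1"],
}


def plan_decoys(real_topics, per_real_visit=2, rng=None):
    """Return decoy URLs to visit for the topics actually browsed."""
    rng = rng or random.Random()
    plan = []
    for topic in real_topics:
        opposing = OPPOSITES.get(topic, [])
        if not opposing:
            continue  # nothing known to offset this topic with
        for _ in range(per_real_visit):
            decoy_topic = rng.choice(opposing)
            plan.append(rng.choice(DECOY_SITES[decoy_topic]))
    return plan


if __name__ == "__main__":
    for url in plan_decoys(["gardening", "radio", "gardening"]):
        print(url)
```

A real tool would still need something to observe the actual browsing and something to load the pages, which is where the browser shell or extension would come in.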

My background is in RF and things.  I could design you a true random generator, but I have no idea how to go about coding something like this.  I'd imagine it would have to be a shell or launcher for another browser, or something.  Does anyone have an idea of how to pull something like this off, or if it's even possible?

Anyway, that's all I have for now.  The rest of my thoughts are just cock, repost, and fail.

Cheers!

Q. G. Pennyworth

It could probably work as a browser extension for firefox and chrome, but I don't know enough about programming either to be useful :(

The Johnny


to distort the analysis you have to obfuscate two things:

-recurring themes
-recurring keywords

so at least your "auto generated" information would have to be twice as much as your real actions... the problem is that it's impossible to code a truly random generator, there will absolutely always be patterns, and code can be reverse engineered or have its behavior inferred...
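
The "there will always be patterns" point does hold for ordinary seeded pseudo-random generators: the same seed replays the same sequence. A small Python illustration, with the seed value and the `draw` helper chosen arbitrarily for the example. `random.SystemRandom` reads from the operating system's entropy source instead of a seed, which is harder to replay, though that is not a claim of "true" randomness.

```python
import random


def draw(rng, n=5):
    """Draw n integers in [0, 99] from the given generator."""
    return [rng.randrange(100) for _ in range(n)]


# Two generators with the same seed produce identical output:
# anyone who recovers the seed can replay the "random" choices.
a = draw(random.Random(2013))
b = draw(random.Random(2013))
print(a == b)  # True

# SystemRandom ignores seeds and reads from the OS entropy source.
c = draw(random.SystemRandom())
print(c)
```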

how do you fake email conversations? how do you fake facebook interactions? you can't.

you should look into programs that do qualitative analysis and you will know what i mean... they go for around $450 and anyone can buy them.... now imagine how a military grade qualitative analysis program with infinite computing capacity would work and how efficient its filters and heuristics can actually be.

you can't stop anyone with the right training, determination and tools from finding out what you are all about.
<<My image in some places, is of a monster of some kind who wants to pull a string and manipulate people. Nothing could be further from the truth. People are manipulated; I just want them to be manipulated more effectively.>>

-B.F. Skinner

Mesozoic Mister Nigel

I am intrigued! Johnny has a point, but one thing it could accomplish, if enough people used it, is to make widespread surveillance more expensive and time-consuming.
"I'm guessing it was January 2007, a meeting in Bethesda, we got a bag of bees and just started smashing them on the desk," Charles Wick said. "It was very complicated."


LMNO

This. Making brute force hacks unwieldy. I have a feeling that even with quantum computing, Our Goddess may still prevail.

Pergamos

Quote from: The Johnny on July 16, 2013, 02:34:32 AM

how do you fake email conversations? how do you fake facebook interactions? you can't.

You can fake e-mail and facebook pretty easily actually.  Spambots do that all the time.

LMNO

Sadly, those are REAL ignorant assholes with an Internet connection.

Mesozoic Mister Nigel

Quote from: LMNO, PhD (life continues) on July 16, 2013, 03:35:07 AM
This. Making brute force hacks unwieldy. I have a feeling that even with quantum computing, Our Goddess may still prevail.

I'm OK with just being kind of a pain in the ass.


The Johnny

Quote from: Pergamos on July 16, 2013, 04:02:44 AM
You can fake e-mail and facebook pretty easily actually.  Spambots do that all the time.

It's not just a matter of registration, because what really matters is the content... if content has recurrences it can easily be filtered out.
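
One way to picture "recurrences get filtered out" is near-duplicate detection on word sequences. The Python sketch below is an assumption about how such a filter might look, not a description of any real analysis product: the function names, the 3-word shingle size, and the 0.5 threshold are all chosen for the example.

```python
def shingles(text, n=3):
    """Return the set of n-word sequences in a message."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Overlap between two shingle sets, from 0.0 to 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def filter_recurring(messages, threshold=0.5):
    """Keep a message only if it is not too similar to one already kept."""
    kept, kept_shingles = [], []
    for msg in messages:
        s = shingles(msg)
        if all(jaccard(s, seen) < threshold for seen in kept_shingles):
            kept.append(msg)
            kept_shingles.append(s)
    return kept


msgs = [
    "hey are we still on for lunch on friday",
    "check out this great deal on cheap watches today",
    "check out this great deal on cheap phones today",
    "the antenna needs retuning before the weekend",
]
for m in filter_recurring(msgs):
    print(m)
```

The two templated "great deal" messages share 5 of their 9 distinct three-word sequences, so the second one is dropped. Generated filler built from a small set of templates would be caught the same way.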

The Johnny


I could go into detail as to how a quantitative/qualitative analysis would function, but only if someone is interested.

Junkenstein

Quote from: The Johnny on July 16, 2013, 04:28:08 AM

I could go into detail as to how a quantitative/qualitative analysis would function, but only if someone is interested.

I'm someone and I'm interested.
Nine naked Men just walking down the road will cause a heap of trouble for all concerned.

hirley0


Left

Quote from: hirley0 on July 16, 2013, 08:43:30 AM
20130716

Hirley, you made it move... :mrgreen:

I've been sticking the following as my signature on most of my emails:

ap-Stun, stakeout,, Oscor, Merlin, Earth first NTT, SL-1, Rolm, TIE, Tie-fighter, PBX, SLI, NTT, MSCJ, Time, MSEE, Cable & Wireless, CSE, Embassy, Ruby Ridge ETA, Porno, Fax, finks, Fax encryption, fertilizer, white noise, pink noise, CRA, M.P.R.I., top secret, Mossberg, 50BMG,  Unix Security, VIP Protection, SIG, sweep,
NSA Reader Person, you want a PIZZA.  Big, yummy, just dripping with hot cheesy goodness.  You can have it ordered NOW....YUMM...Pizza!, sweeping, TELINT, Audiotel, improvised c4 Harvard, gasoline, 1080H, SWS, Asset,  Satellite imagery, force, Cypherpunks,, halliburton Coderpunks eternity server, Skytel, Yukon ,pressure-cooker,hezbollah Templeton, LUK, Cohiba, Soros, Standford, niche, 51, H&K, USP, ^, sardine, bank, EUB, USP, PCS, NRO, Red Cell, Glock 26, ELF ,snuffle, 50. cal.,  package,credit card, b9, fraud, assasinate,Waco, U-235, virus, anarchy, rogue, mailbomb, 888, Chelsea, MOD, York, plutonium, , explosives, advise, TUSA, HoHoCon, MSW,WORM, MP5K-SD, 1071, WINGS, cdi, DynCorp, UXO, Ti, THAAD, plutonium, package, chosen, PRIME, SURVIAC

...Yes, I am trying to make the NSA a fatter organization...
Hope was the thing with feathers.
I smacked it with a hammer until it was red and squashy

Cramulus

Huh, great minds! I was just talking about this in #discord the other week. I am concerned with both the NSA spying and with the vast and creepy marketing resources which people are collecting under our noses.

The commercial data market is pretty sketchy too. All your commercial transactions, amazon.com visits, facebook likes, many of your web page visits, credit card purchases, etc etc etc, are being recorded and sold between about six giant shadowy companies. These guys make several billion a year trading info with each other. And you don't have the right to access it or any way to control it. It's data about you but it's not yours.

Even before PRISM saw daylight, I was worried about that commercial data market. I just get this sick feeling that there are a hundred black swans nesting there. Especially now that they are perfecting demographic prediction heuristics.

Marketers have the ability to determine arcane and minute things about you based on kind of arbitrary activity. I mean, there are certain topics that if you "like" on facebook, they can predict with certainty whether or not you are a cigarette smoker. Or that your parents separated before you turned 21. The prediction criteria may not even be related to the topic! In that link above, check out how the criteria which peg you as a cigarette smoker or drug user aren't even related to drugs or health. You don't know what picture of you your data paints.

So I was thinking, wouldn't it be cool to have a script running in the background which randomly visits amazon.com URLs? Randomly donates a few pennies to some random Kickstarter? Randomly "likes" then "unlikes" some things on FB? Randomly sends mail to a dummy account using certain key words? Randomly googles keywords related to some interest you don't have?
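
A minimal dry-run sketch of that background script in Python. It only produces a timetable; the action names in `ACTIONS` are hypothetical labels, each of which would need its own real implementation, and the exponential spacing between actions is an assumption made so the activity doesn't tick at a fixed, easily spotted interval.

```python
import random

# Hypothetical action names; none of them does anything in this sketch.
ACTIONS = [
    "visit_shop_page",
    "search_unrelated_keyword",
    "like_then_unlike",
    "mail_dummy_account",
]


def schedule(n_actions, mean_gap_s=600.0, rng=None):
    """Return (seconds_from_start, action) pairs with irregular spacing."""
    rng = rng or random.Random()
    t = 0.0
    plan = []
    for _ in range(n_actions):
        # Exponentially distributed gaps with the given mean, in seconds.
        t += rng.expovariate(1.0 / mean_gap_s)
        plan.append((round(t, 1), rng.choice(ACTIONS)))
    return plan


if __name__ == "__main__":
    for when, action in schedule(5):
        print(f"{when:>8.1f}s  {action}")
```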

I don't think these sorts of tricks would protect you if somebody was paying really close attention to you. But they might foil broad prediction heuristics and complicate datamining.

I've been saying it for years -- the trick to privacy is chaff -- it's not about hiding your real activity, it's about disguising it behind a smokescreen of fake activity. It's not about less signal, it's about producing more noise.
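
The "more noise" idea can be put in rough numbers by measuring how evenly a visit log is spread across topics. The Python sketch below uses Shannon entropy for that; the topic names and counts are invented for the example, and entropy is just one convenient measure, not a claim about what any data miner actually computes.

```python
import math
from collections import Counter


def topic_entropy(visits):
    """Shannon entropy (bits) of the topic distribution in a visit log."""
    counts = Counter(visits)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Invented logs: 10 real visits dominated by one interest, plus 20 decoy visits.
real = ["radio"] * 8 + ["politics"] * 2
chaff = ["knitting", "opera", "motorsport", "cooking"] * 5

print(round(topic_entropy(real), 3))
print(round(topic_entropy(real + chaff), 3))
```

In this toy log the real top interest drops from 8 of 10 visits to 8 of 30 once the chaff is mixed in, and the entropy rises accordingly. Nothing real was hidden; it just stopped standing out.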

Junkenstein

Cram, that link is excellent. Pages 11 onwards are gold. Also helps illustrate exactly what metadata is useful for.

Will read the full thing later on. Thanks!