Forum:Importing Wiki

import license notices

As you'll see, I've used the boilerplate added to the bottom of imported pages (using Ontology as an example) as such:




...I did this to reflect that the page may or may not be a verbatim copy.

A friend of mine, who has looked at Wikinfo from time to time (and is not a regular WikiEditor), said "why edit on Wikinfo, when at the bottom of the page it says 'Adapted from Wikipedia'?" The perception (and a correct one, 99.998% of the time) was that it was just a verbatim, possibly out-of-date, copy from Wikipedia, and I'd like the perception here on GetWiki to be much more flattering, while still falling in line with the GNU FDL (because our articles should not be verbatim, out-of-date copies!). We want potential editors to think, and rightly, that text here is NOT a verbatim copy of either Wikipedia or Wikinfo articles, and that they are welcome to edit here as a totally separate space from them. If you think there is a better way to accomplish this with the boilerplate, say so. If changes are needed, we can change all these notices now, before there are too many to change :) -proteus 17:26, 23 Mar 2007 (EDT)

JA: In the early phases of the Citizendium project, when the plan was to fork Wikipedia whole hog, many Wikipedians were saying that anything Citizendium did better would just get merged back into Wikipedia. Never mind all the reasons why none of that ever happened, but it did lead me to devise an import notice that could be used in such situations, to wit:
Portions of the above article are adapted from an earlier version of the Pseudopedia article, "Logical graph" (weblink), used under the GNU Free Documentation License.

This still seems to cover a lot of the possible bases. Jon Awbrey 22:22, 23 Mar 2007 (EDT)

Hmmm, the only problem is that it sounds even more like the article is outdated, or based on an outdated version, whether it is or not. Of course, many things we're likely to be importing don't actually go out of date, but it's beside the point: We're in the realm of perceptions, here. BTW, here's another area where discussion is welcome - when we set up this stuff on Wikinfo, Fred simply demanded that it be a certain way, to the letter. Meh. -proteus 22:47, 23 Mar 2007 (EDT)

JA: I didn't read it that way. "Some" and "portions of" are near synonyms. But your "may have" is not good for a quasi-legal notice. We need to say what is. "Adapted from" means "changed" in the current site but not necessarily updated at the former site -- and this suggests "improved" at this site, or else why change it at all. So it conveys the idea that the current version is improved, and if you know anything about Wikipedia, it's just as likely that the Wikipedia version has degenerated over time. Jon Awbrey 23:32, 23 Mar 2007 (EDT)

JA: I do not know what's going to happen at Wikinfo, so it seems that I will be developing stuff here for the immediate future. If WI does get on its feet again, I may want to export improved content back there again, and so I need a notice that is logically symmetric. There's no need to limit the flow of exchange to one direction. Jon Awbrey 23:32, 23 Mar 2007 (EDT)

JA: Further, using this form consistently means that we can request it of any other GFDL sites that borrow content from here. Wikipedia has been notoriously bad about plagiarism in the past, but we can demand that they credit anything that they borrow. Jon Awbrey 23:32, 23 Mar 2007 (EDT)

It was the "earlier version" part that triggers the impression of "out dated". I'm not sure we can require much of anything of WP, as the mob has a mind of its own ;) -proteus 00:34, 24 Mar 2007 (EDT)

JA: "Portions adapted from an earlier version" is descriptively accurate. If someone assumes that all WP articles are monotonically increasing in quality, and in a way that outpaces anybody else's adaptations and revisions, then "earlier" means "less good" to them. That is a person who is assuming that WP is the gold standard, and we cannot help that kind of ignorance. Jon Awbrey 00:54, 24 Mar 2007 (EDT)

(BTW, sorry to keep indenting, I know you hate it) Anyway, I'm thinking more in terms of constant change to one of our articles. Some of the ones I edited on Wikinfo, for example, I'd so drastically changed, the only things in common would be words like "the" and "and" and some verbs and nouns. "Adapted" no longer covers it, and implies that the changes were minor - it's like the Ship of Theseus problem ;) -proteus 01:10, 24 Mar 2007 (EDT)

JA: Citizendium, that just went live, is using the notice: "This article uses content that originally appeared on Wikipedia." Jon Awbrey 23:18, 25 Mar 2007 (EDT)

Yeah, it's that word "earlier" which sounds funny, and "a version" isn't flattering, either. "Adapted from a version of" is kinda okay, and "uses content that originally appeared" is fine, if wordy. Maybe I'm stubborn, but "some content may have been adapted from" seems fine, or "some content adapted from". Like it really matters - the only thing required is the actual link, and even that is debatable. Let's decide on one version that can go on all such pages...maybe just "some content adapted from..."?? -proteus 00:10, 27 Mar 2007 (EDT)

How about: "portions adapted from the (import) article "X" under the..." -proteus 21:50, 27 Mar 2007 (EDT)

You've been using "some content adapted...", so let's go with that. I'll update the import settings, and then all the articles affected. If not for style, then to indicate the small importance at the bottom of an article, I'll go with a small caps "notices" (not "references" or "external links", which are confusing in relation to actual content), a capitalized "Some", avoid a period with the bullet-point, and use the newer "ur.php" link usage without the "e=" variable. It could even be italicized or made a smaller font...whatever the final form (below), it needs to be consistent across all imports. Edit the specimen below as you see fit, if you disagree... -proteus 11:35, 2 Apr 2007 (EDT)

JA: I wasn't really thinking about this in general, just trying to make sure that what I wrote was accurate on a case by case basis. For now I am only importing the versions of articles that I have personal acquaintance with and some measure of confidence in, as most of the ones that I worked on got seriously degraded by subsequent edits after I left, IMHO. Seems like the "Some content" bit almost always works, though, so long as nobody gets the idea that it's from the most current version. "Notices" seems okay -- can't use "references" because it conflicts with real references -- but the level 4 heading causes it to get outlined under the last heading of the article, which seems confusing. I would like to keep a level 2 heading, and in the same format as the others. Small print is geriatric-challenged, and no trees are saved by using it. And there needs to be a period at the end of the notice, which is kind of an elliptic sentence, I think. Jon Awbrey 18:04, 2 Apr 2007 (EDT)

I can live with the period (complete sentence or not). Also, nothing says it has to be a level X header. It can just be a bold heading, or even no heading at all, so I agree with that. In fact, the horizontal line above it might be all that's necessary, just to separate from the content. With CSS, it could also be set off as a box. Further, "N/notices" sounds too alarming, anyway. Then the bullet comes into question, and the sentence could just be italic under a horizontal line, for example (again, change below as needed). -proteus 18:25, 2 Apr 2007 (EDT)

JA: Zounds goot! Now finally we can get back to that annoying world peace thing. Jon Awbrey 21:12, 2 Apr 2007 (EDT)

Oh yeah, world peace... ;) -proteus 01:39, 3 Apr 2007 (EDT)
- using this list to change all of them...

Some content adapted from the Wikinfo article "Ontology" under the GNU Free Documentation License.
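Since the notice has to be identical across all imports, one way to keep it that way is to generate it rather than type it. The sketch below is only an illustration of the wording settled on above; the function name `import_notice` and its parameters are made up for this example and are not part of GetWiki's import settings.

```python
def import_notice(source_site: str, article_title: str) -> str:
    """Build the agreed one-line import notice for an imported article.

    Hypothetical helper: the wording follows the specimen in this thread,
    with a capitalized "Some" and a closing period.
    """
    return (
        f'Some content adapted from the {source_site} article '
        f'"{article_title}" under the GNU Free Documentation License.'
    )


if __name__ == "__main__":
    print(import_notice("Wikinfo", "Ontology"))
```

Calling it with "Wikinfo" and "Ontology" reproduces the specimen line above, and the same call with "Wikipedia" gives the matching notice for articles imported from there.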

xml errors from wikipedia

There are many XML errors in the feed from Wikipedia. I'm looking into it, but looking at the source on their site here and there reveals that garbage ("binary") characters, or non-internet-friendly language characters, are involved. This happens when people cut-n-paste from something ugly like M$ Word, and then in XML, it breaks, because XML must be well-formed (and standards are something Wikipedians seem incapable of respecting, but don't get me started). This will be most prevalent in language, mathematics, and other articles. It is not a problem with GetWiki or your browser - it's Wikipedia.

So, if you want to import a Wikipedia article and it's showing any "invalid character" error at all, simply go to Wikipedia and get the source from there directly, then paste it here, or import the Wikinfo version (which is unlikely to have the problem, since it was imported there by the same method). While you're at it, try to correct whatever character(s) is/are causing the problem from Wikipedia. Unless I discover otherwise, there is no other workaround for this, and it would have to wait until the article is updated by someone knowledgeable on Wikipedia. -proteus 11:43, 24 Mar 2007 (EDT)

This is fixed (or at least, "corrected"), and was a simple problem with UTF-8 decoding of the WP exports. You'll now see all the garbage characters which are in their articles (which were throwing the errors before), so please, fix them as you find them! -proteus 19:08, 12 Apr 2007 (EDT)
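For anyone cleaning an export by hand, here is a rough Python sketch of the two steps described above: decode the raw bytes as UTF-8 so that bad byte sequences show up as visible replacement characters instead of throwing an error, then locate (or strip) the characters that XML 1.0 does not allow. This is not the code GetWiki uses; the function names are invented for the example, and only the standard library is assumed.

```python
import re

# Characters XML 1.0 forbids: C0 controls other than tab, LF and CR,
# the surrogate range, and the noncharacters U+FFFE and U+FFFF.
INVALID_XML_CHARS = re.compile(
    r"[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]"
)


def decode_export(raw: bytes) -> str:
    """Decode as UTF-8; undecodable bytes become U+FFFD instead of raising."""
    return raw.decode("utf-8", errors="replace")


def find_invalid_chars(text: str) -> list[tuple[int, str]]:
    """Return (offset, code point) for each character XML 1.0 rejects."""
    return [
        (m.start(), f"U+{ord(m.group()):04X}")
        for m in INVALID_XML_CHARS.finditer(text)
    ]


def clean_for_xml(text: str) -> str:
    """Drop the characters that would make the XML ill-formed."""
    return INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    # Hypothetical sample: a stray vertical tab and an invalid byte.
    sample = decode_export(b"caf\xc3\xa9 \x0b ok \xff")
    print(find_invalid_chars(sample))
    print(clean_for_xml(sample))
```

The replacement character U+FFFD is left in place on purpose, since it marks where the source article still needs a human to fix the text.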

categories and citations

While importing, please leave the category coding intact, as I will be working on a solution (I have been known to remove this, but will no more). With the "cite" coding (which currently displays as blank space), use your own judgement for the article. Certainly "cite"ations are not critical on GetWiki, if none of our pages have them, but a simple solution whereby the basic text is displayed may be coming soon. -proteus 11:48, 24 Mar 2007 (EDT)

for now, this:

NEWS,weblink GetWiki, 1234, 1234, rimric, proteus,
becomes this:

NEWS,weblink GetWiki, 1234, 1234, rimric, proteus,

Finally, I've removed the cite brackets and just left the raw citation intact, and as discussed, Categories are now handled outside the page code itself. So, when importing, you can now delete the category code and leave the citations intact. -proteus 13:15, 9 Nov 2007 (EST)

titles of articles

Okay, let's also settle this question. I think that something like the word "Philosophy" should always be capitalized, thus, an article called "Contemporary philosophy" is just bad form, and should be "Contemporary Philosophy". Something like "Boolean domain" is maybe another question, but it still looks wrong to me (in a title of an article). Anything which is a title of a theory should also be properly capitalized, like "Systems Theory" or "Semiotic Information Theory". Again, it doesn't matter how they do it on Wikipedia (and I wonder why they didn't just continue their bad habit to "Charles peirce" and "Immanuel kant"??? See how stupid that looks??), but anyway, just think of an article here as the title of a book or journal article, where basic capitalization is observed. Is anyone going to be upset if I start moving articles and updating links?? I know it seems anal, but I'm serious about building this wiki the right way - based on my experiences with Wikinfo, I'm sick of hand-me-downs from Wikipedia ;) -proteus 11:57, 2 Apr 2007 (EDT)

JA: I'm just going by the rules that I learned in High School (high school?), and I couldn't care less (colloq. cliche) what they do in Wikiputia. A term like temporary philosophy is a common noun, unless it's the name of a book, Temporary Philosophy : Get It While It's Hot — no, I don't care what APA says either — or a course of study, "Temporary Philosophy 101", in which cases among others it's capitalized as a proper noun. And let's not even get into things like boolean domain and euclidean algorithm, which seem to suffer from a curious custom of honorific decapitation among mathematicians, depending in part on the degree of love inspired by the dearly departed and just how long the eponymous honoree has been dead and gone. Jon Awbrey 12:44, 4 Apr 2007 (EDT)

On this, a couple of questions:
  • What about a field on the import screen allowing a change of title before saving an article for the first time?
  • Are we okay with having a bunch of redirects in the database? Should they be filtered out of page counts and (Special:Allpages)?

titles/topics in text and headers

JA: All those caps just don't look like contemporary English to me. My personal preference has been to use all caps for project names, book titles, etc. and to use lower case for simple topic names of the sort that would not be capitalized in normal text. Jon Awbrey 11:15, 4 Apr 2007 (EDT)

I would sorta agree (though what does contemporary English mean?) in the case of ordinary prose, but much of this WikiStuff isn't ordinary prose. Much is encyclo-speak (which tends to be semi-formal), and another chunk is made up of lists of concepts/page titles, and so on (which are headers of titles as discussed above). If I were writing a book in Philosophy (which I'm known to do), I never allow words like "philosophy" or "logic" to be lowercase unless they are referred to as words (as just now) or if used in a passive way, like "Jon's philosophy on wikis is to not sweat over wikis". Any theory names, like Quantum Mechanics, or Systems Theory, or Set Theory, or Post-Modernism, and so on, I think should always be capitalized here, since we're rarely going to be using that passive sense, and these are pretty much all theory pages. I also think Logic of Information, Logical Conjunction, Triadic Relation, and so on should also be capitalized in this context, as concepts in semi-formal prose, both as titles and in text.

Now, I know I muddy this with my small caps headings (like on this page), but that's a style choice I try to avoid in encyclo-pages - and we can allow almost anything on non-encyclo pages or posted papers/ebooks, user pages, talk pages, etc. I would also agree that I have a tendency to capitalize things just a bit more than usual, which I do out of respect for intellectual ideas, I suppose. I've been pushing this capital point so that we could come to a consensus and have all our encyclo-pages follow an attractive, non-Wikipedian form, while posted ebooks, discussions, papers, reviews, and so on could follow any form the authors want to follow. Hope all that made sense... -proteus 12:34, 4 Apr 2007 (EDT)

JA: How about a definition by way of contrast? Here's an example of a non-contemporary English text:
Addressed to an Infidel Mathematician
It is examined whether the Object, Principles, and Inferences of the modern Analysis are more distinctly conceived, or more evidently deduced, than Religious Mysteries and Points of Faith.
By George Berkeley
Edited by David R. Wilkins


Though I am a Stranger to your Person, yet I am not, Sir, a Stranger to the Reputation you have acquired, in that branch of Learning which hath been your peculiar Study; nor to the Authority that you therefore assume in things foreign to your Profession, nor to the Abuse that you, and too many more of the like Character, are known to make of such undue Authority, to the misleading of unwary Persons in matters of the highest Concernment, and whereof your mathematical Knowledge can by no means qualify you to be a competent Judge. Equity indeed and good Sense would incline one to disregard the Judgment of Men, in Points which they have not considered or examined. But several who make the loudest Claim to those Qualities, do, nevertheless, the very thing they would seem to despise, clothing themselves in the Livery of other Mens Opinions, and putting on a general deference for the Judgment of you, Gentlemen, who are presumed to be of all Men the greatest Masters of Reason, to be most conversant about distinct Ideas, and never to take things on trust, but always clearly to see your way, as Men whose constant Employment is the deducing Truth by the justest inference from the most evident Principles. With this bias on their Minds, they submit to your Decisions where you have no right to decide. And that this is one short way of making Infidels I am credibly informed.

Source. George Berkeley, The Analyst, David R. Wilkins (ed.).

Yes, that's the kind of text I grew up reading, and it definitely influenced my usage (and spelling). Berkeley is also one of my favourites, and humour has to be taken into account when reading him, but the selection also proves one of my points, that shooting for "contemporary" English is a moving target and a fool's errand. For one thing, there's Canadian, US, British, South African, Australian and other usages which all might differ a bit. There has to be a middle point between the Wikipedian way, of avoiding capitals at all costs, and The Selection Above. I'm not even sure what you're objecting to... ;) -proteus 13:25, 4 Apr 2007 (EDT)
JA: from the late 60's until just recently i used to write everything in lower case, and i still do email that way today — this was partly due to the influence of some facsimile holograph editions of lewis carroll that i once read and partly due to programming in pascal, in which i sought relief from all that capital letter imperative bossiness of fortran — and it actually took me quite a lot of work re-learning how to write for publication. Jon Awbrey 13:50, 4 Apr 2007 (EDT)

JA: right now it's just a matter of eyestrain and saving the xcess redirs. i just really don't care all that much what other people do in their own writing though. Jon Awbrey 13:50, 4 Apr 2007 (EDT)

images from wikinfo

If you're pulling images from Wikinfo, their settings still seem to be messed up, whereby their images are there in the upload directory, but their software doesn't see them (Fred suggested they reupload all their images, like that's going to happen). So, you may have to go to the image page over there (for example, weblink), click on the file link, (change the directory from "images" to "upload" in the address bar if necessary), and then download the image for uploading over here (right click or drag, etc). I suggest you go ahead and bring all the images you want over here (related to the limited-range encyclopedia pages). The ones I recently uploaded for the Philosophy pages were ones I'd "mastered" and uploaded there, for example. It may be possible for me to generate an image import script for Wikinfo/pedia - if you're like me, importing images is a real pain... -proteus 12:09, 2 Apr 2007 (EDT)
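Until there is a real import script, the address-bar step above (swapping the "images" directory for "upload") can at least be automated. The following is a small sketch of that one step, not a working importer: the function name `swap_image_dir` and the example host are placeholders, and the actual Wikinfo paths may differ.

```python
from urllib.parse import urlsplit, urlunsplit


def swap_image_dir(url: str, old: str = "images", new: str = "upload") -> str:
    """Replace the first path segment equal to `old` with `new`.

    Only whole path segments are matched, so a file called
    "images.png" is left alone. URLs without the segment come back unchanged.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/")
    for i, segment in enumerate(segments):
        if segment == old:
            segments[i] = new
            break
    return urlunsplit(parts._replace(path="/".join(segments)))


if __name__ == "__main__":
    # Placeholder host and path, for illustration only.
    print(swap_image_dir("http://wiki.example.org/images/a/ab/Logo.png"))
```

The rewritten address can then be opened or downloaded in the usual way and the file uploaded here by hand.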

say "NO" to wikipedianism

Please clean up Wikipedianism in all imported articles. If not, they will be deleted. GetWiki is not a mirror, so look for pages that fit in and can be customized here. Make sure links are correct for this wiki, not theirs. Make sure that all text makes sense and that a page is worth saving here, based on what GetWiki is about. Many pages are not. -proteus 14:32, 28 Sep 2007 (EDT)

The notice below is now defaulted on imports, until further notice:
PLEASE ensure content you import is accurate, and where bias exists, put it in the proper context. Even on the pages I've edited in Philosophy, I've recently come across some "BS" content which is not welcome here. Do not mirror content here. Make GetWiki interesting by having stuff not found on the "pseudopedias"! -proteus 15:46, 11 Nov 2007 (EST)
