Sanity checks for the wh-question data for the CHILDES lab

I want to offer a couple of hints about the wh-question data for the CHILDES lab (section two), so you can check whether what you found is similar to what I found. Your exact numbers won’t match mine anyway, since so much rides on the choices you make about which utterances to include and which to exclude.

However: I found roughly 30 wh-questions in Nina’s early files, and a little under 200 in her later files. I wound up excluding around 20-30% of the utterances at each stage. Of the utterances that count (those where the wh-word is not the subject), very nearly all fell on one side of the divide between having and lacking a subject. In fact, all but one of about 150 utterances across all files, early and late, were of the same type.
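If you want to line your counts up against these figures, one way is to record each coding decision and let a short script do the tallying. The following is only a minimal sketch in Python, assuming you have hand-coded each wh-question as a (stage, excluded, has_subject) triple. The record layout, the names `coded` and `summarize`, and the rows themselves are made up for illustration and do not reflect Nina’s actual data.

```python
from collections import Counter

# Hypothetical hand-coded records: (stage, excluded, has_subject).
# These rows are made-up placeholders, not Nina's actual data.
coded = [
    ("early", False, True),
    ("early", True, None),   # excluded, so has_subject is left uncoded
    ("late", False, True),
    ("late", False, False),
    ("late", True, None),
]

def summarize(records):
    """Return per-stage totals, exclusion share, and subject/no-subject counts."""
    summary = {}
    for stage in sorted({r[0] for r in records}):
        rows = [r for r in records if r[0] == stage]
        kept = [r for r in rows if not r[1]]  # utterances that count
        tally = Counter("subject" if r[2] else "no subject" for r in kept)
        summary[stage] = {
            "total": len(rows),
            "excluded_share": (len(rows) - len(kept)) / len(rows),
            "subject": tally["subject"],
            "no_subject": tally["no subject"],
        }
    return summary

for stage, stats in summarize(coded).items():
    print(stage, stats)
```

With your own records in place of the placeholder rows, the totals and exclusion shares per stage are the numbers to compare against the rough figures above.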

So, if this is not approximately what you found, you probably missed something somewhere. There will be some variation, because there is an element of subjectivity in doing these counts (that’s not a good thing, but it’s kind of inescapable). But if your numbers differ from these by a lot, check with me before getting much further.