Astronomers Find a Waterworld Planet With Deep Oceans in the Habitable Zone – Universe Today

I recently came across an article titled “Astronomers Find a Waterworld Planet With Deep Oceans in the Habitable Zone”. Curious what they actually found, I clicked through to the article. It was about what I expected.

The entire subject of discovering exoplanets is one that does not fill me with confidence. I get the basic approach used, which is looking for the regular dimming of a star caused by a planet transiting in front of it as it orbits. And, indeed, you would expect a planet orbiting a star to (slightly) dim the light coming from that star if you’re lucky enough for the planet to pass right in front of it relative to us. That said, when I say slight, I mean slight. To put it into perspective, our sun has a diameter 109 times larger than the diameter of the earth. In terms of cross-sectional area, that means that the earth’s shadow is about 1/10,000th that of the sun’s. It will block out a little more of the sun than that, since it’s 93 million miles in front of the sun rather than right at its surface, but since we’re observing stars that are light-years away, it won’t be that much more. Jupiter, which is nearly as large as planets can get (as a gas giant’s mass goes up much past Jupiter’s, its gravity causes it to contract), would block out about 1/100th of the sun. So what astronomers are looking for is somewhere between a 1% dimming and a 0.01% dimming.
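
To make the arithmetic concrete, here is a minimal back-of-the-envelope sketch (mine, not the article’s) of those transit depths, ignoring the small geometric correction for the planet sitting in front of the star:

```python
# Rough transit depths: the fraction of a Sun-like star's light that a planet
# blocks is approximately (planet radius / star radius) squared.

SUN_RADIUS_KM = 696_000
EARTH_RADIUS_KM = 6_371
JUPITER_RADIUS_KM = 69_911

def transit_depth(planet_radius_km, star_radius_km=SUN_RADIUS_KM):
    """Fraction of starlight blocked by a transiting planet (geometric approximation)."""
    return (planet_radius_km / star_radius_km) ** 2

print(f"Earth-sized planet:   {transit_depth(EARTH_RADIUS_KM):.5%}")    # about 0.008%
print(f"Jupiter-sized planet: {transit_depth(JUPITER_RADIUS_KM):.2%}")  # about 1%
```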

Even less confidence-inspiring, when you look into the actual data, the stars in question are generally around one pixel across in the images that they’re using. This isn’t always the case, of course, but the stars are never more than a few pixels. In the article in question, when the researchers turned to a much higher resolution telescope, they were able to distinguish the two stars of the binary system where the “waterworld” orbits the larger of the two within the habitable zone. (If you’re not familiar, the habitable zone of a star is the range of distances at which the heat from the star would allow liquid surface water, as we have here on earth. Too close and the planet will be too hot and the oceans will boil off; too far and they will freeze.) Oh, and these two stars are orbiting each other separated by roughly twice the distance between the sun and Pluto. And the high resolution telescope was able to make them out as two distinct sources of light.

No one has ever seen this supposed “water world”. What we have is a periodic dimming of the host star. From the magnitude of that dimming we can calculate the size of the thing crossing in front of it. From the time between dimmings we can calculate the orbital period and thus the distance from the star. The mass doesn’t come from the dimming at all: it’s inferred from the tiny wobble the planet’s gravity induces in the star’s motion (radial velocity measurements). Combine the inferred mass with the inferred size and you get the density.
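
To illustrate that last step, here is a minimal sketch of the density calculation. The inputs are roughly the figures I’ve seen reported for TOI-1452 b (about 1.7 Earth radii from the transit depth, about 4.8 Earth masses from the radial velocity measurements); treat them as approximate and illustrative rather than authoritative:

```python
import math

# Illustrative inputs, roughly the reported values for TOI-1452 b (approximate).
EARTH_RADIUS_M = 6.371e6
EARTH_MASS_KG = 5.972e24

planet_radius_m = 1.67 * EARTH_RADIUS_M  # inferred from the transit depth
planet_mass_kg = 4.8 * EARTH_MASS_KG     # inferred from the star's radial-velocity wobble

volume_m3 = (4 / 3) * math.pi * planet_radius_m ** 3
density_g_cm3 = (planet_mass_kg / volume_m3) / 1000  # convert kg/m^3 to g/cm^3

print(f"Inferred bulk density: {density_g_cm3:.1f} g/cm^3")
# Comes out near Earth's 5.5 g/cm^3, but a planet of several Earth masses made
# purely of rock would be compressed to a noticeably higher density, which is
# why a value like this gets read as "lighter ingredients, perhaps deep water."
```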

That density figure is where the claim of a “water world” came from, by the way. The density of the planet that was detected is too low for a rocky planet like earth, and too high for a gaseous planet. Since it’s in the habitable zone of its star, it’s unlikely to be icy, and so it is a good candidate for being a water world. This in no way justifies calling it a water world, nor does it justify the artist’s rendition of what its surface might look like that’s in the article (which is just a picture of the sun setting over the ocean here on earth). It also doesn’t justify the Star Trek-like artist’s rendition of the planet near a sun-like star. The star that the planet is orbiting is a red dwarf. They’re called red dwarfs because they don’t put out white light like our sun does. If you look up TOI-1452 A (the red dwarf star; TOI-1452 b is the planet), it has a surface temperature of 3,185 K. It’s not that it puts out literally no blue light, but it puts out very little. This is the dingy yellow-orange light of a low-wattage “warm white” incandescent bulb. Oh, and the star only puts out 0.7% of the light that our sun does.
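
That 0.7% figure is easy to sanity-check with the Stefan-Boltzmann scaling, luminosity ∝ radius² × temperature⁴, using the roughly quarter-of-a-solar-radius size reported for TOI-1452 A (again, take the inputs as approximate):

```python
# Luminosity relative to the Sun scales as (R / R_sun)^2 * (T / T_sun)^4.
SUN_TEMP_K = 5772

star_radius_solar = 0.275  # roughly the reported radius of TOI-1452 A, in solar radii
star_temp_k = 3185

luminosity_vs_sun = star_radius_solar ** 2 * (star_temp_k / SUN_TEMP_K) ** 4
print(f"Luminosity relative to the Sun: {luminosity_vs_sun:.1%}")  # about 0.7%
```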

These sorts of articles really annoy me because they pretend to an enormous amount of certainty that we don’t have. What’s actually going on is a little bit of data and a whole lot of calculations. This is interesting, but it does a great disservice to people to pretend that what we have is a lot of data. We don’t.

Moreover, these are all unverified calculations. No one alive today is ever going to set eyes on a photograph of one of these planets to get an independent source of data about their size or composition, or even their existence. It took nine years for the New Horizons probe to fly out to Pluto. Here’s the best picture Wikipedia has of Haumea, a dwarf planet in our solar system:

Haumea is only about 10 AU further away from the sun than Pluto is. (An AU is the distance from the earth to the sun.) Here’s Eris, which is more massive than Pluto, though not quite as large, and which is much further away:

Eris is, at its farthest, about twice as far away from the sun as Pluto. And this is the best picture that we have of it. (Or at least it’s the best picture that Wikipedia has.)

If this is the best that we have of dwarf planets in our own solar system, it suggests that a bit of humility is warranted when it comes to conclusions about planets orbiting other stars. Our galaxy is a big place. There’s no reason to suppose that exoplanets are the only things that can regularly cause a slight dimming of a star’s light. That’s not to say that there’s something wrong with going with what we know: saying that if the slight regular dimming is caused by an exoplanet, then the exoplanet would have such and such properties. If people are going to get tired and drop the “if”, then perhaps it would be better to stop talking about the subject at all.

Dozens of Major Cancer Studies Can’t Be Replicated

I recently came across an interesting article in Science News on widespread replication failure in cancer studies. It’s interesting, though not particularly shocking, that the Replication Crisis has claimed one more field.

If you’re not familiar with the Replication Crisis, it has to do with how it was widely assumed that scientific experiments described in peer-reviewed journals were reproducible: that is, that if someone else performed the experiment, they would get the same result. Reproducibility of experiments is the foundation of trust in the sciences. The theory is that once somebody has done the hard work of designing an experiment which produces a useful result, others can merely follow the experimental method to verify that the result really happens. After an experiment has been widely reproduced, people can be very confident in the result, because so many people have seen it for themselves and we have widespread testimony of it. Or, indeed, people can perform these experiments themselves as they work their way through their scientific education.

That’s the theory.

Practice is a bit different.

The problem is that science became a well-funded profession. The consequence is that experiments became extraordinarily expensive and time-intensive to perform. The most obvious example would be particle-collision experiments in super-colliders. The Large Hadron Collider cost somewhere around $9,000,000,000 to build and requires teams of people to operate. Good luck verifying the experiments it performs for yourself.

Even when you’re working on a radically smaller scale and don’t require expensive apparatus (say you want to assess the health effects of people cutting coffee out of their diet), putting together a study is enormously time-intensive. And it costs money to recruit people; you generally have to pay them for their participation, and you need someone skilled to periodically assess whatever health metrics you want to assess. Blood doesn’t draw itself and run lipid panels, after all.

OK, so amateurs don’t replicate experiments anymore. But what about other professionals?

Here we come to one of the problems introduced by “Publish Or Perish”. Academics only get status and money for achieving new results. For the most part, people don’t get grants to redo experiments that other people have already done in order to check that they get the same results. This should be a massive monkey wrench in the scientific machine, but for a long time people ignored the problem and papered over it by saying that experiments would get verified indirectly: when other people tried to build on bad results, their follow-up work would fail and expose the problem.

It turns out that doesn’t work, at least not nearly well enough.

The first field in which people got serious funding to actually try to replicate results was psychology, and it turned out that most results wouldn’t replicate. To be fair, in many cases this was because the experiment was not described well enough that one could even set up the same experiment again, though this is, to some degree, defending oneself against a charge of negligence by claiming incompetence. Of those studies which were described well enough that it was possible to try to replicate them, fewer than half replicated. They tended to fail to replicate in one of two ways:

  1. The effect didn’t happen often enough to be statistically significant
  2. The effect was statistically significant but so small as to be practically insignificant

To give a made-up example of the first, if you deprive people of coffee for a few months and one out of a few hundred sees a positive result, then it may well be that you just chanced onto someone who improved for some other reason while you were trying to study coffee. To give an example of the second, you might get a result like everyone’s systolic blood pressure going down by one tenth of a millimeter of mercury. There’s virtually no way a shift that consistent across the group happened by chance, but it’s utterly irrelevant to any reasonable goal a human being can have.
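
The second failure mode is easy to demonstrate with made-up numbers: a tiny effect measured on enough people will sail past the p < 0.05 bar. Here is a minimal sketch (my invented numbers, not from any of the studies discussed):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Pretend blood-pressure changes (mmHg) for 100,000 participants: a true average
# drop of 0.1 mmHg buried in person-to-person noise of about +/- 5 mmHg.
changes = rng.normal(loc=-0.1, scale=5.0, size=100_000)

t_stat, p_value = stats.ttest_1samp(changes, popmean=0.0)
print(f"mean change = {changes.mean():.2f} mmHg, p = {p_value:.2g}")
# The p-value comes out far below 0.05, yet a tenth of a millimeter of mercury
# is meaningless for any actual health goal: statistically significant,
# practically insignificant.
```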

Psychology does tend to be a particularly bad field when it comes to experimental design and execution, but other fields took note and wanted to make sure that they were as much better than the psychologists as they assumed.

And it turned out that many fields were not.

I find it interesting, though not very surprising, that oncology turns out to be another field in which experiments are failing to replicate. After all, in a field which isn’t completely new, it’s easier to get interesting results that don’t replicate than it is to get interesting results that do.

Awful Scientific Paper: Cognitive Bias in Forensic Pathology Decisions

I came across a rather bad paper recently titled Cognitive Bias in Forensic Pathology Decisions. It’s impressively bad in a number of ways. Here’s the abstract:

Forensic pathologists’ decisions are critical in police investigations and court proceedings as they determine whether an unnatural death of a young child was an accident or homicide. Does cognitive bias affect forensic pathologists’ decision-making? To address this question, we examined all death certificates issued during a 10-year period in the State of Nevada in the United States for children under the age of six. We also conducted an experiment with 133 forensic pathologists in which we tested whether knowledge of irrelevant non-medical information that should have no bearing on forensic pathologists’ decisions influenced their manner of death determinations. The dataset of death certificates indicated that forensic pathologists were more likely to rule “homicide” rather than “accident” for deaths of Black children relative to White children. This may arise because the base-rate expectation creates an a priori cognitive bias to rule that Black children died as a result of homicide, which then perpetuates itself. Corroborating this explanation, the experimental data with the 133 forensic pathologists exhibited biased decisions when given identical medical information but different irrelevant non-medical information about the race of the child and who was the caregiver who brought them to the hospital. These findings together demonstrate how extraneous information can result in cognitive bias in forensic pathology decision-making.

OK, let’s take a look at the actual study. First, it notes that black children’s deaths were more likely to be ruled homicides (instead of accidents) than white children’s deaths in the state of Nevada between 2009 and 2019. More precisely, of those deaths of children under 6 which were given some form of unnatural-death ruling, the deaths of black children were significantly more likely than the deaths of white children to be ruled a homicide rather than an accident.

It’s worth looking at the actual numbers, though. Of all of the deaths of children under 6 in Nevada between 2009 and 2019, 8.5% of the deaths of black children were ruled a homicide by forensic pathologists while 5.6% of the deaths of white children were ruled a homicide. That’s not a huge difference. They use some statistics to make it look much larger, of course, because they need to justify why they did an experiment on this.
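
For what it’s worth, here is how those same two percentages look under different framings. I don’t know exactly which statistic the paper leads with, so take this as an illustration of the general trick rather than a reproduction of their analysis:

```python
# The same pair of percentages, framed three different ways.
black_homicide_rate = 0.085
white_homicide_rate = 0.056

absolute_difference = black_homicide_rate - white_homicide_rate
relative_risk = black_homicide_rate / white_homicide_rate
odds_ratio = (black_homicide_rate / (1 - black_homicide_rate)) / (
    white_homicide_rate / (1 - white_homicide_rate)
)

print(f"Absolute difference: {absolute_difference * 100:.1f} percentage points")  # ~2.9
print(f"Relative risk: {relative_risk:.2f}")   # ~1.52, i.e. "52% more likely"
print(f"Odds ratio: {odds_ratio:.2f}")         # ~1.57
```

A 2.9-percentage-point gap and a “52% higher rate” are the same numbers wearing different clothes.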

In fairness to the authors, they do correctly note that these statistics don’t really mean much on their own, since black children might have been murdered more often than white children during that time period in Nevada. It doesn’t reveal cognitive biases if the pathologists were simply correct about real discrepancies.

So now we come to the experiment: They got 133 forensic pathologists to participate. They took a medical vignette about a child under six who was discovered motionless on the living room floor by their caretaker, brought to the ER, and died shortly afterwards. “Postmortem examination determined that the toddler had a skull fracture and subarachnoid hemorrhage of the brain.”

The participants were broken up into two groups, which I will call A and B. 65 people were assigned to A and 68 to B. All participants were given the same vignette, except that, to be consistent with typical medical information, the race of the child was specified. Group A’s information stated that the child was black, while group B’s information stated that the child was white. OK, so they then asked the pathologists to give a ruling on the child’s death as they normally would, right?

No. They included information about the caretaker. This is part of the experiment to determine bias, because information about the caretaker is not medically relevant.

OK, so they said that the caretaker had the same race as the child?

Heh. No. Nothing that would make sense like that.

The caretaker of the black child was described as the mother’s boyfriend, while the caretaker of the white child was the child’s grandmother. Their race was not specified, though for the caretaker of the white child it can be (somewhat) inferred from the blood relation, depending on what drop-of-blood rule one assumes the investigators are using to determine that the child is white. Someone who is 1/4 black, where the caretaker grandmother was the black grandparent, might well be identified as white, or perhaps the one-drop-of-blood rule is applied and the grandmother could be at most 1/8 black for her grandchild to qualify, to the racist experimenters, as white. Why do they leave out the race of the caretaker despite clearly wanting to draw conclusions about it? Why, indeed.

More to the point, these are not at all comparable things. It is basic human psychology that people are far less likely to murder their descendants than they are to murder people not related to them. Moreover, males are more likely to commit violent crimes than females are (with some asterisks; there is some evidence to suggest that women are possibly even more likely to hit children than men are but just get away with it more because people prefer to look away when women are violent, but in any event the general expectation is that a male is more likely to be violent than a female is). Finally, young people are significantly more likely to be violent than older people are.

In short, in the vignette given to group A, the dead child is black and the caretaker who brought them in is given 3 characteristics, each of which, on its own, makes violence more statistically likely. In group B, the dead child is white and the caretaker who brought them in is given 3 characteristics, each of which, on its own, makes violence more statistically unlikely. For Pete’s sake, culturally, we use grandmothers as the epitome of non-violence and gentleness! At this point, why didn’t they just give the caretaker of the black child multiple prior convictions for murdering children? Heck, why not have him give such medically extraneous information as repeatedly saying, “I didn’t hit him with the hammer that hard. I don’t get why he’s not moving.” I suppose that would have been too on-the-nose.

Now, given that we’re comparing a child in the care of mom’s boyfriend to a child in the care of the child’s grandmother, what do they call group A? Boyfriend Condition? Nope. Black Condition. Do they call group B Grandma Condition? Nope. White Condition.

OK, so now that we have a setup clearly designed to achieve a result, what are the results?

None of the pathologists rated the death “natural” or “suicide.” 78 of the 133 pathologists ruled the child’s death “undetermined” (38 from group A, 40 from group B). That is, 58.6% of pathologists ruled it “undetermined”. Of the minority who ruled conclusively, 23 ruled it accident and 32 ruled it homicide. (That is, 17.3% of all pathologists ruled it accident and 24.1% ruled it homicide.)

In group A, 23 pathologists ruled the case homicide, 4 ruled it accident, and 38 ruled it undetermined. In group B, 9 ruled it homicide, 19 ruled it accident, and 40 ruled it undetermined.

This is off from an exactly equal outcome by approximately 15 out of 133 pathologists. I.e., if about 7 pathologists in group A had ruled accident instead of homicide, and about 7 pathologists in group B had ruled homicide instead of accident, the results would have been equal between the two groups. As it was, the difference is big enough to reach statistical significance, which just means that chance alone would produce a gap this large less than 5% of the time. What it doesn’t do is show a pervasive trend. If roughly 11% of the participants had reversed their ruling, the experiment would have shown that the 18.6% of forensic pathologists on an email list of board-certified pathologists who responded to the study were paragons of impartiality.
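
For the curious, here is roughly how that significance check looks on the reported counts. I’m running a plain chi-square test on the full table, which is not necessarily the exact test the authors used, so treat it as a sketch:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: group A ("Black condition"), group B ("White condition")
# Columns: homicide, accident, undetermined
counts = np.array([
    [23, 4, 38],
    [9, 19, 40],
])

chi2, p_value, dof, expected = chi2_contingency(counts)
print(f"chi-square = {chi2:.1f}, p = {p_value:.4f}")
# p comes out well below 0.05: "statistically significant," even though moving
# roughly 15 of the 133 rulings would have made the two groups look identical.
```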

There’s an especially interesting aspect to the last paragraph of the conclusion:

Most important is the phenomenon identified in this study, namely demonstrating that biases by medically irrelevant contextual information do affect the conclusions reached by medical examiners. The degree and the detailed nature of these biasing effects require further research, but establishing biases in forensic pathology decision-making—the first study to do so—is not diminished by the potential limitation of not knowing which specific irrelevant information biased them (the race of the child, or/and the nature of the caretaker). Also, one must remember that the experimental study is complemented and corroborated by the data from the death certificates.

The first part is making a fair point, which is that the study does demonstrate that it is possible to bias the forensic pathologist by providing medically irrelevant information, such as a caretaker who is far more likely to have intentionally hurt the child. Why didn’t they make all of the children white and just have half of the vignettes include a caretaker with multiple previous felony convictions who, while inebriated, repeatedly states, “I only hit the little brat with a hammer four times”? If we’re only trying to see whether medically irrelevant information can bias the medical examiner, that would do it too. But what’s up with varying the race of the child?

While it’s probably just to be sensationalist because race-based results are currently hot, it may also be a tie-in to that last sentence: “Also, one must remember that the experimental study is complemented and corroborated by the data from the death certificates.” This sentence shows a massive problem with the researchers’ understanding of the nature of research. Two bad data sources which corroborate each other do not improve each other.

To show this, consider a randomly generated data source. Instead of giving a vignette, just have another set of pathologists randomly answer “A”, “B”, or “C”. Then decide that A corresponds to undetermined, B to homicide, and C to accident. There’s a good chance that people won’t pick these evenly, so you’ll get a disparity. If it happens to match the real study’s disparity, it doesn’t bolster the study to say “the results, it must be remembered, also agreed with the completely-blinded study in which pathologists picked a ruling at random, without knowing what ruling they picked”.
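
You can see the point with a short simulation: have “pathologists” pick rulings uniformly at random and the two groups will still routinely drift apart by a handful of homicide rulings, and will sometimes agree with whatever the real study found. (This is my own toy simulation, obviously not anything in the paper.)

```python
import numpy as np

rng = np.random.default_rng(42)
rulings = ["undetermined", "homicide", "accident"]

def random_study(n_group_a=65, n_group_b=68):
    """Two groups of 'pathologists' who each pick a ruling completely at random."""
    group_a = rng.choice(rulings, size=n_group_a)
    group_b = rng.choice(rulings, size=n_group_b)
    return abs((group_a == "homicide").sum() - (group_b == "homicide").sum())

gaps = [random_study() for _ in range(10_000)]
print(f"Average gap in homicide rulings between the two groups: {np.mean(gaps):.1f}")
print(f"Share of random 'studies' with a gap of 5 or more: {np.mean([g >= 5 for g in gaps]):.0%}")
# Pure noise produces disparities all the time; agreeing with noise proves nothing.
```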

Meaningless data does not acquire meaning by being combined with other meaningless data.

The conclusion of the study is, curiously, entirely reasonable. It basically amounts to the observation that if you want a medical examiner making a ruling based strictly on the medical evidence, you should hide all other evidence but the medical evidence from them. This, as the British like to say, no fool ever doubted. If you want someone to make a decision based only on some information, it is a wise course of action to present them only that information. Giving them information that you don’t want them to use is merely asking for trouble. It doesn’t require a badly designed and interpreted study to make this point.